I have computed pairwise correlations between every pair of genes in my RNA-seq data.
I have done this with non-log-transformed normalised counts generated with
estimateSizeFactors in DESeq2, as well as with VST counts and log2-transformed counts.
All pairwise correlations make sense, and I am inclined to simply use the non-log-transformed normalised counts.
Is there a specific reason why using non-log-transformed normalised counts from
estimateSizeFactors would be detrimental compared to using log2- or VST-transformed counts? The only thing I can think of is that if a gene is extremely highly expressed in one sample but much more lowly expressed in the others, that one sample will shift the mean and inflate the variance, and this effect would be less pronounced after variance stabilisation.
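
To make that concern concrete, here is a minimal Python sketch using made-up values (not my actual data, and not DESeq2 output). The names gene_a, gene_b and pearson are hypothetical, and log2(x + 1) is only a stand-in for a proper VST. It compares the Pearson correlation of two toy genes on the raw and log2 scales, with and without one extreme sample.

```python
import numpy as np

# Toy normalised counts for two genes across six samples (made-up values).
# The last sample is an extreme high-expression outlier for both genes.
gene_a = np.array([100.0, 120.0, 90.0, 110.0, 100.0, 2000.0])
gene_b = np.array([120.0, 90.0, 110.0, 100.0, 130.0, 400.0])


def pearson(x, y):
    """Pearson correlation coefficient between two 1-D arrays."""
    return float(np.corrcoef(x, y)[0, 1])


# Correlation on the raw (non-log) scale, with and without the extreme sample.
r_raw = pearson(gene_a, gene_b)
r_raw_no_outlier = pearson(gene_a[:-1], gene_b[:-1])

# Correlation after log2(x + 1), used here as a simple stand-in for a VST.
r_log = pearson(np.log2(gene_a + 1), np.log2(gene_b + 1))
r_log_no_outlier = pearson(np.log2(gene_a[:-1] + 1), np.log2(gene_b[:-1] + 1))

print(f"raw, all samples:        {r_raw:.3f}")
print(f"raw, outlier removed:    {r_raw_no_outlier:.3f}")
print(f"log2, all samples:       {r_log:.3f}")
print(f"log2, outlier removed:   {r_log_no_outlier:.3f}")
```

In this toy case the single extreme sample turns a negative correlation into a strongly positive one on the raw scale. The log transform reduces that sample's leverage somewhat but does not remove it.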
Any thoughts are much appreciated.