Question

RNA-seq normalization and gene-gene correlation

0


17 days ago

Fish • 0

Hello, I'm new to RNA-seq data and would like your help with the following.

My goal is to analyze gene-gene co-expression using Pearson correlation between two sample groups (normal vs disease). For example, I want to see if there is a higher correlation between Gene A and Gene B in normal vs disease.

Now my question is:

Which normalization method should I use to reach this goal? I see some forums mentioned VST but some also discourage this. What is your take, and why?
If VST works, should I remove genes with low counts before or after VST normalization?

Thank you for your answers in advance!

rna-seq pearson-correlation vst • 251 views

ADD COMMENT • link updated 17 days ago by Ram 43k • written 17 days ago by Fish • 0

1


I see some forums mentioned VST but some also discourage this.

Please provide links to these discussions. I bet they concern using VST for differential testing, where one should give DESeq2 or edgeR the raw counts instead.

The idea behind VST is to stabilize the variance across the range of mean expression, which makes genes with different levels of variance comparable (see the quote below from the DESeq2 tutorial). I would use VST for your problem. Having said that, why reinvent the wheel rather than use one of the existing methods/packages?

The point of these two transformations, the VST and the rlog, is to remove the dependence of the variance on the mean, particularly the high variance of the logarithm of count data when the mean is low.
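
To make the comparison from the question concrete, here is a minimal sketch in plain Python of one common way to test whether the Gene A / Gene B correlation differs between two independent groups: compute Pearson's r per group on already transformed values (for example VST output), then compare the two with Fisher's z-transformation. This test is not something proposed in the thread, the helper names (pearson, compare_correlations) are hypothetical, and the expression values are made up for illustration; real data would need far more samples per group.

```python
import math
from statistics import NormalDist, mean


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def compare_correlations(r1, n1, r2, n2):
    """Two-sided Fisher z-test for two independent Pearson correlations.

    Assumes |r| < 1 and at least 4 samples per group.
    """
    z1, z2 = math.atanh(r1), math.atanh(r2)          # Fisher z-transform
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))  # standard error of z1 - z2
    z = (z1 - z2) / se
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p


# Made-up transformed expression values, one entry per sample (assumption).
normal_a = [5.1, 6.0, 6.8, 7.9, 9.2, 10.1]
normal_b = [4.9, 6.2, 7.1, 8.0, 8.8, 10.4]
disease_a = [5.0, 6.1, 7.0, 8.1, 9.0, 10.0]
disease_b = [7.2, 5.1, 9.0, 6.3, 8.1, 7.0]

r_normal = pearson(normal_a, normal_b)
r_disease = pearson(disease_a, disease_b)
z_stat, p_value = compare_correlations(
    r_normal, len(normal_a), r_disease, len(disease_a)
)
print(f"r_normal={r_normal:.3f} r_disease={r_disease:.3f} "
      f"z={z_stat:.2f} p={p_value:.3f}")
```

With thousands of gene pairs, the resulting p-values would also need multiple-testing correction, which is one reason to consider an existing differential co-expression package rather than a hand-rolled loop.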

ADD REPLY • link 17 days ago by Haci ▴ 680