Question: Log-transformed RNA-Seq data and linear regression
0
Na Sed280 wrote:

I would like to investigate the relationships between genes in RNA-Seq data, i.e. build gene networks.

Can I use a linear regression model after log-transforming the data, i.e. log(read_count + 1)?

Actually, I have done it and the results appear meaningful, but I have doubts about the process.

I would appreciate any thoughts.

modified 4.7 years ago by Irsan7.1k • written 4.7 years ago by Na Sed280
2
Steven Lakin1.4k wrote:

Linear regression models how well a set of points approximates a line. You can certainly use it on anything; your interpretation of what the trend means is the more meaningful part. Just remember that if you find a linear correlation in your log-transformed data, it actually corresponds to an exponential one on the original scale. Also understand the strengths and weaknesses of the model you're using (look at your residuals, understand how you can interpret goodness of fit, etc.).

For example:

Incremental doses of drug Y vs. log2(fold change) of transcript X

R^2 = 0.99

Might suggest that incremental doses of drug Y correlate with a base 2 exponential response (y = 2^x) in gene expression, not a linear one.
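To make the example above concrete, here is a minimal sketch in plain Python. The doses and fold changes are made up for illustration (they are not data from this thread), and `fit_line` is a hypothetical helper that does ordinary least squares by hand. It fits a line to log2(fold change) and then back-transforms the fit to show that it describes an exponential curve on the original scale.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1 - ss_res / ss_tot

# Made-up data: the fold change of transcript X doubles with each dose step.
doses = [0, 1, 2, 3, 4, 5]
fold_change = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
log2_fc = [math.log2(fc) for fc in fold_change]

# A straight line fits the log2 values...
slope, intercept, r2 = fit_line(doses, log2_fc)

# ...which, back-transformed, is the exponential curve 2^(intercept + slope * dose).
predicted_fc = [2 ** (intercept + slope * d) for d in doses]

# Fitting a straight line to the untransformed fold changes does worse.
raw_slope, raw_intercept, raw_r2 = fit_line(doses, fold_change)

print(f"log2 scale: slope={slope:.2f}, R^2={r2:.3f}")
print(f"raw scale:  slope={raw_slope:.2f}, R^2={raw_r2:.3f}")
```

A slope of 1 on the log2 scale reads as "each extra dose step doubles the fold change", which is the interpretation to carry back to the biology.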

2

Agreed, but note that when you talk about exponential gene expression, it's really an exponential read-count relationship. We don't know the real relationship between read count and expression. Some people think log(reads) is already proportional to expression.

Yes, I think so as well (that log(reads) is proportional to expression). This is because the original RNA molecules are amplified in an exponential fashion (PCR) before they are measured by microarray/RNAseq/Taqman/any other assay that measures PCR-amplified RNA.

I think the log-transformed count indeed maintains the original proportion, no? However, for the purpose of the linear model, would the log-transformed count cause non-linearity?

Would log-transformed values violate the linearity assumption required for linear regression? Are the independent and dependent variables still linearly correlated after transformation?
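One way to approach this question is to check the residuals on both scales rather than assume either answer. The sketch below is only an illustration: the counts for the two genes are made up, with multiplicative (fold-change-like) noise built in, and `fit_residuals` is a hypothetical helper. It fits a line to the raw counts and to log(read_count + 1), then compares the residual size in the low-count half and the high-count half of the data.

```python
import math

def fit_residuals(xs, ys):
    """Fit y = intercept + slope * x by least squares; return the residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]

def spread_ratio(residuals):
    """Mean absolute residual in the upper half divided by the lower half.

    Assumes the samples are ordered by increasing x. A ratio near 1 suggests
    constant residual spread; a large ratio suggests spread grows with x.
    """
    half = len(residuals) // 2
    low = sum(abs(r) for r in residuals[:half]) / half
    high = sum(abs(r) for r in residuals[half:]) / (len(residuals) - half)
    return high / low

# Made-up read counts for two genes across 8 samples, sorted by gene_a.
# gene_b is roughly 2 * gene_a, off by a factor of 1.25 or 0.8 in each sample.
gene_a = [10, 20, 40, 80, 160, 320, 640, 1280]
gene_b = [25, 32, 100, 128, 400, 512, 1600, 2048]

raw_ratio = spread_ratio(fit_residuals(gene_a, gene_b))

log_a = [math.log(a + 1) for a in gene_a]
log_b = [math.log(b + 1) for b in gene_b]
log_ratio = spread_ratio(fit_residuals(log_a, log_b))

print(f"raw counts:   residual spread ratio = {raw_ratio:.2f}")
print(f"log(count+1): residual spread ratio = {log_ratio:.2f}")
```

For data like this, where the noise is multiplicative, the residuals fan out on the raw scale and are roughly even on the log scale. Real data may behave differently, especially at low counts, so the same check is worth running on your own genes.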

2
Irsan7.1k wrote:

Use the voom transformation on RNAseq count data so that you can use them in linear models (as in limma). The thread "How To Transform From Rna-Seq Deseq To Limma Voom() And Makecontrasts" explains how.
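For intuition only, here is a rough Python sketch of the log-CPM step that voom applies to the counts: log2((count + 0.5) / (library size + 1) * 10^6). It assumes the library size is simply the column sum. It does not reproduce the mean-variance trend or the precision weights, which are the main point of voom, so for real analyses use limma in R. The function name `log_cpm` and the toy count matrix are made up for illustration.

```python
import math

def log_cpm(counts):
    """Log2 counts-per-million with voom-style offsets.

    counts: list of rows (genes), each a list of read counts per sample.
    Library size is taken as the column sum (an assumption of this sketch).
    """
    n_samples = len(counts[0])
    lib_sizes = [sum(row[j] for row in counts) for j in range(n_samples)]
    return [
        [math.log2((row[j] + 0.5) / (lib_sizes[j] + 1) * 1e6)
         for j in range(n_samples)]
        for row in counts
    ]

# Toy matrix: 3 genes (rows) x 2 samples (columns), made-up counts.
counts = [
    [0, 10],
    [50, 100],
    [949, 1889],
]

transformed = log_cpm(counts)
for gene_values in transformed:
    print([round(v, 2) for v in gene_values])
```

The 0.5 offset keeps zero counts finite after the log, and dividing by library size puts samples with different sequencing depths on a comparable scale before any linear model is fitted.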

1
Zhilong Jia1.5k wrote:

The log transform is just a kind of preprocessing. If your results make sense biologically, I believe it's reasonable. You might also want to check limma for RNA-seq.