Log-transformed RNA-Seq data and linear regression
3
0
9.1 years ago
Na Sed ▴ 310

I would like to investigate the relationships among genes in RNA-Seq data, i.e. build gene networks.

Can I use a linear regression model after log-transforming the data, i.e. log(read_count + 1)?

Actually, I have done it and the results appear meaningful, but I have doubts about the procedure.

I would appreciate any thoughts.

Linear-Regression RNA-Seq R Log-transform • 7.0k views
2
9.1 years ago
Steven Lakin ★ 1.8k

Linear regression models how well a set of points approximates a line. You can certainly use it on anything; your interpretation of what the trend means is the more meaningful part. Just remember that if you find a linear relationship in log-transformed data, it is actually an exponential one on the original scale. Also understand the strengths and weaknesses of the model you're using (look at your residuals, understand how to interpret goodness of fit, etc.).

For example:

Incremental doses of drug Y vs. log2(fold change) of transcript X

R^2 = 0.99

This might suggest that incremental doses of drug Y correlate with a base-2 exponential response (y = 2^x) in gene expression, not a linear one.
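To make the back-transformation concrete, here is a minimal Python sketch. The doses, slope, intercept, and noise level are all invented for illustration and are not from any real experiment. It fits a straight line to log2(fold change) against dose, checks the residuals and R^2 on the log scale, and then reads the slope as a multiplicative effect on the original scale.

```python
import numpy as np

# Hypothetical data: doses and fold changes are simulated, for illustration only.
rng = np.random.default_rng(0)
dose = np.arange(0.0, 8.0)                 # drug Y dose, arbitrary units
true_slope, true_intercept = 0.5, 0.2      # assumed values on the log2 scale
log2_fc = true_intercept + true_slope * dose + rng.normal(0.0, 0.05, dose.size)

# Ordinary least squares on the log2 scale (polyfit returns slope first).
slope, intercept = np.polyfit(dose, log2_fc, deg=1)

# Residuals and R^2 are computed on the log2 scale, where the model is fitted.
fitted = intercept + slope * dose
residuals = log2_fc - fitted
r_squared = 1.0 - np.sum(residuals**2) / np.sum((log2_fc - log2_fc.mean())**2)

# On the original scale the same fit is exponential:
# fold_change = 2**intercept * (2**slope)**dose
fold_per_unit_dose = 2.0 ** slope

print(f"slope={slope:.3f}, intercept={intercept:.3f}, R^2={r_squared:.3f}")
print(f"each unit of dose multiplies fold change by about {fold_per_unit_dose:.2f}")
```

The point is that a slope of b on the log2 scale means each unit of dose multiplies the fold change by 2^b, rather than adding a fixed amount to it.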

2

Agreed, but note that when you talk about exponential gene expression, it's really an exponential read-count relationship. We don't know the real relationship between read count and expression. Some people think log(reads) is already proportional to expression.

0

Yes, I think so as well (that log(reads) is proportional to expression). This is because the original RNA molecules are amplified exponentially (PCR) before they are measured by microarray, RNA-seq, TaqMan, or any other assay that measures PCR-amplified RNA.

0

I think the log-transformed count does maintain the original proportion, no? However, for the purpose of the linear model, would log-transformed counts cause non-linearity?

0

Would log-transformed values violate the linearity assumption required for linear regression? Are the independent and dependent variables still linearly correlated after transformation?

2
9.1 years ago
Irsan ★ 7.8k

Use the voom transformation on RNA-seq count data to use them in linear models (as in limma). The post "How To Transform From Rna-Seq Deseq To Limma Voom() And Makecontrasts" explains how.
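For a rough idea of the scale voom works on, here is a minimal Python sketch of the log2 counts-per-million step only, using the formula log2((count + 0.5) / (library size + 1) * 10^6). This is not voom itself: it leaves out the mean-variance precision weights that voom estimates and passes on to limma, so for real analyses use limma's `voom()` in R. The function name `log2_cpm` and the small count matrix are hypothetical.

```python
import numpy as np

def log2_cpm(counts, prior_count=0.5):
    """Log2 counts per million for a genes x samples count matrix.

    Sketch of the log-CPM step only; no precision weights are computed.
    """
    counts = np.asarray(counts, dtype=float)
    lib_size = counts.sum(axis=0)  # total reads per sample (per column)
    return np.log2((counts + prior_count) / (lib_size + 1.0) * 1e6)

# Hypothetical 3-gene x 2-sample count matrix, for illustration only.
counts = np.array([[10, 20],
                   [0, 5],
                   [990, 1975]])
logcpm = log2_cpm(counts)
print(logcpm)
```

The offsets keep zero counts finite after the log, and dividing by library size puts samples sequenced to different depths on a comparable scale.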

1
9.1 years ago
Zhilong Jia ★ 2.2k

The log is just a kind of preprocessing. If your results are biologically sensible, I believe it's reasonable. Maybe you can check limma for RNA-seq as well.