Removing variation identified by principal components in RNA-seq
3.6 years ago
sysboolean ▴ 90

Hi all,

I am a beginner with RNA-seq analysis and am using DESeq2. The experimental design is: cells from 4 subjects were cultured and then treated with a small molecule. I wish to test for differential expression (DE) between the control and treated conditions, with the subjects serving as replicates. The design formula I am using is ~ subject + treatment.

When I plot PC1 vs PC2, the samples separate by subject along PC1. A similar trend is seen up to PC3. However, when I plot PC1 vs PC4, I can see that PC4 separates the samples by treatment.

How do I regress out PC1 (or the subject effect) from the data so that I can get DE for treatment? Also, PC4, which separates the samples by treatment, explains only 7% of the variation in the data. Is there a metric for how much I can trust the results from this analysis? Thank you, and I apologize if this question has been asked before. I don't know what terms I should be searching for.

[image: subjectinf]

edit: Thanks genomax for pointing out image upload.

RNA-Seq DESeq2 • 1.7k views
3.6 years ago

When using DESeq2, you don't regress out effects like that. You include subject as a term in your design formula (as you already do with ~ subject + treatment), and the software fits its model taking subject into account.
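
For illustration, a minimal sketch of that setup in DESeq2. The count matrix here is simulated with `makeExampleDESeqDataSet()` and the sample sheet (`coldata`, with 4 subjects and one control plus one treated sample each) is a hypothetical stand-in for the real data, so the numbers it produces are meaningless; only the calls matter.

```r
library(DESeq2)

set.seed(1)

# Hypothetical sample sheet: 4 subjects, each with a control and a treated sample
coldata <- data.frame(
  subject   = factor(rep(paste0("S", 1:4), each = 2)),
  treatment = factor(rep(c("control", "treated"), times = 4),
                     levels = c("control", "treated")),
  row.names = paste0("sample", 1:8)
)

# Simulated counts (genes x samples) standing in for the real count matrix
counts <- counts(makeExampleDESeqDataSet(n = 1000, m = 8))
colnames(counts) <- rownames(coldata)

# Subject enters the model as a blocking term; treatment is the effect of interest
dds <- DESeqDataSetFromMatrix(countData = counts,
                              colData   = coldata,
                              design    = ~ subject + treatment)
dds <- DESeq(dds)

# Treated vs control, adjusted for subject-to-subject differences
res <- results(dds, contrast = c("treatment", "treated", "control"))
summary(res)
```

Because subject is in the design, the treatment log2 fold changes are estimated within subjects, which is why nothing has to be regressed out beforehand.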

You can use limma's removeBatchEffect to generate 'corrected' expression values (for example from variance-stabilized data), which you can use for visualization, but you don't use these as input to DESeq2.
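
A sketch of that visualization-only correction, assuming variance-stabilized values as the input to `removeBatchEffect()`. The data are again simulated with `makeExampleDESeqDataSet()`, and the `subject` and `treatment` columns are hypothetical stand-ins for the real sample sheet.

```r
library(DESeq2)
library(limma)

set.seed(1)

# Simulated dataset standing in for the real one (8 samples)
dds <- makeExampleDESeqDataSet(n = 1000, m = 8)
dds$subject   <- factor(rep(paste0("S", 1:4), each = 2))
dds$treatment <- factor(rep(c("control", "treated"), times = 4),
                        levels = c("control", "treated"))
design(dds) <- ~ subject + treatment

# Variance-stabilized values; vst() is the faster wrapper for large datasets
vsd <- varianceStabilizingTransformation(dds, blind = FALSE)
mat <- assay(vsd)

# Remove the subject effect while protecting the treatment effect
keep <- model.matrix(~ treatment, data = as.data.frame(colData(vsd)))
mat_adj <- removeBatchEffect(mat, batch = vsd$subject, design = keep)

# Plot PCA on the adjusted values; these are for plots only, not for DESeq()
assay(vsd) <- mat_adj
plotPCA(vsd, intgroup = "treatment")
```

The testing step should still run on the raw counts with subject in the design; the adjusted matrix only makes the treatment signal easier to see in plots.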


Thanks for the reply. So the differential expression result from DESeq2 is already corrected for the subject effect? How should I reconcile the DESeq2 results with the PCA output?


There's nothing to reconcile. Your samples are rather different, your treatment doesn't affect a whole lot of genes, but it affects some. DESeq will find them.
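
A toy base-R simulation can illustrate the point. All numbers here are made up (500 genes, large subject effects, a small treatment effect on 25 genes), and a per-gene `lm()` on log-scale values is used as a simple stand-in for the DESeq2 model, not as a replacement for it.

```r
set.seed(1)

n_genes   <- 500
subject   <- factor(rep(paste0("S", 1:4), each = 2))
treatment <- factor(rep(c("control", "treated"), times = 4),
                    levels = c("control", "treated"))

# Large subject-specific effects on every gene, small treatment effect on 25 genes
subj_eff <- matrix(rnorm(n_genes * 4, sd = 2), nrow = n_genes)
trt_eff  <- c(rep(1, 25), rep(0, n_genes - 25))
noise    <- matrix(rnorm(n_genes * 8, sd = 0.2), nrow = n_genes)
expr <- subj_eff[, as.integer(subject)] +
  outer(trt_eff, as.integer(treatment) - 1) + noise

# PCA on samples: the leading components are dominated by subject
pca <- prcomp(t(expr))
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
round(var_explained[1:4], 3)

# Per-gene test of treatment with subject in the model
pvals <- apply(expr, 1, function(y) {
  fit <- lm(y ~ subject + treatment)
  summary(fit)$coefficients["treatmenttreated", "Pr(>|t|)"]
})
median(pvals[1:25])        # genes with a simulated treatment effect
median(pvals[26:n_genes])  # genes without one
```

PCA summarizes variance across all genes, so a treatment that shifts a small fraction of them contributes little to the top components, while a per-gene model that accounts for subject can still pick those genes out.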


Oh, that makes sense. I checked the results (padj < 0.05) and I have ~1100 genes that are significantly DE, but the log2FC only ranges from -1.8 to +2.0. Thanks for helping me out!
