RNA-seq differential expression masked by pathway dysregulation
14 months ago
Gama313 ▴ 120

I am working on a 20-sample dataset and need to identify DE genes. Ten samples were collected at collection center 1 (CC1) and ten at CC2. Within each center, 5 samples are condition1 and 5 are condition2. Unfortunately, when I tested DE for CC1 vs CC2 only, several survival pathways were upregulated in the CC2 samples. This is because CC2 sent us its samples later than CC1 did (in terms of days). When I tested condition1 vs condition2 (using CC as a covariate), I observed no DE genes (FDR 0.05). From the PCA I see that the differences between CCs are far stronger than those between condition1 and condition2. To my knowledge, both ComBat and removeBatchEffect (limma) destroy biological variability, so I am not confident about using them. My question is: what should I do in this setup?

normalization Rnaseq batch • 933 views
14 months ago
ATpoint 81k

Tools like DESeq2, edgeR and limma recommend in their manuals including covariates such as batch (here, the center) in the model. Try that and see how it goes. If you want more feedback, please show code, data and plots.
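To illustrate what including center in the model buys you, here is a minimal sketch in plain NumPy. It is not DESeq2, edgeR or limma, just ordinary least squares on one simulated gene. The layout (2 centers x 2 conditions x 5 samples) follows the question; the effect sizes, noise level and the helper `fit` are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Layout from the question: 2 centers x 2 conditions x 5 samples each.
center = np.repeat([0, 1], 10)                # 0 = CC1, 1 = CC2
condition = np.tile(np.repeat([0, 1], 5), 2)  # 0 = condition1, 1 = condition2

# Simulated log-expression of one gene (assumed values, not real data):
# a large center shift and a small condition shift.
center_effect, condition_effect = 3.0, 0.5
y = (8.0 + center_effect * center + condition_effect * condition
     + rng.normal(0, 0.2, 20))

def fit(design, y):
    """Ordinary least squares: coefficients and their standard errors."""
    beta, _, _, _ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    dof = len(y) - design.shape[1]
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(design.T @ design)
    return beta, np.sqrt(np.diag(cov))

intercept = np.ones(20)
naive = np.column_stack([intercept, condition])             # ~ condition
adjusted = np.column_stack([intercept, center, condition])  # ~ center + condition

beta_naive, se_naive = fit(naive, y)
beta_adj, se_adj = fit(adjusted, y)

print("condition only:     estimate %.2f, SE %.2f" % (beta_naive[1], se_naive[1]))
print("center + condition: estimate %.2f, SE %.2f" % (beta_adj[2], se_adj[2]))
```

Because this layout is balanced, the condition estimate is the same in both fits. What changes is the standard error: once center absorbs its share of the variance, the residual noise against which the condition effect is tested becomes much smaller.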


Thanks for the answer. However, I am not sure this can be considered a batch effect, since CC2 samples have a true upregulation of specific pathways that could (in principle) buffer variation when condition1 vs condition2 is tested.


If this is the case, then you cannot correct for anything, as this effect is nested within center.
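A quick way to see the nesting problem is to check the rank of the design matrix. The sketch below is only an illustration with NumPy: the `late_shipping` covariate and the helper `is_estimable` are hypothetical names, and the assumption is that every CC2 sample arrived late and every CC1 sample on time, as described in the question.

```python
import numpy as np

# Layout from the question: 2 centers x 2 conditions x 5 samples each.
center = np.repeat([0, 1], 10)                # 0 = CC1, 1 = CC2
condition = np.tile(np.repeat([0, 1], 5), 2)  # 0 = condition1, 1 = condition2

# Hypothetical covariate: all CC2 samples shipped late, all CC1 on time,
# so it is an exact copy of the center column.
late_shipping = center.copy()

intercept = np.ones(20)
crossed = np.column_stack([intercept, center, condition])
confounded = np.column_stack([intercept, center, late_shipping, condition])

def is_estimable(design):
    """True when the design has full column rank, i.e. every coefficient
    can be estimated separately."""
    return np.linalg.matrix_rank(design) == design.shape[1]

print(is_estimable(crossed))     # True: condition is crossed with center
print(is_estimable(confounded))  # False: shipping delay cannot be split from center
```

So the condition effect stays estimable with center in the model, but no model can tell "center" apart from "shipping delay", because the two columns carry the same information.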


If you really do have a reason to suspect that the biology of samples from CC1 is different from the biology of samples from CC2, and that CC1 is the correct biology and CC2 the wrong biology, then there may be an argument for discarding the samples from CC2 and just conducting the analysis on the samples from CC1.


That's exactly what I've done. However, the dispersion is really high (primary samples), and the total number of samples seems too low to obtain results with good confidence.
