RNA-seq replicates pooled before sequencing. How to proceed with DGE analysis?
8.3 years ago
Sentinel156 ▴ 190

I'm helping a colleague analyse RNA-seq data to find differentially expressed genes. There are 4 conditions with 3 biological replicates each, and we are interested in all possible pairwise comparisons. Unfortunately, they made a big mistake: within each condition they pooled the RNA from the three biological replicates and sequenced the pool as a single sample (i.e. they did not individually barcode each replicate). My colleague was using DESeq1 and was able to generate a list of differentially expressed genes by analysing the data assuming no biological replicates. I encouraged using DESeq2; however, this results in no differentially expressed genes being identified. I explained that without an estimate of within-condition variation in gene expression, it's unlikely that anything can be done statistically to improve these results.

My question is, is there any technique/alternate method of analysis that the community could suggest? Or is their experiment essentially ruined?

Thanks

RNA-Seq • 4.3k views

I would immediately reject it, if I were reviewing it for a journal. There is no way to estimate the sample-to-sample variance within each condition and the false positive rate is likely to be high.


While this criticism is valid and the OP seems to be aware of it, I think it sounds a bit too harsh. That data could still be used to generate hypotheses, genes with large fold change could be validated by qPCR in multiple samples, a pathway analysis could still reveal something and suggest further experiments. If the replicates happen to be very consistent then this dataset, while not conclusive, might be valuable. Sometimes the replication is done at the level of cell culture where replicates are so similar that doing no replicates or many is not that different, unless you are after tiny changes. (Just to be clear, I'm not advocating to avoid replication, just that once the damage is done...)
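To make the "hypothesis generation" idea concrete, here is a minimal sketch of ranking genes by fold change between two pooled libraries so the largest changes can be picked for qPCR follow-up. The function names and the toy counts are made up for illustration, and the prior count of 1 is an arbitrary choice to keep low-count genes from dominating the ranking. It yields a ranking only, with no p-values.

```python
import math

def log2_fold_changes(counts_a, counts_b, prior=1.0):
    """Log2 fold change (b over a) per gene from raw counts of two libraries.

    counts_a, counts_b: dict mapping gene name -> raw read count.
    prior: pseudocount, scaled by library size so that a gene with zero
    reads in both libraries gets a fold change of exactly 1.
    """
    lib_a = sum(counts_a.values())
    lib_b = sum(counts_b.values())
    mean_lib = (lib_a + lib_b) / 2
    lfc = {}
    for gene in counts_a:
        # Counts per million, with a library-size-scaled pseudocount.
        cpm_a = (counts_a[gene] + prior * lib_a / mean_lib) / lib_a * 1e6
        cpm_b = (counts_b[gene] + prior * lib_b / mean_lib) / lib_b * 1e6
        lfc[gene] = math.log2(cpm_b / cpm_a)
    return lfc

def top_candidates(lfc, n=10):
    """Gene names with the largest absolute log2 fold change."""
    return sorted(lfc, key=lambda g: abs(lfc[g]), reverse=True)[:n]

# Toy data (hypothetical counts, not from the experiment discussed here).
pool_a = {"g1": 100, "g2": 100, "g3": 0}
pool_b = {"g1": 100, "g2": 400, "g3": 0}
lfc = log2_fold_changes(pool_a, pool_b)
print(top_candidates(lfc, n=2))
```

Anything that comes out of a list like this would still need validation in individual, unpooled samples.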


You could try GFOLD.

8.3 years ago
Irsan ★ 7.8k

Read section 2.1.1 of the edgeR manual: "What to do if you have no replicates". You can compile a list of 250 housekeeping genes based on this article and use them to estimate the "random/background" variation. And afterwards, tell your colleagues to consult an expert before starting experiments.
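A rough sketch of the housekeeping-gene idea, in case it helps: treat all pooled samples as if they were replicates of one group, restrict to genes assumed not to change between conditions, and estimate a single common dispersion from them. The code below is an assumption-laden stand-in, not edgeR's estimator: it uses a simple method-of-moments estimate from the negative binomial relation var = mu + phi * mu^2 and takes the median across genes. The function name and the toy counts are hypothetical. In practice you would do this with edgeR's own functions in R and then carry the resulting dispersion into the test on the full dataset.

```python
import math
import statistics

def common_dispersion(hk_counts, lib_sizes):
    """Method-of-moments common dispersion from housekeeping genes.

    hk_counts: list of per-gene lists, one raw count per sample
               (housekeeping genes only, assumed not differentially expressed).
    lib_sizes: total read count of each sample, same order as the counts.
    """
    mean_lib = sum(lib_sizes) / len(lib_sizes)
    estimates = []
    for gene_counts in hk_counts:
        # Rescale every sample to a common library size.
        scaled = [c * mean_lib / n for c, n in zip(gene_counts, lib_sizes)]
        mu = statistics.mean(scaled)
        if mu <= 0:
            continue  # no information in an all-zero gene
        var = statistics.variance(scaled)  # sample variance (n - 1 denominator)
        # NB variance: var = mu + phi * mu^2; solve for phi, floor at zero.
        estimates.append(max((var - mu) / mu ** 2, 0.0))
    return statistics.median(estimates)

# Toy example: one housekeeping gene across four pooled samples
# (hypothetical numbers, equal library sizes).
phi = common_dispersion([[80, 100, 120, 100]], [1_000_000] * 4)
bcv = math.sqrt(phi)  # biological coefficient of variation
print(phi, bcv)
```

The estimate is only as good as the assumption that the chosen genes really are stable across the four conditions, and pooling itself averages out part of the biological variation, so the value is likely to understate the variability between individual samples.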


I'm late to this thread. I read the edgeR manual section on this kind of analysis, but I need more explanation. Could you please tell me how to estimate the "random/background" variation based on housekeeping genes?
