Question: RNA-seq replicates pooled before sequencing. How to proceed with DGE analysis?
4.5 years ago by
Melbourne, Australia
Sentinel156130 wrote:

I'm helping a colleague analyse RNA-seq data to find differentially expressed genes. There are 4 conditions with 3 biological replicates each, and we are interested in all possible pairwise comparisons. Unfortunately, they made a big mistake by pooling the RNA from the three biological replicates of each condition and sequencing each pool as a single sample (i.e. they did not individually barcode each replicate). My colleague was using DESeq1 and was able to generate a list of differentially expressed genes by analysing the data assuming no biological replicates. I encouraged using DESeq2; however, this results in no differentially expressed genes being identified. I explained that without knowledge of the variation in gene expression between replicates, it's unlikely that anything can be done to statistically improve these results.

My question is, is there any technique/alternate method of analysis that the community could suggest? Or is their experiment essentially ruined?


rna-seq • 2.8k views
modified 4.5 years ago by Irsan7.2k • written 4.5 years ago by Sentinel156130

I would immediately reject it if I were reviewing it for a journal. There is no way to estimate the sample-to-sample variance within each condition, and the false positive rate is likely to be high.

written 4.5 years ago by dario.garvan470

While this criticism is valid and the OP seems to be aware of it, I think it sounds a bit too harsh. That data could still be used to generate hypotheses, genes with large fold change could be validated by qPCR in multiple samples, a pathway analysis could still reveal something and suggest further experiments. If the replicates happen to be very consistent then this dataset, while not conclusive, might be valuable. Sometimes the replication is done at the level of cell culture where replicates are so similar that doing no replicates or many is not that different, unless you are after tiny changes. (Just to be clear, I'm not advocating to avoid replication, just that once the damage is done...)
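As a rough illustration of the hypothesis-generating use described above, the following sketch ranks genes by absolute log2 fold change between two pooled libraries. The function name, the pseudo-count, and the minimum-count filter are all assumptions made for the example and do not come from any package mentioned in this thread. The ranking carries no statistical significance; it only suggests candidates for qPCR validation.

```python
import numpy as np

def rank_by_log_fold_change(counts_a, counts_b, gene_ids,
                            pseudo_count=1.0, min_total=20):
    """Rank genes by |log2 fold change| between two unreplicated libraries.

    counts_a, counts_b: raw counts per gene for the two pooled samples.
    pseudo_count: added on the CPM scale to avoid division by zero (assumed value).
    min_total: genes with fewer raw counts in total are dropped (assumed value).
    """
    a = np.asarray(counts_a, dtype=float)
    b = np.asarray(counts_b, dtype=float)

    # Normalise each library to counts per million.
    cpm_a = a / a.sum() * 1e6
    cpm_b = b / b.sum() * 1e6

    # log2 fold change of b relative to a.
    lfc = np.log2((cpm_b + pseudo_count) / (cpm_a + pseudo_count))

    # Drop very low-count genes, whose fold changes are mostly noise.
    keep = (a + b) >= min_total
    ranked = [(gene_ids[i], float(lfc[i])) for i in np.flatnonzero(keep)]

    # Largest absolute change first.
    ranked.sort(key=lambda pair: abs(pair[1]), reverse=True)
    return ranked
```

Genes near the top of such a list are candidates only; with pooled samples, a large fold change cannot be separated from biological variability without follow-up in individual replicates.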

written 4.5 years ago by dariober11k

You could try GFOLD.

written 4.5 years ago by geek_y11k
4.5 years ago by
Irsan7.2k wrote:

Read section 2.1.1 of the edgeR manual: "What to do if you have no replicates". You can compile a list of 250 housekeeping genes based on this article and estimate the "random/background" variation. And afterwards tell your colleagues to first consult an expert before starting experiments.
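To make the housekeeping-gene idea concrete, here is a minimal Python sketch. It treats all pooled libraries as if they were replicates of one group, restricts to genes assumed not to change between conditions, and derives a single negative binomial dispersion from them. It uses a simple method-of-moments estimate, not the likelihood-based estimator that edgeR itself uses, and the function name and inputs are hypothetical.

```python
import numpy as np

def housekeeping_dispersion(counts, housekeeping_idx):
    """Rough common NB dispersion estimated from housekeeping genes.

    counts: genes x samples array of raw counts, one pooled library per condition.
    housekeeping_idx: row indices of genes ASSUMED not to be differentially
        expressed between any of the conditions.
    Returns the median method-of-moments dispersion (phi); sqrt(phi) is the BCV.
    """
    counts = np.asarray(counts, dtype=float)

    # Scale every library to the mean library size.
    lib_sizes = counts.sum(axis=0)
    scaled = counts / lib_sizes * lib_sizes.mean()

    # Treat all samples as one group, using housekeeping genes only.
    hk = scaled[housekeeping_idx]
    mean = hk.mean(axis=1)
    var = hk.var(axis=1, ddof=1)

    # Negative binomial: var = mean + phi * mean^2, so phi = (var - mean) / mean^2.
    keep = mean > 0
    phi = (var[keep] - mean[keep]) / mean[keep] ** 2

    # Negative estimates mean no variation beyond Poisson noise; clip to zero.
    phi = np.clip(phi, 0.0, None)
    return float(np.median(phi))
```

In edgeR, a dispersion obtained this way would then be supplied explicitly when testing the full dataset, for example through the `dispersion` argument of `exactTest`. The estimate is only as good as the assumption that the chosen housekeeping genes really are stable across all four conditions.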

modified 4.5 years ago • written 4.5 years ago by Irsan7.2k

I'm late to this thread. I read the edgeR manual for this kind of analysis, but I need more explanation. Could you please tell me how to estimate the "random/background" variation based on housekeeping genes?

written 4.1 years ago by seta1.4k