Question: How to run differential expression analysis with edgeR when the count data has no replicates?
5.7 years ago by
gskbioinfo143 wrote:

Hi, I followed this tutorial for DE analysis using edgeR, but I was not able to plot the graphs.

My data contains counts for 20812 genes:

Gene    Control    CASE_HYD004
SEC24B-AS1    5    16
A1BG    0    3
A1CF    0    0
GGACT    2    0
A2M    8    1572
A2ML1    0    0
A2MP1    2    3
A4GALT    3    71
A4GNT    0    0
AAAS    79    60
AACS    66    567

I get an error saying there are no replicates. In this case, how can I calculate the dispersion using edgeR?

I would appreciate any suggestions, including whether there is a more accurate tool than edgeR for this situation.




It is a pretty bad idea to work without biological replicates when doing differential gene expression analysis, because there is no way to estimate biological variability from the data. The edgeR authors also explicitly advise against working without replicates. Nevertheless, there is a section on this topic in the edgeR user guide (page 19). Reading that section might be more useful than the normal tutorials, which assume replication.
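To make the user guide's suggestion more concrete: one of the options it describes is to pick a plausible biological coefficient of variation (BCV) from experience with similar data and use its square as the dispersion (in edgeR, this is passed as the `dispersion` argument of `exactTest`). The sketch below is plain Python rather than edgeR, and only shows how much spread an assumed BCV implies for a gene. The helper names are made up for illustration, the BCV values are the ones the user guide mentions as typical, and library-size normalization is skipped for simplicity.

```python
import math

# Typical BCV values mentioned in the edgeR user guide for unreplicated data.
# Choosing one is a judgment call; the data cannot confirm it.
SUGGESTED_BCV = {
    "human": 0.4,
    "genetically identical model organism": 0.1,
    "technical replicates": 0.01,
}


def nb_variance(mu, bcv):
    """Variance of a negative binomial count with mean mu and dispersion bcv**2."""
    dispersion = bcv ** 2
    return mu + dispersion * mu ** 2


def nb_sd(mu, bcv):
    """Standard deviation implied by the same model."""
    return math.sqrt(nb_variance(mu, bcv))


# AAAS from the question: 79 (Control) vs 60 (CASE_HYD004).
# Assumption: raw counts are compared directly, without normalization.
mu = (79 + 60) / 2
for label, bcv in SUGGESTED_BCV.items():
    print(f"{label}: BCV={bcv}, sd at mean {mu} = {nb_sd(mu, bcv):.1f}")
```

With a BCV of 0.4, the implied standard deviation for AAAS is about 29 counts, so the observed difference of 19 is well within what replicate-to-replicate noise alone could produce. The conclusion depends entirely on the assumed BCV, which is exactly why the authors discourage this design.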


You can also use the R packages DESeq or DESeq2 for the same purpose (although their methodology is slightly different), but the authors of those tools also strongly advise against using unreplicated data.

written 5.7 years ago by utzermel
5.7 years ago by
h.mon wrote:

As utzermel said, you should not infer DE without biological replicates - this paper and this tech note provide clear reasoning and examples of why not.

That said, funding reality oftentimes trumps best practices. The Trinity developers provide a script and tutorial for DE analysis even without replication. The caveat is that I believe it works only with a transcriptome assembled by Trinity, but you should be able to read the script and adapt it to your case.

5.7 years ago by
Seville, ES
Martombo wrote:
Since you only have one sample per condition, you cannot compute the dispersion directly. If you use DESeq or DESeq2, the program will still compute a "mock" dispersion for every gene by treating the two samples as replicates of the same condition. This relies on the assumption that most genes are not differentially expressed. That way you can still fit a dispersion-to-mean trend and produce a final moderated estimate of the dispersion of every gene. This dispersion is likely to be somewhat overestimated, but it will allow you to model your data with a negative binomial distribution. Given the weak statistical setup you probably won't get any significant DE genes, but it could be informative to look at the top ones.
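As a rough illustration of the "mock" dispersion idea, the sketch below treats the two samples as replicates of one condition and solves the negative binomial relation variance = mean + dispersion * mean^2 for the dispersion of each gene. This is a simple method-of-moments estimate and not what DESeq or DESeq2 actually run internally: it skips library-size normalization, trend fitting, and moderation, and the function name is made up for this example.

```python
from statistics import mean, variance


def mock_dispersion(count_a, count_b):
    """Method-of-moments NB dispersion, treating two samples as replicates.

    Solves var = mean + dispersion * mean**2 for the dispersion and clips
    negative values to zero. Returns None when both counts are zero.
    """
    m = mean([count_a, count_b])
    if m == 0:
        return None
    v = variance([count_a, count_b])  # sample variance (n - 1 denominator)
    return max((v - m) / m ** 2, 0.0)


# A few genes from the table in the question: (Control, CASE_HYD004).
# Assumption: raw counts are used as-is, without normalization.
counts = {"AAAS": (79, 60), "AACS": (66, 567), "A2M": (8, 1572), "A2ML1": (0, 0)}
for gene, (control, case) in counts.items():
    print(gene, mock_dispersion(control, case))
```

Genes with a large real difference, such as A2M, end up with a very large per-gene value, which is one way to see why the resulting dispersion tends to be overestimated and why the trend fitted across all genes matters more than any single per-gene estimate.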
5.7 years ago by
SmallChess wrote:

You need replicates to fit a negative binomial model, whose dispersion parameter describes the variability among replicates. Basically, the dispersion captures both the biological and the technical variability between your replicates.


You could also assume that biological variability can be ignored (not a good assumption) and mock a few values for each condition. For example:

    A4GALT    3

can be mocked into

    A4GALT    3  2 4

Here 2 and 4 are assumed to be the replicate values you would have obtained if you had repeated the experiment.
