DESeq, edgeR and cuffdiff: different results
11.9 years ago

Hi,

I performed differential expression analysis with DESeq, edgeR and cuffdiff on my data. Surprisingly, the three tools give different results. Here's a Venn diagram of my results. I get 1001 DE genes with DESeq, 1447 DE genes with edgeR and "only" 149 DE genes with cuffdiff. Can anyone explain why cuffdiff is so stringent?

Thanks,

[Image: Venn diagram of the DE genes called by DESeq, edgeR and cuffdiff]

rna-seq differential expression analysis • 24k views

You should specify what versions of each tool you're using. The methods are sometimes updated between versions (especially Cufflinks, which has seen significant updates recently).


Yes, there are a lot of differences between the Cufflinks versions! See this post: http://gettinggeneticsdone.blogspot.com/2012/04/rna-seq-methods-march-twitter-roundup.html


The latest version of each, I think: edgeR 2.6.7, DESeq 1.8.3, Cufflinks 2.0.0.


That is interesting. What is the overlap between DESeq and edgeR?


The intersection of DESeq and edgeR is 938 genes, so the overlap is very good.
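For readers who want to put numbers on that overlap, here is a minimal Python sketch that works only from the list sizes quoted in this thread (1001 for DESeq, 1447 for edgeR, 938 shared). The function name `overlap_summary` and its keys are made up for illustration; nothing here comes from the tools themselves.

```python
def overlap_summary(n_a, n_b, n_shared):
    """Summarise the overlap of two gene lists from their sizes alone."""
    union = n_a + n_b - n_shared
    return {
        "only_a": n_a - n_shared,          # called by A but not by B
        "only_b": n_b - n_shared,          # called by B but not by A
        "union": union,                    # called by at least one tool
        "frac_a_shared": n_shared / n_a,   # share of A's calls also in B
        "frac_b_shared": n_shared / n_b,   # share of B's calls also in A
        "jaccard": n_shared / union,       # intersection over union
    }


# A = DESeq, B = edgeR, using the counts reported in the thread.
summary = overlap_summary(n_a=1001, n_b=1447, n_shared=938)
for key, value in summary.items():
    print(key, round(value, 3))
```

With these sizes, about 94% of the DESeq calls are also edgeR calls, while roughly 65% of the edgeR calls are shared with DESeq, so the agreement looks much stronger from the DESeq side than from the edgeR side.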

11.9 years ago
Arun 2.4k

Cufflinks estimates the abundance of the different transcripts within a sample. Given biological replicates, edgeR and DESeq test whether the observed difference between two conditions can be attributed to the experimental condition rather than to biological variance. So I don't understand what you are calling differentially expressed across edgeR, DESeq and Cufflinks. You should describe your setup, experimental conditions and biological replicates, and say what you obtained from Cufflinks. Do you mean cuffdiff?

It's very important to know the question you are asking when conducting a statistical study. For example, cuffdiff asks whether a transcript's expression differs between two samples. edgeR and DESeq ask whether the difference in total expression (the total count for a gene, including all its isoforms) that you see between samples is due to the experimental condition alone and not to biological variability.
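To make the gene-level versus transcript-level distinction concrete, here is a small Python sketch. The counts, gene names and the helper `gene_totals` are all hypothetical and only illustrate the idea; this is not how any of the three tools computes its input.

```python
from collections import defaultdict

# Hypothetical isoform-level counts: (gene, transcript) -> (control, treated).
isoform_counts = {
    ("geneA", "tx1"): (100, 20),   # isoform goes down
    ("geneA", "tx2"): (20, 100),   # isoform goes up by the same amount
    ("geneB", "tx1"): (50, 150),   # single isoform, goes up
}


def gene_totals(counts):
    """Sum isoform counts per gene, separately for each condition."""
    totals = defaultdict(lambda: [0, 0])
    for (gene, _transcript), (control, treated) in counts.items():
        totals[gene][0] += control
        totals[gene][1] += treated
    return {gene: tuple(pair) for gene, pair in totals.items()}


totals = gene_totals(isoform_counts)
for gene, (control, treated) in sorted(totals.items()):
    print(gene, control, treated)
```

In this toy data geneA has the same total in both conditions even though both of its isoforms change, so a test on gene totals and a test on individual transcripts are answering different questions about it.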

Secondly, the differences I recall between edgeR and cufflinks are in the equation for the variance and in how the dispersion parameters are estimated. Given that you have biological replicates, I would expect more genes in common than you obtained. If you ran the test without replicates, it is difficult to say much, because the packages have very little information from which to estimate the dispersion parameters. With no replicates, DESeq tries to handle this by borrowing information from other genes with similar expression, if I am not mistaken.
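As a rough illustration of why the variance equation and the dispersion estimate matter, here is a Python sketch. It assumes the negative binomial form often used for count data, variance = mean + dispersion * mean^2, and a plain method-of-moments estimate. Both helpers (`nb_variance`, `moment_dispersion`) and the replicate counts are illustrative assumptions, not the estimators that any of these packages actually implement.

```python
from statistics import mean, variance


def nb_variance(mu, dispersion):
    """Variance under the assumed form: mean + dispersion * mean^2."""
    return mu + dispersion * mu ** 2


def moment_dispersion(counts):
    """Crude method-of-moments dispersion from replicate counts.

    Needs at least two replicates; statistics.variance raises
    StatisticsError when given a single value.
    """
    m = mean(counts)
    v = variance(counts)
    # Clamp at zero: the sample variance can fall below the mean.
    return max((v - m) / m ** 2, 0.0)


# Hypothetical replicate counts for one gene in one condition.
replicates = [80, 100, 120]
dispersion = moment_dispersion(replicates)
print(dispersion)
print(nb_variance(100, 0.0))         # dispersion 0 reduces to the Poisson case
print(nb_variance(100, dispersion))  # extra variance from the dispersion term
```

A larger estimated dispersion inflates the variance for the same mean, which makes a given fold change harder to call significant. With a single replicate the estimate cannot be computed from that gene at all, which is the situation described above.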

I would recommend filling in some gaps with regard to your setup to obtain (more) meaningful and helpful answers.


I mean cuffdiff, of course. I have a group of three control samples and a group of eight treated samples. I performed DE analysis using these three tools and compared the results.


How did you annotate your alignment? Did you feed the Cufflinks gene reports to DESeq/edgeR so that they are all reading off the same script?
