Question: DESeq2 and edgeR - no agreement between results
gravatar for ilyco
3.7 years ago by
United Kingdom
ilyco50 wrote:


I ran edgeR and DESeq2 on two datasets each with 5 conditions. For each condition, I have at least three replicates. However, the results don't seem to match up between DESeq2 and edgeR. Only about 20-30% of the genes called by DESeq as DE are also in the set called by edgeR. DESeq2 also seems to call more genes than edgeR but there is no clear trend across conditions.

I tried running DESeq2 with the same initial filtering as in edgeR and with independentFiltering=FALSE and the agreement percentages between the resulting sets of genes are not changed. 

Do you know what are the possible causes of this difference in results?

I tried different sets of inputs counts (transcripts from different sources, genes etc). 


edger rna-seq deseq2 • 3.8k views
ADD COMMENTlink written 3.7 years ago by ilyco50

What question are you asking?  DESeq2 and edgeR are different software packages and do, indeed, often give different results.  If you are concerned about your code, you'll need to post your code.  If you are asking about the differences in results, I'm not sure that we will be able to help too much, as some difference is expected.  

ADD REPLYlink written 3.7 years ago by Sean Davis25k

Some difference is indeed to be expected but a negative correlation between logFC changes and only a 20% match in results seems a bit over the top. I am looking for potential causes. The code is rather long and does not use anything new. I simply copy pasted the typical commands from the vignettes of the packages and set an FDR of 0.01. 

ADD REPLYlink written 3.7 years ago by ilyco50

20% match in results is, in my experience, not that unusual, particularly since you are using a low FDR (1%).  The negative correlation in logFC is simply a switch in the direction when calculating contrasts.  Just compare abs(logFC) to clear up that problem.  Finally, the estimates of logFC are not simple ratios in either DESeq2 or edgeR, so there will be some differences between the two packages.   

ADD REPLYlink written 3.7 years ago by Sean Davis25k

How many genes are called fo each?

  • The normalization used could be yield different results. Are the logFC you obtained correlated?
  • DESeq and EdgeR tests if the logFC is significanlty above a threshold ( |logFC|>thres ). Maybe they differ.
ADD REPLYlink written 3.7 years ago by cyril-cros890

In each case, DESeq2 calls about 2000 genes and edgeR about 1000-1400.

I checked the correlation between the logFC in DESeq2 and edgeR and it is very bad in every case. For one condition, it even correlates negatively. I am not sure why. 

ADD REPLYlink written 3.7 years ago by ilyco50

I've done a lot of this in the past and the logFCs should look very, very similar: basically a straight line along the x=y diagonal with perhaps some outliers. If not there's an issue with the raw counts and/or how the normalisation was done.

A negative correlation is probably due to the comparison being reversed in one of the two tools.

I would break this down by looking at the simple case first, only one comparison, and see if you get better agreement. If you do build it up from there.

ADD REPLYlink written 3.7 years ago by Chris Cole710

Thank you for the suggestion. It helped a lot to choose only one condition and check every step. The "normalised" counts correlate very well but logFC still correlates poorly between edgeR and DESeq2 for each condition.

ADD REPLYlink written 3.7 years ago by ilyco50

Can you show the count and logFC correlations?

ADD REPLYlink written 3.7 years ago by Chris Cole710

What kind of conditions are you using? 2000 genes seems a bit much. If you take a cutoff value for the adjusted p-value, do you get a more reasonable common set? Do you see different logFC values on the common set you previously had?

ADD REPLYlink modified 3.7 years ago • written 3.7 years ago by cyril-cros890

I applied a cutoff for the adjusted pvalue of 0.01. The different logFC values are for all the genes not just the differentially expressed ones. I ran the typical:

dds <- dds[ rowSums(counts(dds)) > 1, ]
dds <- DESeq(dds)
res <- results(dds)

in DESeq2 

and the typical:

y <- DGEList(counts = counts, group = condition)
keep <- rowSums(cpm(y) > 2) >= 3
y <- y[keep, , keep.lib.sizes=FALSE]
y <- calcNormFactors(y)
design <- model.matrix(~ 0 + condition, data = y$samples)
y <- estimateDisp(y, design, robust=TRUE) 
fit <- glmFit(y,design)

in edgeR.

My design is 6 replicates for condition A, 6 for condition B and 3 for condition C. I am interested in the contrasts between condition C and A and between B and A. After applying the contrasts, I took the logFC values of the common set of genes. They correlated negatively for C-A and poorly for B-A. 

ADD REPLYlink modified 3.7 years ago • written 3.7 years ago by ilyco50

DeSeq2: Can you post the results for your 0.01 FDR of summary(res)?  And a MA plot? I am still surprised by 2000 DE genes. Look up the DeSeq2 vignette (3.9 Tests of log2 fold change above or below a threshold). If too many genes do indeed change between your conditions, it might affect the normalization (everybody, please correct me if I am wrong), and DeSeq2/edgeR might be imprecise.

If you take only the 500 most changed genes with each method, do you see a better agreement between methods?

ADD REPLYlink modified 3.7 years ago • written 3.7 years ago by cyril-cros890

Here is a summary of the results with DESeq2:

> summary(res)
out of 15630 with nonzero total read count
adjusted p-value < 0.01
LFC > 0 (up)     : 2076, 13% 
LFC < 0 (down)   : 1660, 11% 
outliers [1]     : 1368, 8.8% 
low counts [2]   : 1366, 8.7% 
(mean count < 2)
[1] see 'cooksCutoff' argument of ?results
[2] see 'independentFiltering' argument of ?results


and an MA plot of the same object:

ADD REPLYlink modified 3.7 years ago • written 3.7 years ago by ilyco50

I you compare with the DeSeq2 MA plot in the manual, you see you have a lot more variations (just looking at the 1-2 range, not even the outliers). It seems the really lowly expressed genes were corrected, but you still have too many changing genes which is not good for DeSeq2 or EdgeR. They assume most genes have an unchanged expression. Do you really expect massive changes between your conditions?

I am a bit unsure about how to proceed from here. You might only consider the 500 genes with the highest adjusted p-value, maybe.

ADD REPLYlink written 3.7 years ago by cyril-cros890
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 858 users visited in the last hour