I ran DGE on my dataset using DESeq2 and Limma. I used the same dataset, similar settings and same GLM model (
~family+condition). The number of DEGs I get are as below (based on unadjusted p-values <0.05 and BH adjusted q-values <0.05 shown separately):
Tool num_degs_pval num_degs_qval DESeq2 3155 623 Limma 2853 32
I plotted qqplots using unadjusted pvalues from both DESeq2 and Limma results.
library(snpStats) qq.chisq(-2*log(deseq2_results$pvalue),df=2,pvals=T,overdisp=T) qq.chisq(-2*log(limma_results$pvalue),df=2,pvals=T,overdisp=T)
This is what I get:
- Is the qqplot useful in RNA-Seq to estimate DGE quality?
- In this example, does the better fit of Limma mean that the fewer set of DEGs returned by Limma is more reliable?
- Similarly, does the inflated curve for DESeq2 mean that the huge number of DEGs returned by DESeq2 contains more false positives?
Couple of caveats: The function I used
qq.chisq is for chisq tests and I am not sure if that is an issue since my Deseq2 script uses Wald test and Limma uses bayesian
eBayes test. The
df argument in
qq.chisq is set to 2 arbitrarily. I am not sure what value I should plug in here.