Question

Which one is better: edgeR or DEGseq?

9.9 years ago

Whoknows ▴ 960

Hi

My project is on rice. Cufflinks could not help me, so I'd like to use edgeR or DEGseq; edgeR has more citations, though.

Could you please suggest a simple pipeline for analyzing RNA-Seq data, from the FASTQ files to the end?

Thanks a lot.

RNA-Seq RPKM • 11k views

Ram · Answer 1 · 2014-06-20

Do you actually mean DEGseq, or do you instead mean DESeq? DEGseq should not be used by anyone for any reason whatsoever. Its statistical assumptions are wrong. DESeq (or rather DESeq2, which has a number of improvements), on the other hand, is quite good and is what I normally use.

So then the question becomes which of edgeR and DESeq2 is better. There's really no single answer to that. DESeq2 has integrated independent filtering and per-gene outlier detection (using Cook's distance), which generally makes me favor it. edgeR, however, is also nicely written and there's no reason that the exact same features couldn't be used with it, though that'd take more work on your part. Having said that, edgeR has nicer integration with things like camera() and roast(), which is unsurprising given the overlap in authors. In general, give both a try and do some independent validation to see which one models your data better. That's really the most objective way to make the determination.
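To make "independent filtering" a bit more concrete, here is a minimal Python sketch of the idea: drop genes with low mean counts before the multiple-testing correction, and pick the cutoff that yields the most rejections. This is a simplification for illustration, not DESeq2's actual implementation; the function names `bh_adjust` and `independent_filter`, the quantile grid, and the default `alpha` are all my own assumptions.

```python
import numpy as np

def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    p = np.asarray(pvalues, dtype=float)
    n = p.size
    order = np.argsort(p)
    ranked = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, walking down from the largest p-value.
    ranked = np.minimum.accumulate(ranked[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.minimum(ranked, 1.0)
    return adjusted

def independent_filter(mean_counts, pvalues, alpha=0.1,
                       quantiles=np.linspace(0.0, 0.9, 10)):
    """Try several mean-count cutoffs; keep the one with the most rejections.

    Returns (number of rejections, cutoff, adjusted p-values), where
    filtered-out genes get NaN as their adjusted p-value.
    """
    mean_counts = np.asarray(mean_counts, dtype=float)
    pvalues = np.asarray(pvalues, dtype=float)
    best = None
    for q in quantiles:
        cutoff = np.quantile(mean_counts, q)
        keep = mean_counts >= cutoff
        padj = np.full(pvalues.shape, np.nan)
        padj[keep] = bh_adjust(pvalues[keep])
        n_rejected = int(np.sum(padj[keep] < alpha))
        if best is None or n_rejected > best[0]:
            best = (n_rejected, cutoff, padj)
    return best
```

The filter statistic (mean count) is chosen because it says little about whether a gene is differentially expressed, so removing low-count genes reduces the multiple-testing burden without biasing the remaining p-values.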

Ram · Answer 2 · 2014-06-20

Here are some relevant publications describing various comparisons of the many tools in this area:

Comparison of software packages for detecting differential expression in RNA-seq studies
A comparison of methods for differential expression analysis of RNA-seq data
compcodeR-an R package for benchmarking differential expression methods for RNA-seq data
Comprehensive evaluation of differential gene expression analysis methods for RNA-seq data
A comparative study of techniques for differential expression analysis on RNA-Seq data (bioRxiv)

Ram · Answer 3 · 2014-06-20

9.9 years ago

Josh Herr 5.8k

There are many RNA-Seq pipelines -- and they are often debated between experts and users. I came up with my own for my specific project, but largely I use BWA/bowtie2 and feed the resulting mappings into Cufflinks and/or edgeR/DESeq. I would recommend Cufflinks -- what did not work for you? Many of us are working on organisms with much less information than rice -- what type of errors have you received?

Your pipeline depends on what you want to do, but you'll need to do the mapping first before you use DESeq or edgeR.
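The order of steps (map, sort, count, then hand the counts to DESeq or edgeR) can be sketched in Python as a list of shell commands. Nothing is executed here. bowtie2 is the aligner named above; using samtools for sorting and featureCounts for counting is my assumption, and the function name `build_pipeline`, the file names, and the `results` directory are hypothetical. Treat it as an outline of the order, not a recommended set of options.

```python
from pathlib import Path

def build_pipeline(fastq, index_prefix, annotation_gtf, out_dir="results"):
    """Return the commands for map -> sort -> count, in the order to run them."""
    sample = Path(fastq).name.split(".")[0]
    out = Path(out_dir)
    sam = (out / f"{sample}.sam").as_posix()
    bam = (out / f"{sample}.sorted.bam").as_posix()
    counts = (out / f"{sample}.counts.txt").as_posix()
    return [
        # 1. Map single-end reads against a prebuilt bowtie2 index.
        ["bowtie2", "-x", index_prefix, "-U", fastq, "-S", sam],
        # 2. Sort the alignments by coordinate and write BAM.
        ["samtools", "sort", "-o", bam, sam],
        # 3. Count reads per gene using the GTF annotation.
        ["featureCounts", "-a", annotation_gtf, "-o", counts, bam],
    ]

if __name__ == "__main__":
    for cmd in build_pipeline("reads/leaf_rep1.fastq.gz", "index/rice", "rice.gtf"):
        print(" ".join(cmd))
```

The per-sample count files from the last step would then be combined into one genes-by-samples table, which is the input the count-based tools work from.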

Here's a pretty exhaustive list of tools for RNA-Seq.

I would also recommend reading some papers in your area (I study plants and I know there are hundreds of papers on plant transcriptomes) -- what are other researchers using and why?

That's what I was wondering. There's even an iGenomes build for rice from Illumina that should play nicely with Cufflinks:

http://support.illumina.com/sequencing/sequencing_software/igenome.ilmn

Reply written 9.9 years ago by User 59 13k

Ram · Answer 4 · 2014-06-20

I use edgeR because I know how to use it and the documentation is extensive. The biggest challenge for people seems to be understanding how to define the design matrix and contrasts for using the linear modeling features. The documentation and its many examples help with that. But probably any of the methods mentioned above is fine. I prefer using the R libraries because then I can do everything in one language and environment, including making plots, doing exploratory data analysis, and so on. I don't like the cuff* programs because they produce too many files that are not well documented, and sometimes the p-values don't seem to make any sense. But maybe that's because I'm not using them the right way.
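Since the design matrix and contrasts are the usual stumbling block, here is a small Python sketch of what the two objects mean. It uses ordinary least squares on made-up log2 expression values for a single gene, which is an assumption for illustration only: edgeR itself fits negative binomial models to counts, and the sample layout and numbers below are hypothetical.

```python
import numpy as np

# Hypothetical layout: 3 control and 3 treated samples.
groups = ["control", "control", "control", "treated", "treated", "treated"]
levels = ["control", "treated"]

# Design matrix without an intercept: one indicator column per group.
design = np.array([[1.0 if g == level else 0.0 for level in levels]
                   for g in groups])

# Made-up log2 expression values for one gene, one value per sample.
log_expr = np.array([5.0, 5.2, 4.8, 7.0, 7.1, 6.9])

# Least-squares fit: each coefficient is that group's mean log2 expression.
coef, *_ = np.linalg.lstsq(design, log_expr, rcond=None)

# Contrast "treated minus control": weights applied to the coefficients.
contrast = np.array([-1.0, 1.0])
log_fc = float(contrast @ coef)

print(dict(zip(levels, np.round(coef, 3))), "log2 fold change:", round(log_fc, 3))
```

Each row of the design matrix describes one sample, each column one coefficient to estimate, and a contrast is just a weighted combination of those coefficients that answers a specific question.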

Ram · Answer 5 · 2014-06-20

Sounds like you've already gotten a lot of feedback, but here are my two cents:

Differential Expression Benchmarks: http://cdwscience.blogspot.com/2013/11/rna-seq-differential-expression.html
Simplified RNA-Seq Analysis Pipeline (sRAP): http://www.bioconductor.org/packages/release/bioc/html/sRAP.html

sRAP won't handle the first couple of steps, but it sounds like you already have RPKM values from Cufflinks, and that is the starting point where sRAP takes over. I'm not saying the differential expression step in sRAP is necessarily much better than DESeq, limma, etc., but you did specifically ask about a pipeline (beyond just the differential expression step) and I happen to have developed such a tool ;)
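For reference, RPKM (reads per kilobase of transcript per million mapped reads) can be computed from raw counts as in the Python sketch below. The function name `rpkm` and the example numbers are hypothetical, and the total number of mapped reads is approximated here by the sum of the counts passed in.

```python
import numpy as np

def rpkm(counts, gene_lengths_bp):
    """RPKM for one sample: counts * 1e9 / (gene length in bp * total reads)."""
    counts = np.asarray(counts, dtype=float)
    lengths = np.asarray(gene_lengths_bp, dtype=float)
    total_reads = counts.sum()  # assumption: all mapped reads are in `counts`
    return counts * 1e9 / (lengths * total_reads)

if __name__ == "__main__":
    # Three made-up genes: raw read counts and gene lengths in base pairs.
    print(rpkm([100, 300, 600], [1000, 2000, 3000]))
```

Note that count-based tools such as edgeR and DESeq2 expect raw counts rather than RPKM values as input.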