Stringtie: use htseq-count or prepDE.py to extract reads?
5.0 years ago
heyang • 0

Hi All,

I am currently working on the downstream analysis of my RNA-seq data. I have been using StringTie and its prepDE.py script to extract read counts for differential expression (DE) analysis with DESeq2, basically following http://ccb.jhu.edu/software/stringtie/index.shtml?t=manual. I am also thinking of comparing the DE results with edgeR.

I came across a tutorial (https://github.com/griffithlab/rnaseq_tutorial/wiki/Expression), where they ran htseq-count on alignments instead to produce raw counts for edgeR.

I have read forum questions saying that the two outputs differ.
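
One likely reason the two outputs differ: htseq-count tallies aligned reads that overlap annotated features, whereas prepDE.py derives counts from the per-transcript coverage that StringTie reports, using the formula given in the StringTie manual (reads_per_transcript = coverage * transcript_len / read_len). A minimal Python sketch of that formula follows. The function name, the rounding-up step, the default read length, and the example numbers are assumptions made for illustration and do not come from this thread.

```python
import math


def estimated_read_count(coverage: float, transcript_len: int, read_len: int = 75) -> int:
    """Approximate the number of reads behind a transcript's mean coverage.

    Implements reads_per_transcript = coverage * transcript_len / read_len.
    Rounding up to a whole read and the default read_len of 75 are
    assumptions of this sketch.
    """
    if read_len <= 0:
        raise ValueError("read_len must be positive")
    return math.ceil(coverage * transcript_len / read_len)


# Hypothetical transcript: 1500 bp long, mean per-base coverage of 10,
# sequenced with 75 bp reads.
example_count = estimated_read_count(10.0, 1500, 75)
```

Because the result depends on the read length passed in, an estimate like this will not generally match a direct tally of overlapping alignments, so some disagreement between the two count tables is expected.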

So, my question is: which read counts should I use?

Many thanks!

Stringtie RNA-Seq DESeq2 edgeR • 4.5k views

I will put a question back to you: why did you choose to use StringTie? Presumably you were interested in de novo transcriptome assembly (via HISAT2 / StringTie)?

Results will of course differ between the two approaches, but I would expect the real hits to be found in both datasets. It is the genes on the fringes of expression and/or statistical significance that will differ.
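
To check this expectation on your own data, one option is to run the same DE analysis on both count tables and compare the sets of significant genes. The following is a rough Python sketch, assuming each result has already been reduced to a mapping from gene ID to adjusted p-value. The function names, the 0.05 cutoff, and the gene IDs are hypothetical.

```python
from typing import Dict, Optional, Set


def significant(results: Dict[str, Optional[float]], alpha: float = 0.05) -> Set[str]:
    """Return gene IDs whose adjusted p-value is below alpha.

    Missing values (None, e.g. genes filtered out by the DE tool) are skipped.
    """
    return {gene for gene, padj in results.items() if padj is not None and padj < alpha}


def compare_hits(
    a: Dict[str, Optional[float]],
    b: Dict[str, Optional[float]],
    alpha: float = 0.05,
) -> Dict[str, Set[str]]:
    """Split significant genes into those shared by both results and those unique to one."""
    sig_a, sig_b = significant(a, alpha), significant(b, alpha)
    return {
        "shared": sig_a & sig_b,
        "only_a": sig_a - sig_b,
        "only_b": sig_b - sig_a,
    }


# Hypothetical adjusted p-values from the two counting approaches.
prepde_based = {"geneA": 0.001, "geneB": 0.04, "geneC": 0.20}
htseq_based = {"geneA": 0.002, "geneB": 0.06, "geneC": 0.30}

overlap = compare_hits(prepde_based, htseq_based)
for label in ("shared", "only_a", "only_b"):
    print(label, sorted(overlap[label]))
```

Genes that end up in `only_a` or `only_b` are the fringe cases described above and are worth inspecting before drawing conclusions from either list.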

5.0 years ago
ATpoint 82k

I recommend using tximport to correct for length bias between transcripts of the same gene. It is from the same developer as DESeq2, and the two are fully integrated with each other. You might also consider using salmon for quantification rather than classical alignment. Salmon features an elaborate way of dealing with multimappers and can correct for GC bias. Also, save yourself some time and do not start comparing edgeR and DESeq2. A proper comparison is not straightforward and requires extensive knowledge of exactly how the two tools perform the analysis in order to make the comparison fair, adequate, and reproducible. Better to read benchmarking papers or blogs like this one from the DESeq2 developer:

https://mikelove.wordpress.com/2016/09/28/deseq2-or-edger/

Both tools are established and perform well. For most users it comes down to choosing the one you feel more comfortable with. In any case, try to validate important genes that you use to make a hypothesis with either independent experiments or published datasets from a comparable setup if possible.

5.0 years ago

I wrote a section about considerations for quantification of RNA-seq data in my vignette that might be useful.
