Question

What is the reason for the big difference in DEG estimation compared to the previous estimation?

5.4 years ago

anc.informatics • 0

I tried to estimate the DEGs for a set of transcriptomes from a study. In my analysis pipeline, after adapter removal with Trim Galore, I mapped the reads to the reference using HISAT2 and then estimated counts using featureCounts. These raw counts were the input to DESeq2 for DEG estimation, following the tutorial.

My problem is that only around 30-40% of the top up- or down-regulated DEGs reported in the study (for example, among the top 600 genes; the full DEG list is provided as supplementary files) match my estimation. Please note that the GTF file and genome index files are the same in both analyses.

So at what point does the big difference arise? Is there something wrong with what I did in my pipeline? I am aware that Tophat is outdated, but would it make such a big difference in the estimation?

RNA-Seq Tophat DEGs • 1.0k views

updated 5.4 years ago by jared.andrews07 ★ 16k • written 5.4 years ago by anc.informatics • 0

score 1 · Answer 1 · 2018-12-07

Well, it's not necessarily Tophat, but Cufflinks and DESeq2 are very different methodologies. I would not be overly worried if you don't capture 100% of their results. How does it look in the opposite direction (how many of your identified DEGs are in their list)? You can't compare apples to apples, since you are using a very different (and better) method. You didn't necessarily do anything wrong.
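If it helps, here is a minimal sketch of checking the overlap in both directions. It assumes you have already extracted the gene IDs from your DESeq2 results and from the study's supplementary list; the function name `overlap_report` and the toy gene IDs are hypothetical and only there for illustration.

```python
def overlap_report(mine, published):
    """Compare two collections of gene IDs in both directions."""
    mine, published = set(mine), set(published)
    shared = mine & published
    return {
        # Number of genes called DE in both analyses
        "shared": len(shared),
        # Fraction of the published DEGs that your analysis also finds
        "frac_of_published_recovered": len(shared) / len(published) if published else 0.0,
        # Fraction of your DEGs that also appear in the published list
        "frac_of_mine_in_published": len(shared) / len(mine) if mine else 0.0,
    }


# Hypothetical toy gene IDs, not taken from any real dataset
my_degs = ["geneA", "geneB", "geneC", "geneD", "geneE"]
published_degs = ["geneB", "geneC", "geneF", "geneG"]

report = overlap_report(my_degs, published_degs)
print(report)
```

The two fractions can differ a lot when the lists have different sizes, so it is worth looking at both before concluding that the pipelines disagree. The IDs must use the same identifier type in both lists (for example, both Ensembl IDs or both gene symbols), otherwise the overlap will be artificially low.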