StringTie de novo vs. reference-guided (-G)
7 months ago

Hi Community,

I am currently performing RNA-seq analysis of a human dataset, and my aim is to find novel transcripts and isoforms. I have aligned the sequences to the reference genome using HISAT2 and assembled the transcripts using StringTie in both reference-guided and de novo modes. When I looked at the number of assembled transcripts, I saw that reference-guided mode assembled about twice as many transcripts as de novo mode:

stringtie denovo: exon 400375; transcript 59007

stringtie reference-guided (-G): exon 708407; transcript 121080

My question is: is it normal to see this difference? If so, could you please help me understand the reason, or point me to a relevant article?

Thanks in advance.

reference-guided denovo stringtie RNA-seq
7 months ago

The StringTie manual states:

Although StringTie is primarily a genome-guided approach, it can borrow algorithmic techniques from de novo genome assembly to help with transcript assembly.

The above implies that StringTie will perform more effectively in guided mode, which is consistent with what you observe. That said, how large the difference should be depends on many properties of both your sequencing data and the reference annotation.

I would recommend investigating and comparing the regions where one approach produces more transcripts than the other:

  • What commonalities can you observe?
  • What tradeoffs can you observe?
  • Which assembly appears to be more trustworthy?
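As a starting point for that comparison, here is a minimal sketch in plain Python that counts `transcript` features per genomic window in each GTF and lists the windows where the two assemblies differ most. The function names, the 1 Mb window size, and the command-line usage are assumptions made for illustration; they are not part of StringTie. The sketch only assumes the standard tab-separated GTF layout (feature type in column 3, start coordinate in column 4).

```python
import sys
from collections import Counter


def count_transcripts_per_bin(gtf_lines, bin_size=1_000_000):
    """Count 'transcript' features per (chromosome, window index).

    bin_size is an arbitrary illustrative window size, not a StringTie setting.
    """
    counts = Counter()
    for line in gtf_lines:
        # Skip blank lines and comment/header lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        # A GTF record has 9 tab-separated columns; column 3 is the feature type.
        if len(fields) < 9 or fields[2] != "transcript":
            continue
        chrom, start = fields[0], int(fields[3])
        # GTF coordinates are 1-based, so shift by one before binning.
        counts[(chrom, (start - 1) // bin_size)] += 1
    return counts


def compare_bins(counts_a, counts_b):
    """Return (window, count_a, count_b) for windows whose counts differ,
    sorted by the size of the difference, largest first."""
    diffs = []
    for key in set(counts_a) | set(counts_b):
        a, b = counts_a.get(key, 0), counts_b.get(key, 0)
        if a != b:
            diffs.append((key, a, b))
    diffs.sort(key=lambda d: (-abs(d[1] - d[2]), d[0]))
    return diffs


# Hypothetical usage: python compare_bins.py guided.gtf denovo.gtf
if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1]) as fa, open(sys.argv[2]) as fb:
        guided = count_transcripts_per_bin(fa)
        denovo = count_transcripts_per_bin(fb)
    for (chrom, idx), a, b in compare_bins(guided, denovo)[:20]:
        print(f"{chrom}\twindow {idx}\tguided={a}\tdenovo={b}")
```

The windows at the top of the list are good candidates to open in a genome browser alongside the alignments. For a transcript-level comparison against the annotation, a dedicated tool such as gffcompare is likely a better fit than this rough count.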
