How to avoid MSTRG from StringTie
6.0 years ago
gozrom ▴ 80

Hi, there,

I've used the HISAT-StringTie-Ballgown pipeline with the mouse Ensembl release 91 GTF file, and part of the StringTie output is labelled with MSTRG IDs instead of gene names.

The number of these MSTRG entries can be high, and I'm not sure whether they are real: there seem to be too many for them all to be new transcripts, so they may be unassembled transcripts.

Is there a different way to run the analysis that avoids what is probably a technical issue? Using a different assembler?

Or a different gtf file?

This issue has been raised before; however, no good answer was provided:

Gene names in Ballgown differential expression analysis
How to deal with MSTRG tag without relevant gene name?
Converting MSTRG from stringtie with gene name
https://stackoverflow.com/questions/47621574/search-and-replace-between-two-files-post2

Thank you.

RNA-Seq • 4.5k views

I guess you should be using better annotations (gtf file).


I used this one: ftp://ftp.ensembl.org/pub/release-91/gtf/mus_musculus/Mus_musculus.GRCm38.91.gtf.gz Are you suggesting that this is not good enough?


I am not sure whether the GTF you mentioned includes the complete transcriptome annotation. If you would like to restrict StringTie to the transcripts in the GTF you supplied, use the -e option (see http://ccb.jhu.edu/software/stringtie/index.shtml?t=manual) when running stringtie. Try the -C option as well: it writes out the reference transcripts that are fully covered by reads, which helps tell well-supported annotated transcripts apart from novel (MSTRG) ones, if there are any.
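For reference, the -e / -C suggestion can be sketched as a small Python helper that assembles the StringTie command line. This is only a sketch: the function name and all file names are made up for illustration, and the flags used (-G, -e, -o, -p, -C) are the ones listed in the StringTie manual linked above.

```python
import shlex


def build_stringtie_cmd(bam, ref_gtf, out_gtf, cov_refs_gtf=None, threads=1):
    """Build a StringTie command that only quantifies reference transcripts.

    All arguments are file paths chosen by the caller (hypothetical here).
    """
    cmd = [
        "stringtie", bam,
        "-G", ref_gtf,       # reference annotation (required by -e and -C)
        "-e",                # estimate only transcripts present in the -G file
        "-o", out_gtf,       # output GTF with the quantified transcripts
        "-p", str(threads),  # number of processing threads
    ]
    if cov_refs_gtf is not None:
        # -C writes the reference transcripts that are fully covered by reads
        cmd += ["-C", cov_refs_gtf]
    return cmd


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    cmd = build_stringtie_cmd(
        "sample1.sorted.bam",
        "Mus_musculus.GRCm38.91.gtf",
        "sample1.gtf",
        cov_refs_gtf="sample1.cov_refs.gtf",
        threads=4,
    )
    print(shlex.join(cmd))  # pass `cmd` to subprocess.run(cmd, check=True) to execute
```

One caveat, as far as I understand it: -e only keeps MSTRG IDs out of the output if the file given to -G is the original reference annotation. If -G points at the output of `stringtie --merge`, the gene IDs in that merged file are themselves MSTRG IDs and will carry through.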


I did use this option, as described in the Pertea 2016 HISAT-StringTie-Ballgown paper. The MSTRG entries are still there.


These MSTRG IDs are a nightmare. I've also tried the Python script here:

https://gist.github.com/gpertea/b83f1b32435e166afa92a2d388527f4b

but in the end without success.

Any update on this issue?

Thank you
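For anyone attempting the same mapping by hand, here is a minimal Python sketch (not the script from the gist above) that collects, for each MSTRG gene ID in a StringTie GTF, the reference gene names attached to its transcripts. It assumes that transcripts matching the reference carry a gene_name or ref_gene_id attribute, as is usually the case in `stringtie --merge` output; the function name is my own. MSTRG IDs that end up with an empty list have no reference match and are candidate novel loci.

```python
import re
from collections import defaultdict

# Matches GTF attribute pairs such as: gene_id "MSTRG.1";
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def mstrg_to_ref_names(gtf_lines):
    """Map each MSTRG gene_id to the sorted reference names of its transcripts."""
    mapping = defaultdict(set)
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "transcript":
            continue  # only transcript records are inspected
        attrs = dict(ATTR_RE.findall(fields[8]))
        gene_id = attrs.get("gene_id", "")
        if not gene_id.startswith("MSTRG."):
            continue
        names = mapping[gene_id]  # creates the entry even without a match
        # Assumption: prefer gene_name, fall back to ref_gene_id.
        name = attrs.get("gene_name") or attrs.get("ref_gene_id")
        if name:
            names.add(name)
    return {gene: sorted(names) for gene, names in mapping.items()}


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python mstrg_names.py stringtie_merged.gtf
    with open(sys.argv[1]) as handle:
        for gene, names in mstrg_to_ref_names(handle).items():
            print(gene, ",".join(names) if names else "no_reference_match", sep="\t")
```

A single MSTRG locus can overlap several reference genes, which is why the values are lists rather than single names.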
