I am trying to generate a more complete transcriptome with StringTie so that I can look for lowly expressed transcripts. I took two read files from the same study of Opisthorchis viverrini, aligned them together (as if they were replicates) to the genome with STAR, and converted the SAM output to BAM with samtools. My StringTie command is below.
stringtie -p 8 -G Oviv_Annos.gff3 New_Annos.bam -o New_Annos.gtf
StringTie runs with no errors, but when I extract the sequences from New_Annos.gtf, every transcript is annotated "STRG". I was hoping that only the novel transcripts would carry this notation. Is there an option I am missing that would do this?
Edit (FIXED): I think I have figured out the problem. The issue seems to be that running both samples as one sample gave annotations containing nothing but STRGs. Running StringTie on each sample individually to generate a separate annotation per sample, and then merging those annotations, gave the original annotations plus the novel STRGs.
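For anyone following along, the per-sample-then-merge approach described above could be sketched roughly as follows. This is not the exact set of commands I ran: the helper name assemble_and_merge, the STRINGTIE override variable, and the sample file names in the usage line are made up for illustration. It assumes one BAM per sample, each sorted by coordinate (StringTie expects sorted input, e.g. from samtools sort), and file names without spaces.

```shell
# Hypothetical helper, not part of StringTie itself.
# STRINGTIE can be overridden, e.g. STRINGTIE=echo for a dry run that only prints the arguments.
STRINGTIE="${STRINGTIE:-stringtie}"

# Usage: assemble_and_merge <reference.gff3> <merged.gtf> <sample1.bam> [sample2.bam ...]
assemble_and_merge() {
    ref="$1"
    merged="$2"
    shift 2
    gtfs=""
    for bam in "$@"; do
        gtf="${bam%.bam}.gtf"
        # Reference-guided assembly of one sample (BAM must be coordinate-sorted).
        "$STRINGTIE" -p 8 -G "$ref" -o "$gtf" "$bam" || return 1
        gtfs="$gtfs $gtf"
    done
    # Merge the per-sample assemblies, again guided by the reference annotation.
    # $gtfs is deliberately unquoted so that it splits into one argument per file.
    "$STRINGTIE" --merge -p 8 -G "$ref" -o "$merged" $gtfs
}

# Example call (sample file names are placeholders):
#   assemble_and_merge Oviv_Annos.gff3 merged.gtf sample1.bam sample2.bam
```

Note that stringtie --merge names its own genes with the MSTRG prefix by default, so the merged file may mix reference IDs with MSTRG/STRG-style ones.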
The "running samples as one" approach gave ~21,000 transcripts, while the "running samples individually, then merging" approach gave ~30,000 transcripts. It seems that the latter approach may actually give a deeper view of the transcriptome.
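A quick way to compare the two outputs is to count transcript records by the style of their transcript_id. The sketch below is only an illustration: the function name count_transcripts is made up, and it assumes a standard tab-separated GTF in which StringTie-assigned IDs start with STRG. or MSTRG. and everything else comes from the reference annotation. An STRG-style ID by itself does not prove a transcript is novel, so treat the numbers as a rough guide.

```shell
# Hypothetical helper: count "transcript" records in a GTF, split by ID style.
# Usage: count_transcripts <file.gtf>
count_transcripts() {
    awk -F '\t' '
        $3 == "transcript" {
            # Pull the value of transcript_id out of the attribute column.
            if (match($9, /transcript_id "[^"]+"/)) {
                id = substr($9, RSTART + 15, RLENGTH - 16)
                # STRG.* (single run) or MSTRG.* (merge mode) are StringTie-assigned IDs.
                if (id ~ /^M?STRG\./) strg++; else other++
            }
        }
        END { printf "other=%d strg=%d total=%d\n", other, strg, other + strg }
    ' "$1"
}

# Example call:
#   count_transcripts New_Annos.gtf
```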
Thanks to those who offered advice.