stringtie coverage -c parameter?
18 months ago
MACRODER • 0

I have used stringtie in de novo mode to assemble transcripts from my own RNA-seq data. The command I used was:

stringtie input.bam -o transcripts.gtf --rf


In IGV, I visually compared the input.bam file, the original annotation file for the genome (from a poorly annotated protozoan) and the transcripts.gtf produced by stringtie, and found that some of the novel transcripts have really low RNA-seq read coverage. So I would like to redo the analysis with a coverage threshold. I understand that this is controlled by the -c parameter.

From the manual: -c: Sets the minimum read coverage allowed for the predicted transcripts. A transcript with a lower coverage than this value is not shown in the output. Default: 1.

Now my problem is: how is this coverage calculated? Does -c 1 mean that stringtie will keep all transcripts with at least 1 read? If I change this value to, say, -c 80, will it keep transcripts with 80 or more reads? My RNA-seq data is paired-end.

Sorry if my vocabulary is not accurate; this is the first time I have done a transcriptome assembly. Thank you!

18 months ago
nlehmann ▴ 130

From the StringTie manual (in the "output" section), coverage is defined as:

cov: The average per-base coverage for the transcript or exon.
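
As a rough illustration of what "average per-base coverage" means, the sketch below estimates it as (number of reads × read length) / transcript length. This is only a back-of-the-envelope approximation and not StringTie's actual computation: it assumes every read aligns over its full length inside the transcript, and it does not model spliced alignments, multi-mapping reads or how paired-end mates are counted. The function name approx_cov and the numbers are hypothetical.

```shell
# approx_cov N_READS READ_LEN TRANSCRIPT_LEN
# Rough estimate of average per-base coverage (an approximation,
# not StringTie's own calculation).
approx_cov() {
  awk -v n="$1" -v rl="$2" -v tl="$3" \
    'BEGIN { printf "%.2f\n", n * rl / tl }'
}

# One 100 bp read on a 1000 bp transcript: well below a threshold of 1.
approx_cov 1 100 1000     # prints 0.10

# 800 reads of 100 bp on the same transcript: average depth of 80.
approx_cov 800 100 1000   # prints 80.00
```

Under these assumptions, a threshold such as -c 80 corresponds to roughly 800 such reads on a 1 kb transcript, not 80.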

So,

Does -c 1 mean that stringtie will keep all transcripts with at least 1 read? If I change this value to, say, -c 80, will it keep transcripts with 80 or more reads?

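Not quite, because -c is a threshold on this average per-base coverage, not on a read count. -c 1 keeps transcripts whose bases are covered by at least one read on average, so a long transcript supported by a single short read falls below it. Likewise, -c 80 keeps transcripts with an average depth of 80 or more, which for a transcript longer than your reads takes more than 80 reads. You can run it with different values of -c and visualize the results in IGV to see how each threshold changes your stringtie annotation.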
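
One way to try several -c values is a small loop around the command from the question. The sketch below is a minimal example, not a recommended pipeline: the function name sweep_coverage, the DRY_RUN switch, the output names transcripts_c<N>.gtf and the thresholds 1, 5, 10 and 20 are all hypothetical choices. Only input.bam, -o, --rf and -c come from the thread. By default it just prints the commands so they can be checked first.

```shell
# sweep_coverage BAM THRESHOLD [THRESHOLD ...]
# Builds one StringTie command per -c threshold, writing each result
# to its own GTF file. With DRY_RUN=1 (the default here) the commands
# are only printed; set DRY_RUN=0 to actually run stringtie.
sweep_coverage() {
  bam="$1"
  shift
  for c in "$@"; do
    out="transcripts_c${c}.gtf"   # hypothetical output naming scheme
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "stringtie $bam -o $out --rf -c $c"
    else
      stringtie "$bam" -o "$out" --rf -c "$c"
    fi
  done
}

# Print the commands for four example thresholds.
sweep_coverage input.bam 1 5 10 20
```

Each resulting GTF can then be loaded as a separate track in IGV next to input.bam to compare which transcripts survive each threshold.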