Question: How does StringTie calculate transcript coverage?
16 months ago by
ddzhangzz90
United States
ddzhangzz90 wrote:

I recently used StringTie to quantify RNA-seq reads mapped to transcripts. One gene has two transcripts with exactly the same length and number of exons (and, as far as I can tell, the same assembled structure), yet their coverage values are very different.

 ##transcript
t_id    chr     strand  start   end     t_name                  num_exons length  gene_id                 gene_name cov             FPKM
77237   chr17   -       7668402 7687538 ENST00000269305.7       11      2579    ENSG00000141510.14      TP53    31.946598       5.549151
77238   chr17   -       7668402 7687538 ENST00000620739.3       11      2579    ENSG00000141510.14      TP53    2.961419        0.514401


I am wondering how StringTie calculates the coverage. If my understanding of the definition is correct, the coverage is calculated as $\mathrm{cov} = \frac{1}{\mathrm{transcript\_length}} \sum_{i=1}^{m} \mathrm{seq}_i \cdot \mathrm{mapped\_seq\_length}_i$. If this is true, I would expect the coverage of these two transcripts to be similar, so why are they so different?
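To make the expectation concrete, here is a minimal sketch of that naive formula in Python. It assumes each read has already been assigned to exactly one transcript and contributes its aligned length; the function name `naive_coverage` and the read values are hypothetical, and this is not StringTie's actual procedure. Note that the formula says nothing about how a read compatible with both isoforms should be assigned, which is where the two values can diverge.

```python
def naive_coverage(aligned_lengths, transcript_length):
    """Average per-base coverage: total aligned bases / transcript length.

    aligned_lengths: aligned length (in bases) of each read ASSUMED to be
    assigned to this transcript. The assignment step is not modeled here.
    """
    if transcript_length <= 0:
        raise ValueError("transcript_length must be positive")
    return sum(aligned_lengths) / transcript_length


# Hypothetical input: 1000 reads, each aligning 75 bases, on a 2579-base transcript
example_reads = [75] * 1000
example_cov = naive_coverage(example_reads, 2579)
print(f"naive coverage: {example_cov:.3f}")
```

Under this formula, two transcripts of the same length would only get the same coverage if the same number of aligned bases were assigned to each.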

modified 16 months ago by geo.pertea70 • written 16 months ago by ddzhangzz90

Did you find a solution anywhere else? We are struggling with the same question, and it is not explained clearly anywhere.

You may want to follow up on this GitHub issue; maybe someone is listening there:

https://github.com/gpertea/stringtie/issues/162

16 months ago by
geo.pertea70
geo.pertea70 wrote:

Please see this answer about how coverage values are calculated by StringTie. Transcript and exon coverage values for overlapping transcripts (alternate isoforms) are calculated after distributing the read alignments according to the maximum flow algorithm -- it's not as simple as applying a formula.

For this particular question, without further data I presume that ENST00000269305.7 and ENST00000620739.3 are somehow distinct isoforms (so not exactly identical in their intron-exon structure, otherwise one of them would be discarded when the input file is loaded).
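One way to check whether two isoforms really differ is to compare their exon coordinates in the annotation. The following is a rough sketch, assuming a standard tab-separated GTF with `exon` features and a `transcript_id` attribute; the helper names and the toy records (`T1`, `T2`, `chrX`) are hypothetical and are not taken from the actual TP53 annotation.

```python
from collections import defaultdict


def exon_structures(gtf_text):
    """Map transcript_id -> sorted list of (start, end) exon intervals."""
    exons = defaultdict(list)
    for line in gtf_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue
        transcript_id = None
        # Attributes look like: gene_id "G1"; transcript_id "T1";
        for part in fields[8].split(";"):
            tokens = part.strip().split(None, 1)
            if len(tokens) == 2 and tokens[0] == "transcript_id":
                transcript_id = tokens[1].strip('"')
        if transcript_id is not None:
            exons[transcript_id].append((int(fields[3]), int(fields[4])))
    return {tid: sorted(iv) for tid, iv in exons.items()}


def differing_exons(structures, a, b):
    """Exon intervals present in exactly one of the two transcripts."""
    return sorted(set(structures[a]) ^ set(structures[b]))


# Toy annotation: same span (100-400), same exon count, same total length,
# but different internal exon boundaries.
toy_gtf = "\n".join([
    'chrX\ttoy\texon\t100\t200\t.\t-\t.\tgene_id "G1"; transcript_id "T1";',
    'chrX\ttoy\texon\t300\t400\t.\t-\t.\tgene_id "G1"; transcript_id "T1";',
    'chrX\ttoy\texon\t100\t190\t.\t-\t.\tgene_id "G1"; transcript_id "T2";',
    'chrX\ttoy\texon\t290\t400\t.\t-\t.\tgene_id "G1"; transcript_id "T2";',
])
structures = exon_structures(toy_gtf)
print(differing_exons(structures, "T1", "T2"))
```

An empty list would mean the two transcripts have identical exon coordinates; any intervals returned show where the structures diverge, even when start, end, exon count, and length all match.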