Zero Coverage for Some Exons andTranscripts ater quantification with StringTie
0
0
Entering edit mode
8 days ago
Emy Alade • 0

I am using StringTie v3.0.0 to discover novel transcripts from short-read RNA-seq data. After quantification, I noticed that some transcripts/exons have a coverage (cov) of 0, even though reads are aligned to these regions according to samtools view. Pipeline Used:

1-Transcript assembly per sample:

`stringtie -p 8 -c 2 -j 2 "$BAM_FILE" -G $GTF_FILE -o "$OUTPUT_GTF"`

2-Merging assembled transcripts:

`stringtie --merge $INPUT_DIR/*reconstructed.gtf -o $OUT_DIR/stringtie_merged_without_ref_output.gtf`

3-Quantification of transcripts:

`stringtie -p 12 -e -B $BAM_FILE -G $GTF_merged -o $OUTPUT_GTF/*_quantified.gtf`

Issue Observed:

For example, the transcript MSTRG.7.2 has zero coverage in t_data.ctab and _quantified.gtf despite read alignments. Results from grep in the .gtf file:

grep MSTRG.7.2 X2-WT_quantified.gtf

1   StringTie   transcript  4561613 4567577 .   -   .   gene_id "MSTRG.7"; transcript_id "MSTRG.7.2"; cov "0.0"; FPKM "0.000000"; TPM "0.000000";
1   StringTie   exon    4561613 4562891 .   -   .   gene_id "MSTRG.7"; transcript_id "MSTRG.7.2"; exon_number "1"; cov "0.0";
1   StringTie   exon    4563995 4564086 .   -   .   gene_id "MSTRG.7"; transcript_id "MSTRG.7.2"; exon_number "2"; cov "0.0";
1   StringTie   exon    4565359 4566165 .   -   .   gene_id "MSTRG.7"; transcript_id "MSTRG.7.2"; exon_number "3"; cov "0.0";
1   StringTie   exon    4566514 4567577 .   -   .   gene_id "MSTRG.7"; transcript_id "MSTRG.7.2"; exon_number "4"; cov "0.0";

Check in t_data.ctab:

grep MSTRG.7.2 t_data.ctab

t_id    chr strand  start   end t_name  num_exons   length  gene_id gene_name   cov FPKM
9   1   -   4561613 4567577 MSTRG.7.2   4   3242    MSTRG.7 .   0.000000    0.000000

Check in e_data.ctab (exon coverage):

    e_id    chr strand  start   end rcount  ucount  mrcount cov cov_sd  mcov    mcov_sd
    46  1   -   4561613 4562891 56  56  56.00   2.6654  1.8036  2.6654  1.8036
    43  1   -   4563995 4564086 1   1   1.00    2.8043  1.5965  2.8043  1.5965
    44  1   -   4565359 4566165 18  18  18.00   1.4597  1.3443  1.4597  1.3443
    45  1   -   4566514 4567577 6   6   6.00    0.3327  0.5931  0.3327  0.5931

Read counts verification with samtools:

samtools view -c X2-WT_S21Aligned.sortedByCoord.out.bam 1:4561613-4562891
58
samtools view -c X2-WT_S21Aligned.sortedByCoord.out.bam 1:4563995-4564086
6
samtools view -c X2-WT_S21Aligned.sortedByCoord.out.bam 1:4565359-4566165
21
samtools view -c X2-WT_S21Aligned.sortedByCoord.out.bam 1:4566514-4567577
6

We can see that reads are aligned to these exons, but cov remains 0 in t_data.ctab and _quantified.gtf. Questions: Why does StringTie assign a coverage of 0 even though reads are present in samtools view?

Image

novoassembly stringtie reconstruction RNAseq transcriptome • 176 views
ADD COMMENT

Login before adding your answer.

Traffic: 2327 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6