Stringtie output shows no expression for reference genes
17 months ago
samhairle ▴ 20

Hi all,

I am using the StringTie-Ballgown pipeline and am running into an issue with the outputs.

In my resulting Ballgown object, the reference genes are present but have 0 coverage in every sample, while 'new' transcripts with StringTie IDs do have coverage. Crucially, these new transcripts do not overlap the reference genes.

I know the reference transcripts are present because BUSCO can identify the expected single copy orthologues from the gtf file. I do not understand why Stringtie is not accepting them, or detecting any overlap.
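To put a number on the "no overlap" observation, a rough check along these lines could be run. It is only a minimal sketch: the function name `overlap_report` is made up for illustration, it assumes both files are tab-separated GTFs that contain `transcript` feature lines, and it compares every assembled transcript against every reference transcript, so it will be slow on a large annotation.

```shell
# overlap_report REF_GTF ASSEMBLED_GTF   (hypothetical helper, not part of StringTie)
# Classifies each assembled transcript by whether it overlaps a reference
# transcript on the same sequence, and whether the strands agree.
overlap_report() {
    awk -F'\t' '
        /^#/ { next }                      # skip header/comment lines
        $3 != "transcript" { next }        # assumes transcript-level records exist
        NR == FNR {                        # first file: store reference intervals
            n++; seq[n] = $1; s[n] = $4 + 0; e[n] = $5 + 0; str[n] = $7
            next
        }
        {                                  # second file: test each assembled transcript
            same = 0; opp = 0
            for (i = 1; i <= n; i++)
                if (seq[i] == $1 && s[i] <= $5 + 0 && e[i] >= $4 + 0) {
                    if (str[i] == $7) same++; else opp++
                }
            if (same) a++; else if (opp) b++; else c++
        }
        END {
            printf "same_strand=%d opposite_strand_only=%d no_overlap=%d\n", a, b, c
        }
    ' "$1" "$2"
}

# Example call with the file names used in this post:
# overlap_report gtf_anno.gtf 1_gtf_stringtie_assembly.gtf
```

A large `opposite_strand_only` count would point towards a strandedness setting rather than the annotation itself, whereas a large `no_overlap` count would be consistent with what I am seeing in Ballgown.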

I have read other stringtie annotation issues but the lack of overlap between the identified transcript and the reference doesn't seem to have come up before. At this point I am lost - is there some step I am missing or have overlooked? I would greatly appreciate any advice.
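One sanity check that may be worth repeating is whether every sequence name in the annotation really exists in the genome. The sketch below assumes a `samtools faidx` index of the reference (written here as `genome.fa.fai`, a placeholder name) and uses a hypothetical helper, `gtf_seqnames_missing`, that lists annotation sequence names absent from that index.

```shell
# gtf_seqnames_missing ANNOTATION FAI   (hypothetical helper)
# Prints each sequence name found in column 1 of the GTF/GFF that is
# not listed in column 1 of the .fai index. No output means all names match.
gtf_seqnames_missing() {
    awk -F'\t' '
        NR == FNR { known[$1] = 1; next }      # first file: names from the .fai
        /^#/ || NF < 9 { next }                # skip comments and malformed lines
        !($1 in known) && !seen[$1]++ { print $1 }
    ' "$2" "$1"
}

# Example call (genome.fa is a placeholder for the reference FASTA):
# samtools faidx genome.fa
# gtf_seqnames_missing gtf_anno.gtf genome.fa.fai
```

If this prints anything, the names differ somewhere (for example `1` versus `chr1`), and the annotation could not line up with the alignments on those sequences.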

Code is below:

  1. For each RNA-Seq sample, map the reads to the genome with HISAT2 using the --dta option (the index was built with HISAT2 from the reference genome .fa file)

hisat2 -p 10 --no-discordant --no-mixed --dta --rna-strandness RF --mp 4,2 --rdg 5,3 -x hisat2_gen_index -q -1 1_1.Q20.fastq -2 1_2.Q20.fastq -S 1_AN_.sam 2> 1_AN_report.txt

samtools sort -@ 8 -o 1_AN_.sorted.bam 1_AN_.sam

  2. For each RNA-Seq sample, run StringTie to assemble the read alignments obtained in the previous step (using an annotation file with the same chromosome naming convention as the reference genome; I have tried this file as both GFF and GTF, and neither worked)

stringtie 1_AN_.sorted.bam -G gtf_anno.gtf -A -f 0.005 -p 10 -o 1_gtf_stringtie_assembly.gtf

  3. Run StringTie with --merge to generate a non-redundant set of transcripts observed in any of the RNA-Seq samples assembled previously (using the annotation file again)

stringtie --merge -m 20 -p 10 -f 0.005 -G gtf_anno.gtf -o merged_gtf_stringtie_assembly.gtf gtf_mergelist.txt

  4. For each RNA-Seq sample, run StringTie using the -B/-b options to estimate transcript abundances and generate read coverage tables for Ballgown.

stringtie -B -f 0.005 -p 10 1_AN_loomismatch_defgap.sorted.bam -G merged_gtf_stringtie_assembly.gtf -A ballgown_gtf/1_gtf/ -o ballgown_gtf/1_gtf/1_gtf_stringtie_merged_assembly.gtf
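To see at which step the reference transcripts drop out, the per-sample GTFs can be inspected directly. As far as I understand, StringTie adds a `reference_id` attribute to assembled transcripts that match a transcript from the -G annotation; assuming that holds for these files, the sketch below (with a made-up function name, `count_ref_matched`) counts how many transcripts carry that attribute.

```shell
# count_ref_matched STRINGTIE_GTF   (hypothetical helper)
# Counts transcript records with and without a reference_id attribute.
# Assumption: matched transcripts are tagged with reference_id "..." in column 9.
count_ref_matched() {
    awk -F'\t' '
        $3 == "transcript" {
            if ($9 ~ /reference_id "/) ref++; else novel++
        }
        END {
            printf "with_reference_id=%d without_reference_id=%d\n", ref, novel
        }
    ' "$1"
}

# Example call on the step 2 output from this post:
# count_ref_matched 1_gtf_stringtie_assembly.gtf
```

If `with_reference_id` is already 0 after step 2, the mismatch is present before the merge and the Ballgown step, which would narrow down where to look.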

