Question: Warning "no reference transcripts were found" when running StringTie on reads mapped to the reference sequences
Yuyin11010 wrote:

I am doing a transcript-level expression analysis of RNA-seq data with HISAT, StringTie and Ballgown, following the steps in the paper "Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown". At the StringTie step, run on the reads mapped to the reference sequences, I get the warning "no reference transcripts were found for the genomic sequences where reads were mapped! Please make sure the -G annotation file uses the same naming convention for the genome sequences." The command I ran is:

# Run StringTie on the BAM file of each sample name listed in samples.list
for sample_name in $(cat samples.list)
do
    stringtie -p 8 -G sorghum/genes/ref_Sorghum_bicolor_NCBIv3_top_level.gtf -o "stringtie/$sample_name.gtf" -l "$sample_name" "hist2/JGI/$sample_name.bam"
done

This produces a .gtf file for each sample, and the results can be used for abundance estimation with Ballgown, but the abundance of every reference transcript is zero; the only transcripts with non-zero abundance are novel ones. I don't think this result is reliable, and the warning probably points to the cause. My reference sequences were downloaded from NCBI (ftp://ftp.ncbi.nlm.nih.gov/genomes/Sorghum_bicolor/Assembled_chromosomes/seq/), and the annotation was also downloaded from NCBI (ftp://ftp.ncbi.nlm.nih.gov/genomes/Sorghum_bicolor/GFF/ref_Sorghum_bicolor_NCBIv3_top_level.gff3.gz) and converted to GTF format with the command:

gffread sorghum/genes/ref_Sorghum_bicolor_NCBIv3_top_level.gff3 -T -o sorghum/genes/ref_Sorghum_bicolor_NCBIv3_top_level.gtf

I then used gffread to check the GTF file:

 gffread  -E sorghum/genes/ref_Sorghum_bicolor_NCBIv3_top_level.gtf

No errors were reported, and I am sure my annotation file uses the same naming convention as the genome sequences. So why do I get this warning, and why can't I get abundances for the reference transcripts?

rna-seq stringtie • 2.0k views
modified 7 months ago by Kanoo0 • written 2.1 years ago by Yuyin11010

I have the same problem. Did you find out how to solve the issue?

written 17 months ago by Giuseppe0

I have the same problem, and I don't know how to solve it.

written 7 months ago by Kanoo0

There might be a discrepancy between the sequence names used for mapping (those in the BAM file) and the sequence names as they appear in the GFF/GTF file. Alternatively, the specified GTF file may lack the identifiers needed to extract transcript information (for instance gene_id, name, etc.).
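Both possibilities can be checked directly. The following is a minimal sketch, not a definitive diagnostic: the function names are hypothetical helpers written for this thread, and it assumes the BAM header has first been saved as text (for example with `samtools view -H`) and that the GTF is tab-separated with the attributes in column 9.

```shell
# Hypothetical helpers for comparing sequence names; not part of StringTie,
# gffread or samtools.

# Print the sequence names (SN: fields of @SQ lines) from a SAM header on stdin.
sam_header_seqnames() {
    awk -F'\t' '$1 == "@SQ" {
        for (i = 2; i <= NF; i++) if ($i ~ /^SN:/) print substr($i, 4)
    }' | sort -u
}

# Print the sequence names (column 1) used in a GTF/GFF file, skipping comments.
gtf_seqnames() {
    awk -F'\t' '$0 !~ /^#/ && NF >= 9 { print $1 }' "$1" | sort -u
}

# Print names that occur in the SAM header but never in the annotation.
# Arguments: 1 = file holding the SAM header text, 2 = GTF file.
names_missing_from_gtf() {
    awk -F'\t' '
        FILENAME == ARGV[1] {                 # first file: the GTF
            if ($0 !~ /^#/ && NF >= 9) in_gtf[$1] = 1
            next
        }
        $1 == "@SQ" {                         # second file: the SAM header
            for (i = 2; i <= NF; i++)
                if ($i ~ /^SN:/ && !(substr($i, 4) in in_gtf)) print substr($i, 4)
        }
    ' "$2" "$1"
}

# Count exon lines in a GTF file that carry no transcript_id attribute.
exons_without_transcript_id() {
    awk -F'\t' '$0 !~ /^#/ && $3 == "exon" && $9 !~ /transcript_id "[^"]+"/ { n++ }
                END { print n + 0 }' "$1"
}
```

To use it, save the header of one of the BAM files with `samtools view -H hist2/JGI/<sample>.bam > header.sam`, then run `names_missing_from_gtf header.sam sorghum/genes/ref_Sorghum_bicolor_NCBIv3_top_level.gtf`. If every name in the header is printed, the BAM file and the GTF file use different names for the same sequences, which would be consistent with the warning. A non-zero result from `exons_without_transcript_id` would point to the second possibility instead.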

ADD REPLYlink written 7 months ago by lieven.sterck6.0k