I am running a transcript-level expression analysis of RNA-seq data with HISAT, StringTie and Ballgown, following the steps in the paper "Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown". After mapping the reads to the reference sequences, running StringTie gives the warning "no reference transcripts were found for the genomic sequences where reads were mapped! Please make sure the -G annotation file uses the same naming convention for the genome sequences." The command I run is
for sample_name in $(cat samples.list); do
    stringtie -p 8 -G sorghum/genes/ref_Sorghum_bicolor_NCBIv3_top_level.gtf -o stringtie/$sample_name.gtf -l $sample_name hist2/JGI/$sample_name.bam
done
This produces a .gtf file for each sample, and the output can be used for abundance estimation with Ballgown, but the abundance of every reference transcript is zero; the only transcripts with nonzero abundance are novel ones. I do not think this result is reliable, and I suspect the warning points to the cause. My reference sequences were downloaded from NCBI (ftp://ftp.ncbi.nlm.nih.gov/genomes/Sorghum_bicolor/Assembled_chromosomes/seq/), and the annotation was also downloaded from NCBI (ftp://ftp.ncbi.nlm.nih.gov/genomes/Sorghum_bicolor/GFF/ref_Sorghum_bicolor_NCBIv3_top_level.gff3.gz). I converted the annotation to GTF format with the command
gffread sorghum/genes/ref_Sorghum_bicolor_NCBIv3_top_level.gff3 -T -o sorghum/genes/ref_Sorghum_bicolor_NCBIv3_top_level.gtf
I then checked the GTF file with gffread:
gffread -E sorghum/genes/ref_Sorghum_bicolor_NCBIv3_top_level.gtf
There were no errors, and I am sure my annotation file uses the same naming convention as the genome sequences. Why does the warning appear, and why can't I get abundances for the reference transcripts?
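To double-check the naming convention that the warning refers to, the sequence names in a BAM header can be compared with the names in the first column of the GTF. The following is only a minimal sketch: the helper function names are made up for illustration, it assumes samtools is installed, and SAMPLE.bam is a placeholder for one of the files in hist2/JGI.

```shell
# Sketch only: helper names are hypothetical, not part of any tool.
export LC_ALL=C   # keep sort and comm in the same collation order

# Print the unique sequence names from column 1 of a GTF/GFF file,
# skipping comment lines.
gtf_seqnames() {
    grep -v '^#' "$1" | cut -f1 | sort -u
}

# Print the unique sequence names (SN: fields of @SQ lines) from a
# SAM header read on standard input.
header_seqnames() {
    awk -F'\t' '$1 == "@SQ" {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^SN:/) print substr($i, 4)
    }' | sort -u
}

# Print the names present in both sorted name lists.
shared_seqnames() {
    comm -12 "$1" "$2"
}

# Example usage (assumes samtools; SAMPLE.bam is a placeholder):
#   samtools view -H hist2/JGI/SAMPLE.bam | header_seqnames > bam_names.txt
#   gtf_seqnames sorghum/genes/ref_Sorghum_bicolor_NCBIv3_top_level.gtf > gtf_names.txt
#   shared_seqnames bam_names.txt gtf_names.txt
```

If the last command prints nothing, the BAM files and the annotation share no sequence names, which would be consistent with the warning and with all reference transcripts having zero abundance.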