Stringtie issue. "Error: no valid ID found for GFF record"
15 months ago
Pegasus ▴ 100

Hi all,

On Linux, I successfully aligned my RNA-seq data with HISAT2, converted the SAM files to BAM, sorted them with samtools, and passed the sorted BAM files together with the GTF file on the StringTie command line.

However, I got this error: "Error: no valid ID found for GFF record"

I consulted a previous post: stringtie Error: input file cannot be found

But I am still facing the same problem.

Here is the head -10 result:

[Screenshot of the head -10 output; image not included]

Thank you

Stringtie RNA-seq • 1.4k views

Could you post the output instead of a screenshot? There seems to be an empty line there. Could you try to read the file with gffread? http://ccb.jhu.edu/software/stringtie/gff.shtml
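One way to check for the suspected empty line without relying on a screenshot is a short script like the sketch below. It is not part of gffread or StringTie; the function name `find_suspect_lines` is made up for illustration, and the only assumption is that every non-comment record should have 9 tab-separated fields, as in standard GFF/GTF.

```python
def find_suspect_lines(lines):
    """Return (line_number, reason) pairs for blank or malformed GTF/GFF lines."""
    problems = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")
        if not text.strip():
            # Empty or whitespace-only line
            problems.append((number, "blank line"))
        elif text.startswith("#"):
            # Header or comment line, nothing to check
            continue
        elif len(text.split("\t")) != 9:
            # Records are expected to have exactly 9 tab-separated columns
            problems.append((number, "expected 9 tab-separated fields"))
    return problems
```

To run it on a real annotation, open the file and pass the handle, e.g. `find_suspect_lines(open("annotation.gtf"))`, where `annotation.gtf` is a placeholder file name.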


Thank you barslmn,

I checked both the GFF file and the GTF file for the same reference genome (RG) with gffread, as shown below:

gff

JAFJXZ010000001.1  Genbank gene    1   237 .   +   .   ID=gene-JYU28_00005;geneID=gene-JYU28_00005
JAFJXZ010000001.1   Genbank CDS 1   237 .   +   0   Parent=gene-JYU28_00005
JAFJXZ010000001.1   Genbank pseudogene  412 934 .   +   .   ID=gene-JYU28_00010;geneID=gene-JYU28_00010
JAFJXZ010000001.1   Genbank CDS 412 934 .   +   0   Parent=gene-JYU28_00010
JAFJXZ010000010.1   cmsearch    rRNA    1   224 .   -   .   ID=rna-JYU28_03015;geneID=gene-JYU28_03015
JAFJXZ010000010.1   cmsearch    exon    1   224 .   -   .   Parent=rna-JYU28_03015
JAFJXZ010000010.1   cmsearch    rRNA    480 577 .   -   .   ID=rna-JYU28_03020;geneID=gene-JYU28_03020;gene_name=rrf
JAFJXZ010000010.1   cmsearch    exon    480 577 .   -   .   Parent=rna-JYU28_03020
JAFJXZ010000010.1   cmsearch    rRNA    1764    4363    .   -   .   ID=rna-JYU28_03025;geneID=gene-JYU28_03025
JAFJXZ010000010.1   cmsearch    exon    1764    4363    .   -   .   Parent=rna-JYU28_03025
JAFJXZ010000011.1   cmsearch    rRNA    1   453 .   +   .   ID=rna-JYU28_03030;geneID=gene-JYU28_03030
JAFJXZ010000011.1   cmsearch    exon    1   453 .   +   .   Parent=rna-JYU28_03030
JAFJXZ010000012.1   cmsearch    rRNA    1   228 .   +   .   ID=rna-JYU28_03035;geneID=gene-JYU28_03035
JAFJXZ010000012.1   cmsearch    exon    1   228 .   +   .   Parent=rna-JYU28_03035
JAFJXZ010000012.1   cmsearch    rRNA    381 497 .   +   .   ID=rna-JYU28_03040;geneID=gene-JYU28_03040;gene_name=rrf
JAFJXZ010000012.1   cmsearch    exon    381 497 .   +   .   Parent=rna-JYU28_03040
JAFJXZ010000012.1   Genbank gene    712 1704    .   +   .   ID=gene-JYU28_03045;geneID=gene-JYU28_03045
JAFJXZ010000012.1   Genbank CDS 712 1704    .   +   0   Parent=gene-JYU28_03045
JAFJXZ010000012.1   Genbank gene    1834    3969    .   +   .   ID=gene-JYU28_03050;geneID=gene-JYU28_03050
JAFJXZ010000012.1   Genbank CDS 1834    3969    .   +   0   Parent=gene-JYU28_03050
JAFJXZ010000012.1   Genbank gene    4079    4216    .   +   .   ID=gene-JYU28_03055;geneID=gene-JYU28_03055
JAFJXZ010000012.1   Genbank CDS 4079    4216    .   +   0   Parent=gene-JYU28_03055
JAFJXZ010000012.1   Genbank gene    4399    5565    .   -   .   ID=gene-JYU28_03060;geneID=gene-JYU28_03060
JAFJXZ010000012.1   Genbank CDS 4399    5565    .   -   0   Parent=gene-JYU28_03060
JAFJXZ010000012.1   Genbank gene    5914    6042    .   -   .   ID=gene-JYU28_03065;geneID=gene-JYU28

gtf using gffread

no valid ID found for GFF record

gtf using head -10

!gtf-version 2.2
!genome-build ASM1758969v1
!genome-build-accession NCBI_Assembly:GCA_017589695.1
!annotation-date 02/26/2021 15:56:28
!annotation-source NCBI 
JAFJXZ010000001.1   Genbank gene    1   237 .   +   .   gene_id "JYU28_00005"; transcript_id ""; gbkey "Gene"; gene_biotype "protein_coding"; locus_tag "JYU28_00005"; partial "true"; 
JAFJXZ010000001.1   GeneMarkS-2+    CDS 1   234 .   +   0   gene_id "JYU28_00005"; transcript_id "unassigned_transcript_1"; gbkey "CDS"; inference "COORDINATES: ab initio prediction:GeneMarkS-2+"; locus_tag "JYU28_00005"; partial "true"; product "IS5/IS1182 family transposase"; protein_id "MBO3282641.1"; transl_table "11"; 
JAFJXZ010000001.1   GeneMarkS-2+    start_codon 1   3   .   +   0   gene_id "JYU28_00005"; transcript_id "unassigned_transcript_1"; gbkey "CDS"; inference "COORDINATES: ab initio prediction:GeneMarkS-2+"; locus_tag "JYU28_00005"; partial "true"; product "IS5/IS1182 family transposase"; protein_id "MBO3282641.1"; transl_table "11"; 
JAFJXZ010000001.1   GeneMarkS-2+    stop_codon  235 237 .   +   0   gene_id "JYU28_00005"; transcript_id "unassigned_transcript_1"; gbkey "CDS"; inference "COORDINATES: ab initio prediction:GeneMarkS-2+"; locus_tag "JYU28_00005"; partial "true"; product "IS5/IS1182 family transposase"; protein_id "MBO3282641.1"; transl_table "11"; 
JAFJXZ010000001.1   Genbank gene    412 934 .   +   .   gene_id "JYU28_00010"; transcript_id ""; gbkey "Gene"; gene_biotype "pseudogene"; locus_tag "JYU28_00010"; pseudo "true";

Neither file shows the same pattern as the examples on the GFF utilities page:

http://ccb.jhu.edu/software/stringtie/gff.shtml

Following the thread

StringTIe Error: no valid ID found for GFF record

I believe StringTie could not read my GTF file (gffread couldn't read it either). Maybe there is a problem with the transcript_id and gene_id attributes, so I still need recommendations: which command line would fix it?

Thank you,
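In the head -10 output above, the gene records carry transcript_id "" (an empty value), while the CDS, start_codon and stop_codon records have a non-empty transcript_id. A possible workaround, offered as a sketch and not as a confirmed fix, is to drop the records whose transcript_id is empty before giving the file to gffread or StringTie. The assumption here is that the empty transcript_id is what triggers "no valid ID found for GFF record"; the function name `drop_empty_transcript_ids` is hypothetical.

```python
import re

# Matches an empty transcript_id attribute, e.g.  transcript_id "";
EMPTY_TRANSCRIPT_ID = re.compile(r'transcript_id\s+""')


def drop_empty_transcript_ids(lines):
    """Return the GTF lines, minus records whose transcript_id is empty."""
    kept = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Only inspect the attribute column (9th field) of complete records;
        # headers, comments and anything else are passed through untouched.
        if len(fields) == 9 and EMPTY_TRANSCRIPT_ID.search(fields[8]):
            continue
        kept.append(line)
    return kept
```

The filtered lines can then be written to a new file and tested with gffread first. If gffread still rejects the file, the assumption above does not hold for this annotation and the cause is elsewhere.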


Could you try to create your annotation file from the table browser? https://genome.ucsc.edu/cgi-bin/hgTables


Hi barslmn,

It is a bacterial genome, which I believe is not supported by that website. I have just updated my previous reply (gffread could not read my GTF file).
