Question: Correct gtf file format (AGAT toolkit)
tianshenbio wrote (4 months ago):

I used agat_convert_sp_gff2gtf.pl from the AGAT toolkit to convert my GFF file to a GTF file. In the converted GTF file, the double quotes around the 'gene_id' values are missing:

Bany_Scaf1  maker   gene    201136  207903  .   +   .   Alias "maker-Bany_Scaf1-snap-gene-2.23"; Dbxref "InterPro:IPR019774" "Pfam:PF00351"; ID Bany_03723; Name Bany_03723; Ontology_term "GO:0016714" "GO:0055114"; gene_id Bany_03723
Bany_Scaf1  maker   transcript  201136  207903  .   +   .   Alias "maker-Bany_Scaf1-snap-gene-2.23-mRNA-1"; Dbxref "InterPro:IPR019774" "Pfam:PF00351"; ID "Bany_03723-RA"; Name "Bany_03723-RA"; Ontology_term "GO:0016714" "GO:0055114"; Parent Bany_03723; _AED "0.06"; _QI "45|1|1|1|1|1|7|425|530"; _eAED "0.06"; gene_id Bany_03723; original_biotype mrna; transcript_id "Bany_03723-RA" 
Bany_Scaf1  maker   exon    201136  201304  .   +   .   ID "Bany_03723-RA:1"; Parent "Bany_03723-RA"; gene_id Bany_03723; transcript_id "Bany_03723-RA" 
Bany_Scaf1  maker   exon    202687  202770  .   +   .   ID "Bany_03723-RA:2"; Parent "Bany_03723-RA"; gene_id Bany_03723; transcript_id "Bany_03723-RA" 
Bany_Scaf1  maker   exon    202886  202921  .   +   .   ID "Bany_03723-RA:3"; Parent "Bany_03723-RA"; gene_id Bany_03723; transcript_id "Bany_03723-RA" 
Bany_Scaf1  maker   exon    203004  203820  .   +   .   ID "Bany_03723-RA:4"; Parent "Bany_03723-RA"; gene_id Bany_03723; transcript_id "Bany_03723-RA" 
Bany_Scaf1  maker   exon    206097  206223  .   +   .   ID "Bany_03723-RA:5"; Parent "Bany_03723-RA"; gene_id Bany_03723; transcript_id "Bany_03723-RA" 
Bany_Scaf1  maker   exon    206649  206878  .   +   .   ID "Bany_03723-RA:6"; Parent "Bany_03723-RA"; gene_id Bany_03723; transcript_id "Bany_03723-RA" 
Bany_Scaf1  maker   exon    207304  207903  .   +   .   ID "Bany_03723-RA:7"; Parent "Bany_03723-RA"; gene_id Bany_03723; transcript_id "Bany_03723-RA" 
Bany_Scaf1  maker   CDS 201181  201304  .   +   0   ID "Bany_03723-RA:cds"; Parent "Bany_03723-RA"; gene_id Bany_03723; transcript_id "Bany_03723-RA" 
Bany_Scaf1  maker   CDS 202687  202770  .   +   2   ID "Bany_03723-RA:cds"; Parent "Bany_03723-RA"; gene_id Bany_03723; transcript_id "Bany_03723-RA" 
Bany_Scaf1  maker   CDS 202886  202921  .   +   2   ID "Bany_03723-RA:cds"; Parent "Bany_03723-RA"; gene_id Bany_03723; transcript_id "Bany_03723-RA" 
Bany_Scaf1  maker   CDS 203004  203820  .   +   2   ID "Bany_03723-RA:cds"; Parent "Bany_03723-RA"; gene_id Bany_03723; transcript_id "Bany_03723-RA" 
Bany_Scaf1  maker   CDS 206097  206223  .   +   1   ID "Bany_03723-RA:cds"; Parent "Bany_03723-RA"; gene_id Bany_03723; transcript_id "Bany_03723-RA" 
Bany_Scaf1  maker   CDS 206649  206878  .   +   0   ID "Bany_03723-RA:cds"; Parent "Bany_03723-RA"; gene_id Bany_03723; transcript_id "Bany_03723-RA" 
Bany_Scaf1  maker   CDS 207304  207478  .   +   1   ID "Bany_03723-RA:cds"; Parent "Bany_03723-RA"; gene_id Bany_03723; transcript_id "Bany_03723-RA" 
Bany_Scaf1  maker   five_prime_utr  201136  201180  .   +   .   ID "Bany_03723-RA:five_prime_utr"; Parent "Bany_03723-RA"; gene_id Bany_03723; original_biotype five_prime_UTR; transcript_id "Bany_03723-RA" 
Bany_Scaf1  maker   three_prime_utr 207479  207903  .   +   .   ID "Bany_03723-RA:three_prime_utr"; Parent "Bany_03723-RA"; gene_id Bany_03723; original_biotype three_prime_UTR; transcript_id "Bany_03723-RA"

My GFF (already corrected by AGAT):

Bany_Scaf1  maker   gene    201136  207903  .   +   .   ID=Bany_03723;Alias=maker-Bany_Scaf1-snap-gene-2.23;Dbxref=InterPro:IPR019774,Pfam:PF00351;Name=Bany_03723;Ontology_term=GO:0016714,GO:0055114
Bany_Scaf1  maker   mRNA    201136  207903  .   +   .   ID=Bany_03723-RA;Parent=Bany_03723;Alias=maker-Bany_Scaf1-snap-gene-2.23-mRNA-1;Dbxref=InterPro:IPR019774,Pfam:PF00351;Name=Bany_03723-RA;Ontology_term=GO:0016714,GO:0055114;_AED=0.06;_QI=45|1|1|1|1|1|7|425|530;_eAED=0.06
Bany_Scaf1  maker   exon    201136  201304  .   +   .   ID=Bany_03723-RA:1;Parent=Bany_03723-RA
Bany_Scaf1  maker   exon    202687  202770  .   +   .   ID=Bany_03723-RA:2;Parent=Bany_03723-RA
Bany_Scaf1  maker   exon    202886  202921  .   +   .   ID=Bany_03723-RA:3;Parent=Bany_03723-RA
Bany_Scaf1  maker   exon    203004  203820  .   +   .   ID=Bany_03723-RA:4;Parent=Bany_03723-RA
Bany_Scaf1  maker   exon    206097  206223  .   +   .   ID=Bany_03723-RA:5;Parent=Bany_03723-RA
Bany_Scaf1  maker   exon    206649  206878  .   +   .   ID=Bany_03723-RA:6;Parent=Bany_03723-RA
Bany_Scaf1  maker   exon    207304  207903  .   +   .   ID=Bany_03723-RA:7;Parent=Bany_03723-RA
Bany_Scaf1  maker   CDS 201181  201304  .   +   0   ID=Bany_03723-RA:cds;Parent=Bany_03723-RA
Bany_Scaf1  maker   CDS 202687  202770  .   +   2   ID=Bany_03723-RA:cds;Parent=Bany_03723-RA
Bany_Scaf1  maker   CDS 202886  202921  .   +   2   ID=Bany_03723-RA:cds;Parent=Bany_03723-RA
Bany_Scaf1  maker   CDS 203004  203820  .   +   2   ID=Bany_03723-RA:cds;Parent=Bany_03723-RA
Bany_Scaf1  maker   CDS 206097  206223  .   +   1   ID=Bany_03723-RA:cds;Parent=Bany_03723-RA
Bany_Scaf1  maker   CDS 206649  206878  .   +   0   ID=Bany_03723-RA:cds;Parent=Bany_03723-RA
Bany_Scaf1  maker   CDS 207304  207478  .   +   1   ID=Bany_03723-RA:cds;Parent=Bany_03723-RA
Bany_Scaf1  maker   five_prime_UTR  201136  201180  .   +   .   ID=Bany_03723-RA:five_prime_utr;Parent=Bany_03723-RA
Bany_Scaf1  maker   three_prime_UTR 207479  207903  .   +   .   ID=Bany_03723-RA:three_prime_utr;Parent=Bany_03723-RA

How can I add the missing double quotes?
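
Before patching the file, it can help to see which attribute keys actually carry unquoted values. In the GTF excerpt above, ID, Name, Parent and original_biotype are affected on some lines as well as gene_id. The following is a minimal Python sketch and not part of AGAT: the function name unquoted_keys and the regular expression are illustrative assumptions, and it assumes a tab-delimited file with no semicolons inside quoted values.

```python
import re
from collections import Counter

# Matches an attribute (at the start of the field or after a ';')
# whose value does NOT begin with a double quote; captures the key.
UNQUOTED_RE = re.compile(r'(?:^|;)\s*([^\s;"]+)\s+[^\s";][^";]*')


def unquoted_keys(gtf_lines):
    """Count, per attribute key, how many values lack double quotes."""
    counts = Counter()
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue  # not a standard 9-column feature line
        counts.update(UNQUOTED_RE.findall(cols[8]))
    return counts
```

Any iterable of lines works as input, for example an open file handle for the converted GTF. An empty Counter would indicate that every attribute value is quoted.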

Tags: rna-seq, agat, gff, gff3, gtf

AGAT should produce a correct GTF file. Is there anything wrong with your GFF? A new version of AGAT was released recently with a few issues fixed, so try updating it.

Reply written 4 months ago by alex.zaccaron

I have added my GFF to the post. The GFF file was already checked and corrected by AGAT, and I am using v0.3.0.

Reply written 4 months ago by tianshenbio
Juke34 wrote (4 months ago):

Hi, thank you for pointing this out, I had forgotten about it! The problem is related to BioPerl (see here). I have a patch to fix the problem in BioPerl, but I was waiting for some feedback. I will try to include the necessary changes directly in the agat_convert_sp_gff2gtf.pl script. It should be fixed in an upcoming release.


Hi, thank you for the update! I managed to fix it myself on Linux, and I hope it can be fixed in the next release.

Reply written 4 months ago by tianshenbio
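
The thread does not show the fix that was applied. As one possible post-processing approach (not AGAT's own fix), the Python sketch below wraps every unquoted attribute value in column 9 in double quotes and leaves already-quoted values untouched. The names ATTR_RE, quote_gtf_attributes, fix_gtf_line and fix_gtf_file are hypothetical, and the sketch assumes tab-delimited input with no semicolons or double quotes inside values.

```python
import re

# One attribute: a key, then either one or more quoted values
# or a single bare (unquoted) value, ended by ';' or end of field.
ATTR_RE = re.compile(r'\s*([^\s;"]+)\s+((?:"[^"]*"\s*)+|[^";]+?)\s*(?:;|$)')


def quote_gtf_attributes(attr_field):
    """Return the attribute field with every bare value double-quoted."""
    parts = []
    for key, value in ATTR_RE.findall(attr_field):
        value = value.strip()
        if not value.startswith('"'):
            value = f'"{value}"'  # add the missing quotes
        parts.append(f"{key} {value}")
    # Each attribute is terminated by a semicolon
    return "; ".join(parts) + ";"


def fix_gtf_line(line):
    """Fix column 9 of one GTF line; pass other lines through unchanged."""
    if line.startswith("#") or not line.strip():
        return line
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        return line
    cols[8] = quote_gtf_attributes(cols[8])
    return "\t".join(cols) + "\n"


def fix_gtf_file(in_path, out_path):
    """Write a copy of in_path to out_path with quoted attribute values."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            dst.write(fix_gtf_line(line))
```

Writing to a separate output file rather than editing in place keeps the original conversion available for comparison. Note that this sketch also normalizes the field so that the last attribute ends with a semicolon.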