converting .gff file to .gtf
6.0 years ago
kudzu • 0

I am trying to upload an annotation file to the Galaxy RNA-seq workflow, but my annotation is in .gff format. What is the easiest way to convert a .gff file to a .gtf file?

RNA-Seq • 11k views

Refer to this


Have you tried searching for "converting .gff file to .gtf"? I got several hits from Biostars, SEQanswers, ResearchGate... For example, gffread is cited very often; you could give it a try.
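For anyone landing here, a minimal sketch of how the two most-cited converters are usually called, wrapped in Python so it can sit in a script. The file names are placeholders, `run_if_installed` is a hypothetical helper, and the flags shown (`-T`/`-o` for gffread, `--gff`/`-o` for AGAT's agat_convert_sp_gff2gtf.pl, which is mentioned in the answer below) are what I believe current releases accept; check each tool's `--help` for your installed version.

```python
import shutil
import subprocess


def gffread_cmd(gff_in, gtf_out):
    # -T asks gffread for GTF output, -o names the output file
    return ["gffread", gff_in, "-T", "-o", gtf_out]


def agat_cmd(gff_in, gtf_out):
    # Assumed AGAT interface: --gff for the input, -o for the output
    return ["agat_convert_sp_gff2gtf.pl", "--gff", gff_in, "-o", gtf_out]


def run_if_installed(cmd):
    """Run cmd only if its executable is on PATH; return True if it ran."""
    if shutil.which(cmd[0]) is None:
        print(f"{cmd[0]} not found on PATH; would have run: {' '.join(cmd)}")
        return False
    subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return True


# Usage (placeholder file names):
#   run_if_installed(gffread_cmd("annotation.gff3", "annotation.gtf"))
#   run_if_installed(agat_cmd("annotation.gff3", "annotation.gtf"))
```

The helper only prints the command when the tool is missing, so the script does not crash on a machine without gffread or AGAT.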

4.2 years ago
Juke34 8.6k

Great list from @Jeffin. I would add gffread and agat_convert_sp_gff2gtf.pl from AGAT.

I tested the different solutions with this GFF3 test file, and we can see that the results differ depending on the method used:

##gff-version 3
scaffold625 maker   gene    337818  343277  .   +   .   ID=CLUHARG00000005458;Name=TUBB3_2
scaffold625 maker   mRNA    337818  343277  .   +   .   ID=CLUHART00000008717;Parent=CLUHARG00000005458
scaffold625 maker   tss 337915  337918  .   +   .   ID=CLUHART00000008717:tss;Parent=CLUHART00000008717
scaffold625 maker   CDS 337915  337971  .   +   0   ID=CLUHART00000008717:cds;Parent=CLUHART00000008717
scaffold625 maker   CDS 340733  340841  .   +   0   ID=CLUHART00000008717:cds;Parent=CLUHART00000008717
scaffold625 maker   CDS 341518  341628  .   +   2   ID=CLUHART00000008717:cds;Parent=CLUHART00000008717
scaffold625 maker   CDS 341964  343033  .   +   2   ID=CLUHART00000008717:cds;Parent=CLUHART00000008717
scaffold625 maker   exon    337818  337971  .   +   .   ID=CLUHART00000008717:exon1;Parent=CLUHART00000008717
scaffold625 maker   exon    340733  340841  .   +   .   ID=CLUHART00000008717:exon2;Parent=CLUHART00000008717
scaffold625 maker   exon    341518  341628  .   +   .   ID=CLUHART00000008717:exon3;Parent=CLUHART00000008717
scaffold625 maker   exon    341964  343277  .   +   .   ID=CLUHART00000008717:exon4;Parent=CLUHART00000008717
scaffold625 maker   five_prime_utr  337818  337914  .   +   .   ID=CLUHART00000008717:five_prime_utr;Parent=CLUHART00000008717
scaffold625 maker   three_prime_UTR 343034  343277  .   +   .   ID=CLUHART00000008717:three_prime_utr;Parent=CLUHART00000008717

AGAT (agat_convert_sp_gff2gtf.pl)

##gtf-version 3
scaffold625 maker   gene    337818  343277  .   +   .   ID CLUHARG00000005458; Name TUBB3_2; gene_id CLUHARG00000005458
scaffold625 maker   transcript  337818  343277  .   +   .   ID CLUHART00000008717; Parent CLUHARG00000005458; gene_id CLUHARG00000005458; original_biotype mrna; transcript_id CLUHART00000008717
scaffold625 maker   exon    337818  337971  .   +   .   ID "CLUHART00000008717:exon1"; Parent CLUHART00000008717; gene_id CLUHARG00000005458; transcript_id CLUHART00000008717
scaffold625 maker   exon    340733  340841  .   +   .   ID "CLUHART00000008717:exon2"; Parent CLUHART00000008717; gene_id CLUHARG00000005458; transcript_id CLUHART00000008717
scaffold625 maker   exon    341518  341628  .   +   .   ID "CLUHART00000008717:exon3"; Parent CLUHART00000008717; gene_id CLUHARG00000005458; transcript_id CLUHART00000008717
scaffold625 maker   exon    341964  343277  .   +   .   ID "CLUHART00000008717:exon4"; Parent CLUHART00000008717; gene_id CLUHARG00000005458; transcript_id CLUHART00000008717
scaffold625 maker   CDS 337915  337971  .   +   0   ID "CLUHART00000008717:cds"; Parent CLUHART00000008717; gene_id CLUHARG00000005458; transcript_id CLUHART00000008717
scaffold625 maker   CDS 340733  340841  .   +   0   ID "CLUHART00000008717:cds"; Parent CLUHART00000008717; gene_id CLUHARG00000005458; transcript_id CLUHART00000008717
scaffold625 maker   CDS 341518  341628  .   +   2   ID "CLUHART00000008717:cds"; Parent CLUHART00000008717; gene_id CLUHARG00000005458; transcript_id CLUHART00000008717
scaffold625 maker   CDS 341964  343033  .   +   2   ID "CLUHART00000008717:cds"; Parent CLUHART00000008717; gene_id CLUHARG00000005458; transcript_id CLUHART00000008717
scaffold625 maker   five_prime_utr  337818  337914  .   +   .   ID "CLUHART00000008717:five_prime_utr"; Parent CLUHART00000008717; gene_id CLUHARG00000005458; transcript_id CLUHART00000008717
scaffold625 maker   three_prime_utr 343034  343277  .   +   .   ID "CLUHART00000008717:three_prime_utr"; Parent CLUHART00000008717; gene_id CLUHARG00000005458; original_biotype three_prime_UTR; transcript_id CLUHART00000008717

gffread

scaffold625 maker   transcript  337818  343277  .   +   .   transcript_id "CLUHART00000008717"; gene_id "CLUHARG00000005458";
scaffold625 maker   exon    337818  337971  .   +   .   transcript_id "CLUHART00000008717"; gene_id "CLUHARG00000005458";
scaffold625 maker   exon    340733  340841  .   +   .   transcript_id "CLUHART00000008717"; gene_id "CLUHARG00000005458";
scaffold625 maker   exon    341518  341628  .   +   .   transcript_id "CLUHART00000008717"; gene_id "CLUHARG00000005458";
scaffold625 maker   exon    341964  343277  .   +   .   transcript_id "CLUHART00000008717"; gene_id "CLUHARG00000005458";
scaffold625 maker   CDS 337915  337971  .   +   0   transcript_id "CLUHART00000008717"; gene_id "CLUHARG00000005458";
scaffold625 maker   CDS 340733  340841  .   +   0   transcript_id "CLUHART00000008717"; gene_id "CLUHARG00000005458";
scaffold625 maker   CDS 341518  341628  .   +   2   transcript_id "CLUHART00000008717"; gene_id "CLUHARG00000005458";
scaffold625 maker   CDS 341964  343033  .   +   2   transcript_id "CLUHART00000008717"; gene_id "CLUHARG00000005458";
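
To make concrete what these converters have to do: GFF3 encodes the hierarchy through ID/Parent, while GTF repeats gene_id and transcript_id on every line. Below is a minimal sketch of that attribute rewriting. It is not how gffread or AGAT work internally, just an illustration. The names `parse_attrs` and `gff3_to_gtf` are hypothetical, and it assumes a single Parent per feature, transcripts that sit directly under a `gene` feature, and that only exon and CDS lines are wanted.

```python
def parse_attrs(field):
    """Parse a GFF3 attribute column ('ID=x;Parent=y') into a dict."""
    return dict(item.split("=", 1) for item in field.strip().split(";") if "=" in item)


def gff3_to_gtf(lines, keep=("exon", "CDS")):
    """Return GTF lines for the feature types in `keep` (assumption: exon/CDS only)."""
    records = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            # Fallback for whitespace-flattened lines like the ones pasted in this thread
            cols = line.split(None, 8)
        records.append((cols[:8], parse_attrs(cols[8])))

    # Pass 1: map each transcript ID to its gene ID via the Parent attribute
    genes = {a["ID"] for c, a in records if c[2] == "gene" and "ID" in a}
    tx2gene = {a["ID"]: a["Parent"] for c, a in records
               if a.get("Parent") in genes and "ID" in a}

    # Pass 2: emit kept features with GTF-style attributes
    out = []
    for cols, attrs in records:
        tx = attrs.get("Parent")
        if cols[2] in keep and tx in tx2gene:
            gtf_attrs = f'transcript_id "{tx}"; gene_id "{tx2gene[tx]}";'
            out.append("\t".join(cols + [gtf_attrs]))
    return out


SAMPLE = """##gff-version 3
scaffold625 maker gene 337818 343277 . + . ID=CLUHARG00000005458;Name=TUBB3_2
scaffold625 maker mRNA 337818 343277 . + . ID=CLUHART00000008717;Parent=CLUHARG00000005458
scaffold625 maker tss 337915 337918 . + . ID=CLUHART00000008717:tss;Parent=CLUHART00000008717
scaffold625 maker exon 337818 337971 . + . ID=CLUHART00000008717:exon1;Parent=CLUHART00000008717
"""

for gtf_line in gff3_to_gtf(SAMPLE.splitlines()):
    print(gtf_line)
```

Everything a real converter does beyond this (gene and transcript lines, UTRs, start/stop codons, multiple parents, extra attributes) is exactly where the tools tested here diverge.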
4.2 years ago
Juke34 8.6k

It didn't fit in one post; here is the rest:

GenomeTools

scaffold625 maker   exon    337818  337971  .   +   .   gene_id "1"; transcript_id "1.1";
scaffold625 maker   exon    340733  340841  .   +   .   gene_id "1"; transcript_id "1.1";
scaffold625 maker   exon    341518  341628  .   +   .   gene_id "1"; transcript_id "1.1";
scaffold625 maker   exon    341964  343277  .   +   .   gene_id "1"; transcript_id "1.1";
scaffold625 maker   CDS 337915  337971  .   +   0   gene_id "1"; transcript_id "1.1";
scaffold625 maker   CDS 340733  340841  .   +   0   gene_id "1"; transcript_id "1.1";
scaffold625 maker   CDS 341518  341628  .   +   2   gene_id "1"; transcript_id "1.1";
scaffold625 maker   CDS 341964  343033  .   +   2   gene_id "1"; transcript_id "1.1";

ea-utils

scaffold625 maker   exon    337818  337971  0   +   .   gene_id "CLUHARG00000005458"; transcript_id "CLUHART00000008717:CLUHARG00000005458";
scaffold625 maker   CDS 337915  337971  0   +   0   gene_id "CLUHARG00000005458"; transcript_id "CLUHART00000008717:CLUHARG00000005458";
scaffold625 maker   CDS 340733  340841  0   +   0   gene_id "CLUHARG00000005458"; transcript_id "CLUHART00000008717:CLUHARG00000005458";
scaffold625 maker   exon    340733  340841  0   +   .   gene_id "CLUHARG00000005458"; transcript_id "CLUHART00000008717:CLUHARG00000005458";
scaffold625 maker   CDS 341518  341628  0   +   2   gene_id "CLUHARG00000005458"; transcript_id "CLUHART00000008717:CLUHARG00000005458";
scaffold625 maker   exon    341518  341628  0   +   .   gene_id "CLUHARG00000005458"; transcript_id "CLUHART00000008717:CLUHARG00000005458";
scaffold625 maker   CDS 341964  343033  0   +   2   gene_id "CLUHARG00000005458"; transcript_id "CLUHART00000008717:CLUHARG00000005458";
scaffold625 maker   exon    341964  343277  0   +   .   gene_id "CLUHARG00000005458"; transcript_id "CLUHART00000008717:CLUHARG00000005458";

pasa (you need the fasta sequence too)

scaffold625 maker   gene    337818  343277  0   +   .   gene_id "CLUHARG00000005458"; Name "TUBB3_2";
scaffold625 maker   transcript  337818  343277  0   +   .   gene_id "CLUHARG00000005458"; transcript_id "CLUHART00000008717"; Name "TUBB3_2";
scaffold625 maker   exon    337818  337971  0   +   .   gene_id "CLUHARG00000005458"; transcript_id "CLUHART00000008717"; Name "TUBB3_2";
scaffold625 maker   CDS 337818  337971  0   +   .   gene_id "CLUHARG00000005458"; transcript_id "CLUHART00000008717"; Name "TUBB3_2";
scaffold625 maker   exon    340733  340841  0   +   .   gene_id "CLUHARG00000005458"; transcript_id "CLUHART00000008717"; Name "TUBB3_2";
scaffold625 maker   CDS 340733  340841  0   +   .   gene_id "CLUHARG00000005458"; transcript_id "CLUHART00000008717"; Name "TUBB3_2";
scaffold625 maker   exon    341518  341628  0   +   .   gene_id "CLUHARG00000005458"; transcript_id "CLUHART00000008717"; Name "TUBB3_2";
scaffold625 maker   CDS 341518  341628  0   +   .   gene_id "CLUHARG00000005458"; transcript_id "CLUHART00000008717"; Name "TUBB3_2";
scaffold625 maker   exon    341964  343277  0   +   .   gene_id "CLUHARG00000005458"; transcript_id "CLUHART00000008717"; Name "TUBB3_2";
scaffold625 maker   CDS 341964  343277  0   +   .   gene_id "CLUHARG00000005458"; transcript_id "CLUHART00000008717"; Name "TUBB3_2";

Kent utils => I didn't manage to get it to run (on OS X)

GFFtools-GX => I didn't manage to get it to run

Among the different solutions, some lose attribute information, some do not remove feature types that GTF does not accept (3rd column), and some remove feature types that GTF does accept (see [here][2] for the list of feature types accepted in GTF)...
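
If you want to check this on your own files, here is a minimal sketch that compares the feature types (3rd column) of the original GFF3 with those of a converted GTF. The function names `feature_counts` and `compare` are hypothetical, and it only looks at feature types, not at attributes. Note that a renamed type (mRNA becoming transcript) shows up as one type dropped and one added.

```python
from collections import Counter


def feature_counts(lines):
    """Count feature types (3rd column) in GFF/GTF lines, skipping comments."""
    counts = Counter()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        # Prefer tabs; fall back to whitespace for flattened, pasted lines
        cols = line.split("\t") if "\t" in line else line.split()
        if len(cols) >= 3:
            counts[cols[2]] += 1
    return counts


def compare(source_lines, converted_lines):
    """Report which feature types were dropped, added, or kept by a conversion."""
    src = feature_counts(source_lines)
    out = feature_counts(converted_lines)
    return {
        "dropped": sorted(set(src) - set(out)),
        "added": sorted(set(out) - set(src)),
        "kept": sorted(set(src) & set(out)),
    }


# Usage (placeholder file names):
#   with open("annotation.gff3") as a, open("annotation.gtf") as b:
#       print(compare(a, b))
```

Running it against the output of each converter gives a quick view of which tool silently discards gene, UTR, or other lines.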


macOS versions of Kent utils are available.
