Question: Conversion of GFF3 to GTF
8.8 years ago, geek_y (11k) wrote:

How do I convert a GFF file to a GTF file? Is there a tool available?

gff gtf • 69k views
modified 11 months ago by Juke34 (5.2k) • written 8.8 years ago by geek_y (11k)

It would help to know what your downstream analysis or usage is. If GTF is only an intermediate step towards another conversion, I suggest you try to obtain the final format directly. From seqanswers:

The whole point of the GTF format was to standardise certain aspects that are left open in GFF. Hence, there are many different valid ways to encode the same information in a valid GFF format, and any parser or converter needs to be written specifically for the choices the author of the GFF file made. For example, a GTF file requires the gene ID attribute to be called "gene_id", while in GFF files, it may be "ID", "Gene", something different, or completely missing. Hence, a general GFF-to-GTF converter (as opposed to one converting only GFF files from a very specific source) needs to guess this from the data, which is non-trivial.

In general, it is difficult to get this right unless you are working on one particular GFF file, because GFF is more general than GTF.
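
To make the point about attribute names concrete, here is a minimal Python sketch of the attribute-column step only. It assumes the caller already knows which GFF3 keys hold the gene and transcript identifiers in one specific file. The function names and the example keys ("gene", "Parent") are hypothetical and not taken from any existing tool.

```python
def parse_gff3_attributes(column9):
    """Split a GFF3 attribute column ('key=value;key=value') into a dict."""
    attrs = {}
    for field in column9.strip().split(";"):
        if not field:
            continue
        key, _, value = field.partition("=")
        attrs[key.strip()] = value.strip()
    return attrs


def to_gtf_attributes(attrs, gene_key, transcript_key):
    """Build a GTF attribute column from GFF3 attributes.

    gene_key and transcript_key name the GFF3 keys that hold the gene and
    transcript identifiers. They differ between files, so they must be
    supplied by the caller (this is the part a generic converter has to guess).
    """
    for key in (gene_key, transcript_key):
        if key not in attrs:
            raise KeyError(f"attribute {key!r} not found in {sorted(attrs)}")
    return f'gene_id "{attrs[gene_key]}"; transcript_id "{attrs[transcript_key]}";'


if __name__ == "__main__":
    # Hypothetical exon record: the gene ID is stored under 'gene' and the
    # transcript ID under 'Parent'. Another file may use different keys.
    gff3_column9 = "ID=exon1;Parent=tx1;gene=geneA"
    print(to_gtf_attributes(parse_gff3_attributes(gff3_column9), "gene", "Parent"))
```

The same call with different key names would be needed for a file that stores the gene identifier elsewhere, which is why a converter written for one source often fails on another.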

modified 15 months ago by Ram (32k) • written 8.8 years ago by Arun (2.4k)

In general, GTF is a subset of GFF that is often used for counting peaks in RNA-seq data. It would be very useful if you gave more information on what you are trying to do. You can find the specifications for GFF3 here: GTF and GFF

written 8.8 years ago by Ying W (4.0k)


I'm still looking for a tool that can convert GenBank data to GTF for species that are not in the Ensembl database. Any suggestions?

written 2.8 years ago by Giuseppe (0)
7.2 years ago, gleparc (470) wrote:

The easiest way is to use the gffread program that comes with the Cufflinks software suite (Tuxedo):

gffread my.gff3 -T -o my.gtf

See gffread -h for more information.

written 7.2 years ago by gleparc (470)

This should be an answer! It worked for me, thanks.

written 6.0 years ago by SES (8.4k)

gffread from Cufflinks version 2.2.1 does not work properly: it keeps only "gene_id" and "transcript_id" in the 9th column, so, for example, the exon number is stripped. I have not tried the latest version, because I couldn't find binaries and don't want to install the additional packages needed for compilation.
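
One way to check which attributes a converter kept is to count the keys in the 9th column of its output. The Python sketch below is only an illustration: the function name is hypothetical, and it assumes tab-separated GTF lines whose attribute values contain no semicolons.

```python
from collections import Counter


def gtf_attribute_keys(lines):
    """Count how often each attribute key occurs in the 9th column of GTF lines."""
    counts = Counter()
    for line in lines:
        # Skip comment/header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        columns = line.rstrip("\n").split("\t")
        if len(columns) < 9:
            continue  # not a complete feature line
        # Assumption: attribute values contain no ';' characters.
        for field in columns[8].split(";"):
            field = field.strip()
            if field:
                # The key is the first whitespace-delimited token of the field.
                counts[field.split(None, 1)[0]] += 1
    return counts


if __name__ == "__main__":
    example = [
        'chr1\tsrc\texon\t1\t100\t.\t+\t.\tgene_id "g1"; transcript_id "t1"; exon_number "1";\n',
        'chr1\tsrc\texon\t200\t300\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n',
    ]
    for key, n in sorted(gtf_attribute_keys(example).items()):
        print(key, n)
```

Passing an open file handle (for example `open("my.gtf")`) instead of the example list works the same way, since the function only iterates over lines.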

written 3.6 years ago by boczniak767700

Could you please expand on this?

I have just used Cufflinks v2.2.1 for gffread. The output GTF file looks fine, checking against specifications stated here:

written 23 months ago by Barry Digby (640)
8.8 years ago, Paolo (240) wrote:

Take a look at the rtracklayer Bioconductor package:

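library(rtracklayer)  # provides import() and export()
test_path <- system.file("tests", package = "rtracklayer")
test_gff3 <- file.path(test_path, "genes.gff3")
test <- import(test_gff3)
# Write the imported features back out as GTF (the output file name is only an example)
export(test, "genes.gtf", format = "gtf")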
written 8.8 years ago by Paolo (240)

It helped. Thanks :)

written 3.9 years ago by sangram_keshari (250)

I'm not sure this really produces GTF: after the above commands I have "gff-version 2" in the header of the exported file.

written 3.6 years ago by boczniak767700

The GTF (General Transfer Format) is identical to GFF version 2.
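
Because the header line alone does not show how the attributes are written, a quick check of the 9th column can help when deciding whether a file follows GTF or GFF3 conventions. The Python sketch below is a rough heuristic with a hypothetical function name. It assumes GTF attributes look like key "value" and GFF3 attributes look like key=value, and it reports "unknown" for anything else (for example unquoted GFF2-style values).

```python
import re

GTF_FIELD = re.compile(r'^\S+ "[^"]*"$')   # e.g. gene_id "g1"
GFF3_FIELD = re.compile(r'^[^=\s]+=.*$')   # e.g. ID=gene1


def guess_attribute_style(column9):
    """Guess whether an attribute column is written GTF-style or GFF3-style."""
    fields = [f.strip() for f in column9.split(";") if f.strip()]
    if not fields:
        return "unknown"
    if all(GTF_FIELD.match(f) for f in fields):
        return "gtf"
    if all(GFF3_FIELD.match(f) for f in fields):
        return "gff3"
    return "unknown"


if __name__ == "__main__":
    print(guess_attribute_style('gene_id "g1"; transcript_id "t1";'))
    print(guess_attribute_style("ID=exon1;Parent=tx1"))
```

This only inspects attribute syntax. It does not verify that the required gene_id and transcript_id keys are present.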

written 3.2 years ago by bhanratt40
11 months ago, Juke34 (5.2k) wrote:

I made a mini review of existing tools. See here.

written 11 months ago by Juke34 (5.2k)