Question: Script for getting summary statistics of any genome using GTF or GFF3?
5.4 years ago, vahapel (190) wrote:

Hi All,

I am looking for a script that computes summary statistics (number of transcripts, number of bases, feature lengths, intron lengths, etc.) from GTF/GFF3 file(s) and a genome.

Thank you for your help!

sequence gene genome • 5.1k views
5.4 years ago, Juke34 (5.1k) wrote:

There are several solutions for that (updated to put everything in one place):

  • In Perl, I use AGAT, a GFF toolkit. See here for an example of the output. This solution has the advantage of working with any GTF/GFF flavor (even unsorted files and files with errors).

  • In Python, GAG is a good solution for this purpose. From a directory containing your genome (genome.fasta) and your annotation (genome.gff), launch GAG, then load the files by typing "load" (by default it looks for genome.fasta and genome.gff), and finally type "info" to get complete summary statistics for your annotation. It works perfectly well with the GFF3 format.

  • In Perl+bash there is GFF-Ex. When I tried it, it didn't work for me (maybe due to the specific GFF flavour I was using).

  • In Bash using awk or grep commands

  • There are solutions in R, see here for an example.

  • Using GenomeTools with the command gt stat

  • bedtools

  • gffutils
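
For readers who prefer a dependency-free route, the awk/grep idea above can be sketched in plain Python. This is only a minimal illustration, not how any of the listed tools work: the function name `gff3_summary` is hypothetical, it assumes well-formed GFF3 with nine tab-separated columns and `Parent=` attributes on exon lines, it does not handle GTF attribute syntax, and it does not use the genome sequence at all. Introns are inferred as the gaps between consecutive exons of the same parent transcript.

```python
from collections import defaultdict


def gff3_summary(lines):
    """Summarise a GFF3 annotation given as an iterable of text lines."""
    counts = defaultdict(int)      # feature type -> number of features
    total_len = defaultdict(int)   # feature type -> summed length in bp
    exons = defaultdict(list)      # parent transcript ID -> [(start, end), ...]

    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("##FASTA"):
            break                  # embedded sequence follows; stop parsing
        if not line or line.startswith("#"):
            continue               # skip blank lines, comments and directives
        cols = line.split("\t")
        if len(cols) != 9:
            continue               # skip malformed lines
        ftype, start, end = cols[2], int(cols[3]), int(cols[4])
        counts[ftype] += 1
        total_len[ftype] += end - start + 1  # GFF coordinates are 1-based, inclusive
        if ftype == "exon":
            attrs = dict(
                field.strip().split("=", 1)
                for field in cols[8].split(";")
                if "=" in field
            )
            # An exon may list several parents, separated by commas.
            for parent in attrs.get("Parent", "").split(","):
                if parent:
                    exons[parent].append((start, end))

    # Introns are taken as the gaps between consecutive exons of a transcript.
    intron_lengths = []
    for spans in exons.values():
        spans.sort()
        for (_, prev_end), (next_start, _) in zip(spans, spans[1:]):
            gap = next_start - prev_end - 1
            if gap > 0:
                intron_lengths.append(gap)

    return {
        "counts": dict(counts),
        "mean_length": {t: total_len[t] / counts[t] for t in counts},
        "intron_count": len(intron_lengths),
        "mean_intron_length": (
            sum(intron_lengths) / len(intron_lengths) if intron_lengths else 0.0
        ),
    }


if __name__ == "__main__":
    # Tiny made-up annotation: one gene, one transcript, two exons.
    demo = [
        "##gff-version 3",
        "chr1\tdemo\tgene\t1\t1000\t.\t+\t.\tID=g1",
        "chr1\tdemo\tmRNA\t1\t1000\t.\t+\t.\tID=t1;Parent=g1",
        "chr1\tdemo\texon\t1\t200\t.\t+\t.\tID=e1;Parent=t1",
        "chr1\tdemo\texon\t501\t1000\t.\t+\t.\tID=e2;Parent=t1",
    ]
    print(gff3_summary(demo))
```

To run it on a real file, pass an open file handle, for example `gff3_summary(open("genome.gff"))`. For messy or non-standard files, a dedicated tool such as AGAT is likely the safer choice.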

Related posts:
A: Analysis gff3 file
Plot statistics from gtf/gff file

Modified 15 months ago • written 5.4 years ago by Juke34 (5.1k)

Hi Juke-34, thank you for introducing "GAG" to me; it is perfectly suited to the project.

Reply written 5.4 years ago by vahapel (190)
5.4 years ago, h.mon (32k) wrote:

bedtools probably does a lot of what you want; have a look at its documentation and usage examples.


Dear h.mon,

BedTools is perfect in many respects. During my search for GFF3 parsing, I encountered some very useful tools;

It can also be useful for GFF parsing (it probably works for GTF if small changes are made).

Reply written 5.4 years ago by vahapel (190)