How to get chromosome lengths from a .gff3 file?
4.6 years ago

Hello! I am super new at this and I am trying to determine the length of all my chromosomes within a .gff3 file.

I have tried:

cut -f1 [file].gff3 | sort -k1,1

But, of course that only gives me the chromosome number in order.

cut -f1 -f4 -f5 [FILE].gff3 | sort -k1,1

But, I get an error when doing this. I believe (I am also new at .gff3 files) that F1 is my chromosome number, F4 is my start position and F5 is my stop position.

Could anyone help me?

4.6 years ago
h.mon 35k

The most common use of gff3 files is to encode features pertaining to a genome annotation, such as gene, three_prime_UTR, mRNA, exon, and so on. A gff3 can encode additional features, such as "chromosome", but it is uncommon. So, to get the length of all chromosomes, you would need to select the rows with "chromosome" in the third column and print the end position (fifth column). Since the start position of a chromosome will always be 1, the end position is the chromosome length. Something like this should work:

awk -F'\t' '$3 == "chromosome" { print $1 "\t" $5 }' file.gff3
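
If the file has no "chromosome" features, two other approaches may help. This is only a rough sketch, and the function names lengths_from_pragma and max_end_per_seqid are made up for illustration. The first assumes the file carries the optional "##sequence-region seqid start end" header lines described in the GFF3 specification, which many files do not have. The second reports the largest end coordinate seen for each sequence ID, which is only a lower bound on the real chromosome length, since the last feature rarely ends at the last base.

```shell
# Hypothetical helpers; pass the path to your own .gff3 file as the argument.

# 1) Use the "##sequence-region seqid start end" pragma lines, if the file has them.
#    Prints: seqid <TAB> end coordinate.
lengths_from_pragma() {
    awk '$1 == "##sequence-region" { print $2 "\t" $4 }' "$1"
}

# 2) Fallback: the largest end position (column 5) among all features on each
#    sequence. This is a lower bound on the length, not the exact length.
max_end_per_seqid() {
    awk -F'\t' '!/^#/ && NF >= 5 { if ($5 + 0 > max[$1] + 0) max[$1] = $5 }
                END { for (s in max) print s "\t" max[s] }' "$1" | sort -k1,1
}
```

For example, after pasting the functions into your shell, run lengths_from_pragma file.gff3 or max_end_per_seqid file.gff3. If you have the genome FASTA the annotation was built on, taking the lengths from that is more reliable than either of these.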