Question: How to get chromosome lengths from a .gff3 file?
10 months ago by
readeatsleep wrote:

Hello! I am super new at this and I am trying to determine the length of all my chromosomes within a .gff3 file.

I have tried:

cut -f1 [file].gff3 | sort -k1,1

But, of course, that only gives me the chromosome numbers in sorted order.

cut -f1 -f4 -f5 [FILE].gff3 | sort -k1,1

But I get an error when doing this. I believe (I am also new to .gff3 files) that field 1 is my chromosome number, field 4 is my start position and field 5 is my stop position.

Could anyone help me?

10 months ago by
h.mon wrote:

The most common use of gff3 files is to encode features of a genome annotation, such as gene, three_prime_UTR, mRNA, exon, and so on. A gff3 file can also encode whole-sequence features, such as "chromosome", but this is uncommon. So, to get the length of all chromosomes, you would need to select the rows that have chromosome in the third column. The start position of a chromosome will always be 1, so its end position (fifth column) is its length. If your file contains such rows, something like this should work:

awk -F'\t' '{ if ($3 == "chromosome") print $1"\t"$5 }' file.gff3
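
Since many gff3 files carry no chromosome rows at all, here is a rough sketch of two fallbacks, assuming a tab-delimited file. The GFF3 specification defines an optional "##sequence-region seqid start end" pragma, which some files use to declare sequence bounds; the first helper reads that pragma together with any chromosome rows. The second helper takes the largest end coordinate per sequence, which is only a lower bound on the true length. The function names chrom_lengths and chrom_max_end are made up for illustration.

```shell
#!/bin/sh
# Sketch only: chrom_lengths and chrom_max_end are hypothetical helper names.

# Print "seqid<TAB>length" from explicit declarations in a GFF3 file:
# "##sequence-region seqid start end" pragmas and "chromosome" feature rows.
chrom_lengths() {
    awk -F'\t' '
        /^##sequence-region/ {
            split($0, a, /[ \t]+/)   # a[2]=seqid, a[3]=start, a[4]=end
            print a[2] "\t" a[4]
            next
        }
        /^#/ { next }                # skip other comments and pragmas
        $3 == "chromosome" { print $1 "\t" $5 }
    ' "$1" | sort -u
}

# Fallback: largest end coordinate seen per seqid. This is only a lower
# bound, because the last annotated feature may stop before the chromosome end.
chrom_max_end() {
    awk -F'\t' '
        /^#/ || NF < 5 { next }      # skip comments and non-feature lines
        $5 + 0 > max[$1] { max[$1] = $5 + 0 }
        END { for (c in max) print c "\t" max[c] }
    ' "$1" | sort -k1,1
}
```

Usage would be "chrom_lengths file.gff3" or "chrom_max_end file.gff3". If neither declaration is present and exact lengths matter, the genome FASTA that the annotation was built on is the more reliable source. As for the cut attempt in the question, cut expects one comma-separated field list, so "cut -f1,4,5 file.gff3" avoids the error.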