Question: How to get chromosome lengths from a .gff3 file?
11 days ago, readeatsleep wrote:

Hello! I am super new at this and I am trying to determine the length of all my chromosomes within a .gff3 file.

I have tried:

cut -f1 [file].gff3 | sort -k1,1

But of course, that only gives me the chromosome names in sorted order.

cut -f1 -f4 -f5 [FILE].gff3 | sort -k1,1

But I get an error when doing this. I believe (I am also new to .gff3 files) that field 1 is my chromosome, field 4 is my start position, and field 5 is my stop position.

Could anyone help me?

11 days ago, h.mon wrote:

The most common use of gff3 files is to encode features pertaining to a genome annotation, such as gene, three_prime_UTR, mRNA, exon, and so on. A gff3 can encode additional features, such as "chromosome", but it is uncommon. So, to get the length of all chromosomes, you would need to select the rows with "chromosome" in the third column and print the end position (fifth column). Since the start position of a chromosome feature will always be 1, the end position is the chromosome length. Something like this should work:

awk -F'\t' '{ if($3 == "chromosome") print $1"\t"$5 }' file.gff3
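
If the file has no "chromosome" features at all, two fallbacks may help. This is only a rough sketch: the function names below are made up for illustration, and the file is assumed to be tab-delimited. The first takes the largest end coordinate seen for each sequence ID, which is only a lower bound on the true length because annotations rarely reach the very end of a chromosome. The second reads the optional "##sequence-region seqid start end" header lines that some gff3 files carry.

```shell
# Hypothetical helper: largest end coordinate (column 5) per sequence ID (column 1).
# Lines starting with "#" (headers, comments) are skipped.
max_end_per_seq() {
  awk -F'\t' '!/^#/ && NF >= 5 { if ($5 + 0 > max[$1]) max[$1] = $5 + 0 }
              END { for (id in max) print id "\t" max[id] }' "$1" | sort -k1,1
}

# Hypothetical helper: lengths taken from "##sequence-region seqid start end" lines,
# if the file has any. Prints nothing when no such lines exist.
region_lengths() {
  awk '/^##sequence-region/ { print $2 "\t" ($4 - $3 + 1) }' "$1"
}
```

Call them as max_end_per_seq file.gff3 or region_lengths file.gff3. As for the error in the question: cut accepts only one field list, so several fields have to be given comma-separated, as in -f1,4,5.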