Question: Where to get gene start and end positions for all genes in an organism
0
4.5 years ago by
Bioblazer50
Pune
Bioblazer50 wrote:

In a transcript file (.gtf) we can get exon-wise start and stop positions, as well as the CDS, start codon and stop codon regions, but I want the start and end position of every gene (not exon-wise), together with its chromosome number, for an organism.

genome gene • 2.1k views
modified 4.5 years ago by EagleEye6.7k • written 4.5 years ago by Bioblazer50
3
4.5 years ago by
Devon Ryan97k
Freiburg, Germany
Devon Ryan97k wrote:

Normally there's a "gene" entry for each gene, so:

awk 'BEGIN{FS="\t"; OFS="\t"}{if($3 == "gene") print $1, $4, $5}' foo.gtf
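
If the gene identifier and strand are needed as well, a small variant of the one-liner above can pull `gene_id` out of the attribute column. This is only a sketch: it assumes column 9 uses the usual `gene_id "..."` GTF syntax, and the function name `gene_coords` is made up for illustration.

```shell
# Print chromosome, start, end, strand and gene_id for every "gene" record.
# Assumes column 9 contains an attribute of the form: gene_id "SOME_ID";
gene_coords() {
  awk 'BEGIN { FS = "\t"; OFS = "\t" }
       $3 == "gene" {
         id = "NA"   # fallback when no gene_id attribute is found
         # match() sets RSTART/RLENGTH; skip the 9 characters of: gene_id "
         # and drop the closing quote
         if (match($9, /gene_id "[^"]+"/))
           id = substr($9, RSTART + 9, RLENGTH - 10)
         print $1, $4, $5, $7, id
       }' "$1"
}
```

Usage would be something like `gene_coords foo.gtf > gene_positions.tsv`; header lines starting with `#` are skipped automatically because their third column is empty.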
modified 4.5 years ago • written 4.5 years ago by Devon Ryan97k

Thank you very much for your kind reply

written 4.5 years ago by Bioblazer50
2
4.5 years ago by
EagleEye6.7k
Sweden
EagleEye6.7k wrote:

As Devon said, if the GTF file is from Gencode, you will have a "gene" entry for each gene, and extracting those "gene" entries will give you the desired result. The Ensembl GTF annotation for hg19 (up to GRCh37.74) does not have "gene" entries. In that case, use the following script.

The output will look similar to:

[image not available]
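
The script itself is missing from this copy of the thread. As a rough sketch of the general idea, and not EagleEye's script, gene boundaries can be derived from a GTF without "gene" records by taking the smallest start and the largest end over all exon lines that share a `gene_id`. The function name `gene_spans_from_exons` is hypothetical, and the sketch assumes that every exon line carries a `gene_id "..."` attribute and that a given gene ID appears on only one chromosome.

```shell
# Collapse exon records into one line per gene:
# chromosome, smallest exon start, largest exon end, strand, gene_id.
gene_spans_from_exons() {
  awk 'BEGIN { FS = "\t"; OFS = "\t" }
       $3 == "exon" && match($9, /gene_id "[^"]+"/) {
         id = substr($9, RSTART + 9, RLENGTH - 10)
         if (!(id in lo)) {               # first exon seen for this gene
           chrom[id] = $1; strand[id] = $7
           lo[id] = $4 + 0; hi[id] = $5 + 0
         } else {                         # widen the span if needed
           if ($4 + 0 < lo[id]) lo[id] = $4 + 0
           if ($5 + 0 > hi[id]) hi[id] = $5 + 0
         }
       }
       END {
         for (id in lo) print chrom[id], lo[id], hi[id], strand[id], id
       }' "$1" | sort -k1,1 -k2,2n        # order by chromosome, then start
}
```

Calling `gene_spans_from_exons foo.gtf` prints one tab-separated line per gene. Only exon lines are used, so CDS, start_codon and stop_codon records do not affect the result.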

modified 4.5 years ago • written 4.5 years ago by EagleEye6.7k

Thank you so much for your valuable reply.

written 4.5 years ago by Bioblazer50