Question: How to extract coordinates of first exons from .gtf file.
wanziyi89 (Singapore, Temasek Life Sciences Laboratory) wrote 3.8 years ago:

Hi all,

Suppose I have a .gtf file containing the exons of all genes in a given genome. I would like to extract the coordinates of the first exon of each gene from that file. How should I get started?

Regards,
Ziyi

rna-seq exon genome gtf • 2.6k views
geek_y (Barcelona) wrote 3.8 years ago:

Since your question is about getting started, I would suggest looking at code snippets or libraries that parse GTF files to get an idea of how to handle one.

For example: 

http://www-huber.embl.de/users/anders/HTSeq/doc/tour.html#tour 

https://github.com/ctokheim/PrimerSeq/blob/master/gtf.py

But a quick and dirty way would be:

curl https://raw.githubusercontent.com/roryk/DEXSeq/master/inst/python_scripts/dexseq_prepare_annotation.py | python - genes.gtf out.tmp

grep 'exonic_part_number "001"' out.tmp | less -S

This lists the first exonic part (number 001) of each gene, assuming a standard GTF file format.
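If you would rather not depend on the DEXSeq script, the same idea can be sketched in plain Python. This is only a minimal sketch under some assumptions: the file is tab-separated with exon records in column 3, every exon line carries a `gene_id "..."` attribute, and "first exon" means the 5'-most exon of the gene across all of its transcripts (lowest start on the plus strand, highest end on the minus strand). The function name `first_exons` is made up for illustration. Also note that, as far as I can tell, the DEXSeq script numbers exonic parts by ascending genomic coordinate, so for minus-strand genes part 001 may not be the transcriptional first exon; the sketch below takes strand into account.

```python
import re

# Matches the gene_id attribute in column 9 of a GTF line.
GENE_ID_RE = re.compile(r'gene_id "([^"]+)"')


def first_exons(gtf_lines):
    """Return {gene_id: (chrom, start, end, strand)} for the 5'-most exon.

    Assumption: "first exon" is taken per gene, over all transcripts.
    Coordinates are kept 1-based and inclusive, as in the GTF itself.
    """
    best = {}
    for line in gtf_lines:
        # Skip blank lines and comment/header lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue
        chrom, strand = fields[0], fields[6]
        start, end = int(fields[3]), int(fields[4])
        match = GENE_ID_RE.search(fields[8])
        if match is None:
            continue
        gene_id = match.group(1)
        current = best.get(gene_id)
        if current is None:
            best[gene_id] = (chrom, start, end, strand)
        elif strand == "-":
            # Minus strand: the 5' end is the highest coordinate.
            if end > current[2]:
                best[gene_id] = (chrom, start, end, strand)
        elif start < current[1]:
            # Plus (or unknown) strand: the 5' end is the lowest coordinate.
            best[gene_id] = (chrom, start, end, strand)
    return best


# Example usage (genes.gtf is the file name used in the commands above):
# with open("genes.gtf") as handle:
#     for gene, (chrom, start, end, strand) in first_exons(handle).items():
#         print(gene, chrom, start, end, strand, sep="\t")
```

If you need the first exon of each transcript instead, the same loop should work with `transcript_id` in place of `gene_id`.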


Thank you! HTSeq seems promising, and I think I found some leads in the TSS plot.

written 3.8 years ago by wanziyi89

I updated my answer. Please accept it if it works for you.

written 3.8 years ago by geek_y