Coordinates for genomic features?
3.6 years ago
Ankit ▴ 500

Hi Everyone,

Can anyone help me get the coordinates of genomic features for hg19, for example genes, exons, introns, 5' UTRs, 3' UTRs, and promoters?

For genes and exons I can get them from a GTF file, right? The issue is with the other features.

Please suggest.

Thank you

gene utr promoter intron exon • 1.7k views

Hi

thanks

Good suggestion.

How about the UCSC Table Browser? https://genome.ucsc.edu/cgi-bin/hgTables

Do you think it provides the desired coordinates correctly?

I still don't know about promoters, so I thought of taking 200 bp upstream of the gene start. Does that make sense as an approximate promoter region?
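
A fixed upstream window only approximates a promoter, and it has to be strand-aware: for a gene on the minus strand the transcription start is the gene's end coordinate, so "upstream" means higher coordinates. A minimal shell sketch, assuming an Ensembl-style tab-separated GTF with `gene` rows and a `gene_id` attribute; the function name `upstream_window_bed` is made up for illustration:

```shell
# upstream_window_bed (hypothetical helper): read a GTF on stdin and write
# BED6 windows of the given size immediately upstream of each gene start site.
# Usage: upstream_window_bed [window_bp] < annotation.gtf > windows.bed
upstream_window_bed() {
  awk -F'\t' -v w="${1:-200}" 'BEGIN { OFS = "\t" }
    /^#/ || $3 != "gene" { next }            # skip header lines and non-gene rows
    {
      # GTF is 1-based inclusive; BED is 0-based half-open.
      if ($7 == "+")      { start = $4 - 1 - w; end = $4 - 1 }  # TSS is the gene start
      else if ($7 == "-") { start = $5;         end = $5 + w }  # TSS is the gene end
      else next                               # no strand, no defined upstream side
      if (start < 0) start = 0                # clip at the chromosome start
      if (end <= start) next
      id = "."                                # fall back to "." when gene_id is absent
      if (match($9, /gene_id "[^"]+"/)) id = substr($9, RSTART + 9, RLENGTH - 10)
      print $1, start, end, id, ".", $7
    }'
}
```

It could be called as `upstream_window_bed 200 < annotation.gtf > upstream_200bp.bed`, where both file names are placeholders. The window is not clipped at the chromosome end, since chromosome lengths are not in the GTF.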


This should be a comment on my original answer; that keeps the thread organized.

I am not entirely sure about the UCSC Table Browser, but it looks like you might be able to retrieve some of the data you want.

Sorry, I missed the promoter part of your original question. Ensembl has great resources that can help you get at this. Take a look at this Biostars post and this documentation. Based on these, you can download the human Regulatory Build from the FTP site.

If you run this command on the downloaded file, you can see which feature types it contains; it seems like a great resource.

awk '{print $3}' homo_sapiens.GRCh38.Regulatory_Build.regulatory_features.20190329.gff | sort | uniq

CTCF_binding_site
TF_binding_site
enhancer
open_chromatin_region
promoter
promoter_flanking_region
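
To pull just one of these feature types into BED, something like the following sketch might work. It assumes the file is tab-separated GFF with the feature type in column 3; the function name `regulatory_type_bed` is made up for illustration. Note that the coordinates in this particular file are GRCh38, not hg19.

```shell
# regulatory_type_bed (hypothetical helper): read a GFF on stdin and write
# chrom, start, end, type for rows whose third column equals the requested
# feature type (default: promoter).
regulatory_type_bed() {
  awk -F'\t' -v type="${1:-promoter}" 'BEGIN { OFS = "\t" }
    /^#/ { next }                              # skip header or comment lines
    # GFF is 1-based inclusive; BED is 0-based half-open, so shift the start.
    $3 == type { print $1, $4 - 1, $5, $3 }
  '
}
```

For example: `regulatory_type_bed promoter < homo_sapiens.GRCh38.Regulatory_Build.regulatory_features.20190329.gff > promoters.bed`.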

Hope this helps!

3.6 years ago
jkkbuddika ▴ 190

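You can get most of these coordinates (genes, exons, UTRs) from a GTF file downloaded from Ensembl. The example below uses the GRCh38 release 101 file; for hg19 you would need the corresponding GRCh37 GTF instead.

Download the GTF file and then list the feature types it contains: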

awk '!/^#/ {print $3}' Homo_sapiens.GRCh38.101.gtf | sort | uniq

CDS
exon
five_prime_utr
gene
Selenocysteine
start_codon
stop_codon
three_prime_utr
transcript

So if you want to get information about 5'-UTRs, you can run this:

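awk 'BEGIN{FS=OFS="\t"} $3 == "five_prime_utr" {print $1, $4-1, $5, $3, ".", $7}' Homo_sapiens.GRCh38.101.gtf > 5utr.bed

This writes the 5'-UTR rows as a six-column BED file. The start is shifted by one because GTF coordinates are 1-based and inclusive, while BED starts are 0-based; simply redirecting the matching GTF rows to a .bed file would not give valid BED. Take a look at this Biostars post for a nice description of how to obtain intronic/intergenic coordinates. Hope this helps.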
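
For introns specifically, one common approach is to take the gaps between consecutive exons of each transcript. This is a rough sketch of that idea, not necessarily what the linked post does. It assumes a tab-separated GTF whose `exon` rows carry a `transcript_id` attribute; the function name `introns_bed` is made up for illustration:

```shell
# introns_bed (hypothetical helper): read a GTF on stdin and write BED6
# introns, defined here as gaps between consecutive exons of one transcript.
introns_bed() {
  # Step 1: reduce exon rows to transcript_id, chrom, start, end, strand.
  awk -F'\t' 'BEGIN { OFS = "\t" }
    !/^#/ && $3 == "exon" && match($9, /transcript_id "[^"]+"/) {
      tid = substr($9, RSTART + 15, RLENGTH - 16)
      print tid, $1, $4, $5, $7
    }' |
  # Step 2: group by transcript and order exons by start coordinate.
  sort -t "$(printf '\t')" -k1,1 -k3,3n |
  # Step 3: emit the gap between each exon and the previous one.
  awk -F'\t' 'BEGIN { OFS = "\t" }
    $1 == prev && $3 - 1 > prev_end {
      # Exon end (1-based inclusive) equals the BED start of the intron.
      print $2, prev_end, $3 - 1, $1, ".", $5
    }
    { prev = $1; prev_end = $4 }'
}
```

For example: `introns_bed < Homo_sapiens.GRCh38.101.gtf > introns.bed`. Introns shared by several transcripts appear once per transcript, so the output may need deduplicating or merging depending on the use.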
