Retrieving Features From Certain Position On Genome
10.0 years ago
vaushev ▴ 10

I would be grateful if anyone could point me to the most appropriate way of querying a genome database (NCBI's, UCSC's, whichever) to get the features located at my position of interest. By features, I mean I want to know whether the position falls in an exon, intron, UTR or something else. So far, I have tried querying UCSC's DAS server with something like this: http://genome.ucsc.edu/cgi-bin/das/hg19/features?segment=10:7866279,7866281;type=refGene. This gives me some info, but not exactly the features that I need...

10.0 years ago
brentp 24k

You can do this with cruzdb in Python:

from cruzdb import Genome
# Label what the region overlaps in each refGene transcript found there
hits = Genome('hg19').bin_query('refGene', 'chr10', 7866279, 7866281)
print([x.features(7866279, 7866281) for x in hits])


This prints:

[['intron', 'cds']]


This means the region covers an intron and a CDS. If you pull up this region in the genome browser, you can see that it spans an intron-exon junction.
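For anyone curious what such a lookup boils down to, here is a minimal sketch in plain Python. It is not cruzdb's actual implementation. It assumes a refGene-style record with 0-based, half-open coordinates (txStart, txEnd, cdsStart, cdsEnd, exonStarts, exonEnds). The `classify` function and the toy transcript are made up for illustration, and the sketch ignores strand, so it reports a generic "utr" rather than 5' or 3' (exons of a non-coding transcript would also come out as "utr").

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if half-open intervals [a_start, a_end) and [b_start, b_end) overlap."""
    return a_start < b_end and b_start < a_end


def classify(start, end, tx):
    """Return the sorted feature labels that [start, end) overlaps in transcript tx."""
    if not overlaps(start, end, tx["txStart"], tx["txEnd"]):
        return ["intergenic"]
    labels = set()
    exons = sorted(zip(tx["exonStarts"], tx["exonEnds"]))
    for ex_start, ex_end in exons:
        if not overlaps(start, end, ex_start, ex_end):
            continue
        # Coding part of this exon: the exon clipped to [cdsStart, cdsEnd)
        cds_s = max(ex_start, tx["cdsStart"])
        cds_e = min(ex_end, tx["cdsEnd"])
        if cds_s < cds_e and overlaps(start, end, cds_s, cds_e):
            labels.add("cds")
        # Non-coding exon sequence before the CDS start
        if ex_start < tx["cdsStart"] and overlaps(
            start, end, ex_start, min(ex_end, tx["cdsStart"])
        ):
            labels.add("utr")
        # Non-coding exon sequence after the CDS end
        if ex_end > tx["cdsEnd"] and overlaps(
            start, end, max(ex_start, tx["cdsEnd"]), ex_end
        ):
            labels.add("utr")
    # Introns are the gaps between consecutive exons
    for (_, prev_end), (next_start, _) in zip(exons, exons[1:]):
        if overlaps(start, end, prev_end, next_start):
            labels.add("intron")
    return sorted(labels)


# Hypothetical transcript, coordinates invented for the example
toy = {
    "txStart": 100, "txEnd": 1000,
    "cdsStart": 150, "cdsEnd": 900,
    "exonStarts": [100, 400, 800],
    "exonEnds": [200, 500, 1000],
}

print(classify(198, 202, toy))  # ['cds', 'intron']
```

The query 198-202 straddles the end of the first toy exon, which is the same kind of exon-intron junction as in the answer above.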

10.0 years ago
biorepine ★ 1.5k

Try Galaxy

2. Select your desired genome assembly and paste the coordinates using the "region" - "position" options.
3. Click "get output"
4. You will see the "Create one BED record per:" option. This allows you to extract the UTR, exonic and intronic sequences of your input.
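Once you have the BED output, checking which records cover a position takes only a few lines. The following is a rough sketch, assuming plain whitespace-separated BED with at least three columns. The `bed_hits` helper, the record names and the coordinates are all invented for illustration and are not real output from Galaxy or UCSC. The main thing to watch is that BED starts are 0-based and half-open, while positions as shown in the genome browser are 1-based.

```python
def bed_hits(bed_text, chrom, pos):
    """Yield (name, start, end) for BED records covering the 1-based position pos."""
    for line in bed_text.splitlines():
        # Skip blank lines and BED header/comment lines
        if not line.strip() or line.startswith(("track", "browser", "#")):
            continue
        fields = line.split()
        rec_chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        name = fields[3] if len(fields) > 3 else "."
        # BED is 0-based, half-open: 1-based pos is covered when start < pos <= end
        if rec_chrom == chrom and start < pos <= end:
            yield name, start, end


# Made-up records standing in for "one BED record per intron / exon" output
example_bed = """track name=example
chr10\t7866000\t7866280\tintron_example
chr10\t7866280\t7866400\texon_example
"""

for hit in bed_hits(example_bed, "chr10", 7866281):
    print(hit)
```

Running the same check against the intron-only and exon-only outputs in turn tells you which feature type the position belongs to.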
I think you should be more specific than that, perhaps by indicating the workflow and tools that would solve the problem.

Well, they seem to have a huge number of tools. Could you give a bit more detail? Thanks!

It is easier to read the basic Galaxy tutorial than to write up all the steps, so I would suggest going through it. Anyway, I have updated my answer.

10.0 years ago
ff.cc.cc ★ 1.3k

The UCSC Table Browser (with the Known Genes table selected) lets you define detailed regions and check them for the presence of exons, introns and CDS.