Retrieve Genomic Ranges From Ensembl Genomes Using Gene Names
12.2 years ago
Anima Mundi ★ 2.9k

Hello, I would like to extract the genomic coordinates of a set of genes that share a common substring in their gene names. For example, given the name "tubulin", the method should output the ranges (e.g. in BED format) of all the genes in a genome (e.g. mouse) whose names contain "tubulin" (e.g. alpha, beta, gamma). How could I proceed?

ensembl genome coordinates bed • 2.8k views
12.2 years ago
Leszek 4.2k

Have you tried BioMart? I think this is what you are looking for :)


Thank you too, Leszek.

12.2 years ago

I would try either:

  • BioMart: export all mouse genes (with the Description, Start and End attributes, plus the chromosome name that BED requires), then parse the output to create a BED file containing only the 'tubulin' genes with their positions on the genome.

or

  • UCSC Genome Browser: export all mouse genes (say, RefSeq Genes), then retrieve the complete annotations for all of their accession numbers with Batch Entrez (http://www.ncbi.nlm.nih.gov/sites/batchentrez). Parse both files to create your final BED file.
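
The parsing step of the BioMart route could be sketched roughly as follows in Python. This is only a minimal sketch under assumptions: it assumes the export was saved as tab-separated text with a header row, and that the columns are called "Gene name", "Gene description", "Chromosome/scaffold name", "Gene start (bp)", "Gene end (bp)" and "Strand" (adjust these to whatever your export actually contains). The function name `biomart_tsv_to_bed` and the example rows are made up for illustration, and the coordinates are placeholders, not real annotations. Ensembl positions are 1-based and inclusive while BED is 0-based and half-open, so the start is shifted by one.

```python
import csv
import io


def biomart_tsv_to_bed(tsv_text, keyword):
    """Return BED6 lines for genes whose name or description contains keyword.

    Hypothetical helper: the column names below are assumptions about the
    header row of the BioMart export.
    """
    keyword = keyword.lower()
    bed_lines = []
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    for row in reader:
        searchable = (row["Gene name"] + " " + row["Gene description"]).lower()
        if keyword not in searchable:
            continue
        # Convert the 1-based inclusive start to a 0-based BED start.
        start0 = int(row["Gene start (bp)"]) - 1
        # Ensembl encodes strand as 1 / -1; BED uses + / -.
        strand = "+" if row["Strand"] == "1" else "-"
        bed_lines.append("\t".join([
            row["Chromosome/scaffold name"],
            str(start0),
            row["Gene end (bp)"],
            row["Gene name"],
            "0",  # BED score column, unused here
            strand,
        ]))
    return bed_lines


# Placeholder rows: gene names are only examples, coordinates are invented.
EXAMPLE_TSV = (
    "Gene name\tGene description\tChromosome/scaffold name\t"
    "Gene start (bp)\tGene end (bp)\tStrand\n"
    "Tuba1a\ttubulin, alpha 1A\t15\t1001\t2000\t-1\n"
    "Actb\tactin, beta\t5\t3001\t4000\t-1\n"
    "Tubb5\ttubulin, beta 5 class I\t17\t5001\t6000\t1\n"
)

if __name__ == "__main__":
    for line in biomart_tsv_to_bed(EXAMPLE_TSV, "tubulin"):
        print(line)
```

For a real export you would read the file contents and pass them in place of `EXAMPLE_TSV`. Note that Ensembl chromosome names have no "chr" prefix, so you may need to add one if the BED file is meant for UCSC tools.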

This is what I did, thank you!
