Question: Want to download the proximal promoter region of my favorite gene from every species in Ensembl
TEF (Bodø, Norway), 9 days ago, wrote:

Are there any ready-to-use scripts available that would allow me to download a ~2000 bp region immediately upstream of the 5-prime end of any given gene using BioMart? I would like to automate the process such that I can target the proximal promoter of my favorite gene in all the species in Ensembl. I suppose the place to begin is to install BioMart on my computer?

Any words of advice?

Cheers

TEF

Emily_Ensembl (EMBL-EBI), 7 days ago, wrote:

I would probably use the REST API rather than BioMart. There's an online course, with Jupyter notebooks, to get you started with REST in Python, Perl or R.

Starting with your favourite gene, use the homology endpoint to get all the orthologues. Pull out the Ensembl ID for each one and use the lookup endpoint to get its coordinates and strand. A little arithmetic on those gives you the coordinates of the upstream region, which you can pass to the sequence region endpoint. Alternatively, you could use each Ensembl ID directly with the sequence ID endpoint and request genomic sequence with expand_5prime, but that would also return the genomic sequence of the gene itself.
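As a rough illustration of the homology, lookup and sequence region route, here is a minimal Python sketch using only the standard library. The function names are made up for this example. The endpoint paths (in particular /homology/id/:species/:id, which has changed between REST releases), the condensed homology format and the JSON field names reflect my reading of the Ensembl REST documentation, so please check them against the current docs before relying on this. The strand handling is the "arithmetic" step: on the reverse strand the 5-prime end of the gene is its end coordinate, not its start.

```python
import json
import urllib.request

SERVER = "https://rest.ensembl.org"  # public Ensembl REST server


def upstream_region(start, end, strand, length=2000):
    """Return (start, end) of the `length` bp immediately 5' of a gene.

    Coordinates are 1-based and inclusive, as in Ensembl.
    """
    if strand == 1:
        # Forward strand: the 5' end is `start`, so upstream = lower coordinates.
        if start <= 1:
            raise ValueError("gene starts at position 1; nothing upstream")
        return max(1, start - length), start - 1
    if strand == -1:
        # Reverse strand: the 5' end is `end`, so upstream = higher coordinates.
        # This is not clamped to the sequence length, which lookup does not report.
        return end + 1, end + length
    raise ValueError(f"strand must be 1 or -1, got {strand!r}")


def region_string(seq_region_name, start, end, strand):
    """Format a region as name:start..end:strand for the sequence region endpoint."""
    return f"{seq_region_name}:{start}..{end}:{strand}"


def get_json(path):
    """GET a REST path and decode the JSON response (assumed helper)."""
    req = urllib.request.Request(
        SERVER + path, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def orthologue_ids(species, gene_id):
    """Ensembl gene IDs of all orthologues (path and fields are assumptions)."""
    data = get_json(
        f"/homology/id/{species}/{gene_id}?type=orthologues;format=condensed"
    )
    return [h["id"] for h in data["data"][0]["homologies"]]


def promoter_for_gene(gene_id, length=2000):
    """Look up one gene and fetch the sequence upstream of its 5' end."""
    gene = get_json(f"/lookup/id/{gene_id}")
    start, end = upstream_region(gene["start"], gene["end"], gene["strand"], length)
    # Requesting the gene's own strand returns the sequence in gene orientation.
    region = region_string(gene["seq_region_name"], start, end, gene["strand"])
    seq = get_json(f"/sequence/region/{gene['species']}/{region}")
    return gene["species"], region, seq["seq"]


def promoters_for_orthologues(species, gene_id, length=2000):
    """Yield (species, region, sequence) for the query gene and each orthologue."""
    for ensembl_id in [gene_id] + orthologue_ids(species, gene_id):
        yield promoter_for_gene(ensembl_id, length)
```

To try it, loop over promoters_for_orthologues("human", your_gene_id) and write each result out as FASTA. Two caveats: a gene close to the end of a short scaffold can give a region that runs off the sequence, in which case the request will fail and needs catching, and the public server is rate limited, so a short pause between requests is sensible for a large gene family.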

Nicolas Rosewick (Belgium, Brussels), 9 days ago, wrote:

In R, using biomaRt: see the vignette for a similar example: https://bioconductor.org/packages/release/bioc/vignettes/biomaRt/inst/doc/biomaRt.html#retrieve-all-5-utr-sequences-of-all-genes-that-are-located-on-chromosome-3-between-the-positions-185514033-and-185535839


"I would like to automate the process such that I can target the proximal promoter of my favorite gene in all the species in Ensembl"

That example gets only part of the way to what the OP is asking for.

TEF: BioMart works on one species at a time (AFAIK). You will need the coordinates up front to loop through more than one species.

(comment written 9 days ago by genomax)