Can you use Python to download flanking sequences of genes from Ensembl's older releases/assemblies?
6.6 years ago

I have some Ensembl IDs from an older assembly of the chicken genome, and I'm trying to get the promoter sequences for them. Is there a way to pull the sequences from Ensembl in Python, with BioMart or something similar?

python ensembl genome sequence • 2.5k views
6.6 years ago

You could use Ensembl's BioMart if the version you need is still available online. Otherwise, you'll need to download the data from the FTP site. And if the "something" covers Perl, I suggest using Ensembl's Perl API.


I just got it sort of working with Ensembl's REST API, but I have a couple thousand genes, and it has a maximum of 50 IDs per POST request :(
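For what it's worth, the 50-ID limit can be worked around by splitting the ID list into batches and sending one POST per batch. Below is a minimal sketch using only the standard library. The endpoint URL, the `type` and `expand_5prime` query parameters, and the shape of the response (a JSON list with one entry per ID) are assumptions based on my reading of the REST documentation, so check them against the docs before relying on this. The helper names (`batches`, `fetch_batch`, `fetch_all`) are made up for illustration. Also note that, as far as I know, the public REST server answers for the current release, so this may not return coordinates from an older assembly.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint for batch sequence lookups; verify against the REST docs.
REST_URL = "https://rest.ensembl.org/sequence/id"
MAX_IDS_PER_POST = 50  # the per-request limit mentioned above


def batches(ids, size=MAX_IDS_PER_POST):
    """Yield successive lists of at most `size` IDs."""
    for i in range(0, len(ids), size):
        yield ids[i:i + size]


def fetch_batch(ids, upstream=2000):
    """POST one batch of IDs and return the decoded JSON response.

    `type` and `expand_5prime` are assumed parameter names for requesting
    genomic sequence extended upstream of each feature.
    """
    query = urllib.parse.urlencode({"type": "genomic", "expand_5prime": upstream})
    request = urllib.request.Request(
        f"{REST_URL}?{query}",
        data=json.dumps({"ids": ids}).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def fetch_all(ids, fetch=fetch_batch):
    """Fetch every ID, one batch at a time, and concatenate the results."""
    results = []
    for batch in batches(list(ids)):
        results.extend(fetch(batch))  # assumes each response is a list
    return results
```

Calling `fetch_all(my_ids)` with a couple thousand IDs would send a few dozen requests. Passing a different `fetch` function makes it easy to try the batching logic without touching the network.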


Which is why I almost never recommend a REST API. Use the Perl API. If you're going to work with Ensembl a lot, the time invested in learning it is well spent.

6.6 years ago

You should probably take a look at pyensembl. This will allow you to look up the genes for your IDs (along with their chromosome, start/end coordinates, and strand). Then you can download the chicken genome for the matching assembly as a FASTA file and use pyfaidx to pull the sequence around the start site of each gene from that FASTA file, for whatever you want to define as the promoter (2 kb upstream of the TSS or whatever). It sounds complicated, but it's actually pretty straightforward to implement, and likely quicker than trying to query Ensembl's API directly.
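Roughly, the approach described above could look like the sketch below. It assumes that pyensembl knows the chicken annotation for the release you need and that it has already been installed locally (for example with `pyensembl install --release <N> --species <name>`), and that the contig names in the FASTA file match those in the annotation. The function names, the `gallus_gallus` species string, and the 2 kb default are illustrative choices, not anything prescribed by either library. The gene start is used as a stand-in for the TSS, which is only an approximation when a gene has several transcripts.

```python
def promoter_interval(start, end, strand, upstream=2000, chrom_length=None):
    """Return 1-based inclusive (lo, hi) coordinates of the region upstream of a gene.

    On the '+' strand the region ends just before the gene start; on the '-'
    strand it begins just after the gene end. Coordinates are clamped to the
    chromosome, so the result may be shorter than `upstream` (or empty, lo > hi).
    """
    if strand == "+":
        lo, hi = start - upstream, start - 1
    elif strand == "-":
        lo, hi = end + 1, end + upstream
    else:
        raise ValueError(f"unexpected strand: {strand!r}")
    lo = max(lo, 1)
    if chrom_length is not None:
        hi = min(hi, chrom_length)
    return lo, hi


def promoter_sequences(gene_ids, fasta_path, release, species="gallus_gallus",
                       upstream=2000):
    """Map each gene ID to its upstream sequence, read 5'->3' on the gene's strand."""
    # Third-party imports are kept local so the helper above works without them.
    from pyensembl import EnsemblRelease
    from pyfaidx import Fasta

    annotation = EnsemblRelease(release=release, species=species)
    genome = Fasta(fasta_path)
    sequences = {}
    for gene_id in gene_ids:
        gene = annotation.gene_by_id(gene_id)
        lo, hi = promoter_interval(
            gene.start, gene.end, gene.strand, upstream,
            chrom_length=len(genome[gene.contig]),
        )
        if lo > hi:
            continue  # gene sits at the very edge of the contig
        # get_seq takes 1-based inclusive coordinates; rc=True gives the
        # reverse complement for minus-strand genes.
        record = genome.get_seq(gene.contig, lo, hi, rc=(gene.strand == "-"))
        sequences[gene_id] = record.seq
    return sequences
```

For a plus-strand gene spanning 5000 to 9000, `promoter_interval(5000, 9000, "+")` gives `(3000, 4999)`, i.e. the 2000 bases immediately upstream of the gene start.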
