VectorBase gene ID to NCBI ID
4.8 years ago
BioPerson • 0

I've got a large number of gene IDs from VectorBase (e.g. AAEL006343-PA and AAEL001710-PA). Some of these IDs have records in the NCBI Protein database.

I'm trying to use Biopython to get the gene description and other information from NCBI with the following code (for simplicity I've used a single ID, but I would normally pass a list):

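from Bio import Entrez

Entrez.email = "emailhere"  # replace with your own address

handle = Entrez.efetch(db="protein", id="AAEL006343-PA", rettype="gb", retmode="text")
records = handle.read()  # retmode="text" returns plain GenBank text; Entrez.read() only parses XML
handle.close()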

efetch fails with an HTTP error (Bad Request). I know the data exists, because with the ID 108877864 I get the result I want. However, 108877864 is NCBI's own ID for this protein. The only way I have found to convert AAEL006343-PA to 108877864 is via esearch, but I don't want to spam NCBI with hundreds of esearch queries.

Is there a way to do this ID conversion as a batch and without esearch?

software error • 1.5k views

You would not be spamming NCBI as long as you sign up for an NCBI API key (NCBI_API_KEY) and build an appropriate delay into your queries.
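A rough sketch of what that could look like with Biopython, one esearch per VectorBase ID with a pause between calls. The function names `min_delay_seconds` and `lookup_uids` are made up for illustration, and the delays assume NCBI's published limits of about 3 requests per second without a key and 10 per second with one. Recent Biopython versions also throttle requests themselves, so the explicit sleep is only a conservative extra.

```python
import time


def min_delay_seconds(has_api_key):
    """Pause between requests: ~10 requests/s with an API key, ~3/s without."""
    return 0.1 if has_api_key else 0.34


def lookup_uids(accessions, email, api_key=None):
    """Map each accession to the list of NCBI protein UIDs that esearch returns.

    Hypothetical helper, not part of Biopython. Biopython is imported here so
    that min_delay_seconds() can be used without it being installed.
    """
    from Bio import Entrez

    Entrez.email = email
    if api_key:
        Entrez.api_key = api_key

    delay = min_delay_seconds(bool(api_key))
    uids = {}
    for accession in accessions:
        handle = Entrez.esearch(db="protein", term=accession)
        result = Entrez.read(handle)
        handle.close()
        # IdList may be empty if NCBI has no record for this accession.
        uids[accession] = list(result["IdList"])
        time.sleep(delay)
    return uids
```

Calling `lookup_uids(["AAEL006343-PA", "AAEL001710-PA"], email="you@example.org")` would issue two searches; the resulting UIDs can then be passed to efetch as a comma-separated list in a single request.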


I can do that and loop over 1000+ search calls, but surely there is a better and also quicker way to do this?


Perhaps you could download one of the annotation files from VectorBase and grep the information you need from it?
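If such a file does carry descriptions, pulling them out needs only a few lines of Python rather than grep. The sketch below assumes a GFF3 file whose ninth column holds `key=value` pairs separated by semicolons, and that the text sits in an attribute called `description`; that attribute name is an assumption and may not match what VectorBase actually ships.

```python
from urllib.parse import unquote


def parse_gff3_attributes(field):
    """Turn a GFF3 attribute column ('ID=x;Name=y') into a dict."""
    attributes = {}
    for pair in field.strip().split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            # GFF3 percent-encodes reserved characters in values.
            attributes[key.strip()] = unquote(value.strip())
    return attributes


def descriptions_from_gff3(lines, wanted_ids, attribute="description"):
    """Return {feature ID: attribute value} for the requested feature IDs.

    'attribute' defaults to a hypothetical 'description' key; features that
    lack it map to an empty string.
    """
    wanted = set(wanted_ids)
    found = {}
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comments, directives, and blank lines
        columns = line.rstrip("\n").split("\t")
        if len(columns) != 9:
            continue  # not a well-formed GFF3 feature line
        attributes = parse_gff3_attributes(columns[8])
        feature_id = attributes.get("ID")
        if feature_id in wanted:
            found[feature_id] = attributes.get(attribute, "")
    return found
```

An open file object can be passed directly as `lines`, so the whole file is never loaded into memory at once.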


The IDs I'm using come from the VectorBase basefeatures GFF. I've had another look at their website, but I can't find a file that provides an actual description of each gene, apart from GO terms for some features.


You can combine esearch and efetch through the "history server"; see the documentation for esearch and efetch. The examples there are in Perl and use WebEnv=<webenv string>&usehistory=y, so the Python equivalent should be similar. As I remember it, esearch returns the WebEnv string, which you then pass to efetch.
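A possible Biopython version of that idea, as a sketch only: one esearch with `usehistory="y"` stores the hits on NCBI's history server, and efetch then pulls the records in batches using the returned `WebEnv` and `QueryKey`. The helper names `build_or_term`, `batch_windows`, and `fetch_genbank_via_history` are invented for illustration, and joining the accessions with `OR` is an assumption about how to search for many IDs at once. A very long term may need to be split across several searches, and this approach does not record which accession produced which record.

```python
def build_or_term(accessions):
    """Join accessions into a single Entrez query term."""
    return " OR ".join(accessions)


def batch_windows(total, size):
    """Yield (retstart, retmax) pairs that cover 'total' records."""
    for start in range(0, total, size):
        yield start, min(size, total - start)


def fetch_genbank_via_history(accessions, email, batch_size=200):
    """Search once, then download all matching GenBank records in batches.

    Hypothetical helper. Biopython is imported here so the two pure
    functions above can be used without it being installed.
    """
    from Bio import Entrez

    Entrez.email = email

    # usehistory="y" asks NCBI to keep the result set on its history server.
    search = Entrez.esearch(db="protein", term=build_or_term(accessions),
                            usehistory="y")
    result = Entrez.read(search)
    search.close()

    count = int(result["Count"])
    chunks = []
    for start, size in batch_windows(count, batch_size):
        handle = Entrez.efetch(db="protein", rettype="gb", retmode="text",
                               retstart=start, retmax=size,
                               webenv=result["WebEnv"],
                               query_key=result["QueryKey"])
        chunks.append(handle.read())
        handle.close()
    return "".join(chunks)
```

The returned text is a concatenation of GenBank records, which `Bio.SeqIO.parse(io.StringIO(text), "genbank")` should be able to read one record at a time.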

Edit: Genomax's solution seems much simpler for a one-off execution.
