Question: Convert Official Gene Symbols To UniProt IDs In Python?
6.0 years ago, hicsuntdrac0nis wrote:

Is there a Python package, or some other way, to convert official gene symbols to UniProt accession numbers? I know of online converters, but I have written a program that takes a list of UniProt IDs as input, and what I actually start with are official gene symbols that I need to convert first.

6.0 years ago, Elisabeth Gasteiger (Geneva) wrote:

You might find some hints in this UniProt FAQ: http://www.uniprot.org/faq/53 - "Can I convert gene symbols to UniProtKB identifiers? How can I map UniProtKB IDs or ACs to gene symbols?"

See also the FAQ about programmatic access: http://www.uniprot.org/faq/28 - "How can I access resources on this web site programmatically?"

6.0 years ago, pld (United States) wrote:

No, there is no module native to Python (that I know of) that supports this. You can use Python to interface with online tools (such as UniProt's converter), or access Ensembl's MySQL databases from Python. If you can assume that the user will be working with one or more specific species, you can download the mapping data from the associated database. The problem is that without curated data, say a file stating that symbol x is UniProtKB accession y, you have to rely on searching, which can get tricky. I've used UniProt's ID mapper, and it is a soft search tool: you don't always land where you want to.
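To make the "curated file" case concrete, here is a minimal sketch. It assumes a two-column, tab-separated file (gene symbol, then UniProt accession); that layout and the function names `load_symbol_map` and `convert` are made up for illustration, not a UniProt format or API. Matching is case-insensitive here, which is also an assumption and may not suit every species' nomenclature.

```python
import csv
import io
from collections import defaultdict


def load_symbol_map(handle):
    """Read a tab-separated file of (gene symbol, UniProt accession) pairs.

    The two-column layout is an assumption for this sketch. One symbol may
    map to several accessions, so every symbol maps to a list.
    """
    mapping = defaultdict(list)
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 2 or row[0].startswith("#"):
            continue  # skip blank, short, or comment lines
        symbol, accession = row[0].strip().upper(), row[1].strip()
        if accession not in mapping[symbol]:
            mapping[symbol].append(accession)
    return dict(mapping)


def convert(symbols, mapping):
    """Return (found, missing) for the given symbols.

    `found` maps each input symbol to its list of accessions;
    `missing` lists the symbols with no entry in the mapping.
    """
    found, missing = {}, []
    for symbol in symbols:
        hits = mapping.get(symbol.strip().upper())
        if hits:
            found[symbol] = hits
        else:
            missing.append(symbol)
    return found, missing


# Tiny in-memory stand-in for a real curated file.
example = io.StringIO("TP53\tP04637\nBRCA1\tP38398\n")
mapping = load_symbol_map(example)
found, missing = convert(["TP53", "brca1", "NOTAGENE"], mapping)
print(found)
print(missing)
```

Keeping the unmatched symbols in a separate list makes it obvious which ones still need a manual or search-based lookup.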

You could use Biopython to search NCBI's Entrez, but this can be a mess because you then have to filter for species, isoforms, and other things. It is also significantly slower, given the rate limits on how many queries you can send to NCBI. In general, web tools do not like being hit with massive numbers of queries in a short period of time.
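If you do go through a web service, one way to stay within rate limits is to send the symbols in batches and pause between requests. The sketch below only shows the throttling pattern: `query_func` is a hypothetical placeholder for whatever actually calls the service, and the batch size and pause are arbitrary values, not limits published by NCBI or UniProt.

```python
import time


def batched(items, size):
    """Yield consecutive chunks of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def query_in_batches(symbols, query_func, batch_size=100, pause=1.0,
                     sleep=time.sleep):
    """Call `query_func` once per batch, pausing between batches.

    `query_func` takes a list of symbols and returns a dict of results.
    `sleep` is injectable so the pause can be stubbed out in tests.
    """
    results = {}
    for index, chunk in enumerate(batched(symbols, batch_size)):
        if index > 0:
            sleep(pause)  # wait before every request except the first
        results.update(query_func(chunk))
    return results


def fake_query(chunk):
    """Stand-in for a real web request; returns made-up placeholder values."""
    return {symbol: "ACC_" + symbol for symbol in chunk}


hits = query_in_batches(["TP53", "BRCA1", "EGFR"], fake_query,
                        batch_size=2, pause=0.1)
print(hits)
```

Swapping `fake_query` for a function that really queries the service leaves the batching logic unchanged.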

I suggest sticking with the list of UniProt IDs unless you have some form of backend to store the accession-number-to-symbol links.
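As a rough idea of what such a backend could look like, here is a sketch using the standard-library `sqlite3` module. The table name, its columns, and the helper functions are all invented for this example; storing the NCBI taxonomy ID next to each symbol is one possible way to keep species apart.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a small symbol-to-accession store."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS symbol_map ("
        "symbol TEXT NOT NULL, "
        "taxon_id INTEGER NOT NULL, "
        "accession TEXT NOT NULL, "
        "PRIMARY KEY (symbol, taxon_id, accession))"
    )
    return conn


def add_links(conn, rows):
    """Insert (symbol, taxon_id, accession) rows, ignoring duplicates."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR IGNORE INTO symbol_map VALUES (?, ?, ?)", rows
        )


def lookup(conn, symbol, taxon_id):
    """Return all accessions stored for a symbol in one species."""
    cursor = conn.execute(
        "SELECT accession FROM symbol_map "
        "WHERE symbol = ? AND taxon_id = ? ORDER BY accession",
        (symbol, taxon_id),
    )
    return [row[0] for row in cursor]


conn = open_store()
# 9606 is the NCBI taxonomy ID for human.
add_links(conn, [("TP53", 9606, "P04637"), ("BRCA1", 9606, "P38398")])
print(lookup(conn, "TP53", 9606))
```

Passing a file path instead of `":memory:"` keeps the links between runs, so each symbol only has to be resolved once.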
