UniProtKB - mapping gene name to ID (*_HUMAN) using Python 2
3.8 years ago
jg • 0

Hello!

I have a large list of kinase (gene) names extracted from the UniProtKB modified residue section. (e.g. MAPK1, CDK1, SRC, ATM etc.)

I am trying to convert these names to their entry names (ID) to get:

MAPK1: MK01_HUMAN

CDK1: CDK1_HUMAN

SRC: SRC_HUMAN

etc...

The problem is that for every gene name I get many IDs, and I only want the official one (see below), which always seems to be the only reviewed one.

I tried adding 'columns': 'reviewed' and 'organism': 'human' in the params below but it has no effect. I am basically lost!

An example using MAPK1 kinase:

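import urllib, urllib2

# Assumption: the url line was missing from the post; this is the legacy ID-mapping endpoint.
url = 'https://www.uniprot.org/uploadlists/'

params = {
    'from': 'GENENAME',
    'to': 'ID',
    'format': 'tab',
    'query': 'MAPK1',
    'columns': 'reviewed'
}

data = urllib.urlencode(params)
request = urllib2.Request(url, data)
contact = "xxxx@outlook.com"
# Identify the caller to the server via the User-Agent header.
request.add_header('User-Agent', 'Python %s' % contact)
response = urllib2.urlopen(request)
entries = response.read()

# Keep only the human entry names from the tab-separated response.
id_list = []
for element in entries.split("\n"):
    if element == "":
        continue
    fields = element.split("\t")
    if "_HUMAN" in fields[1]:
        id_list.append(fields[1])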


The final id_list is: ['MK01_HUMAN', 'Q1HBJ4_HUMAN', 'Q499G7_HUMAN']

I am only interested in extracting the 'main' identifier; MK01_HUMAN. Please can anyone help?

3.8 years ago
vkkodali_ncbi ★ 3.4k

Try changing your url to https://www.uniprot.org/uniprot/ and params to:

{
    'query': 'gene_exact:mapk1 AND organism:homo_sapiens AND reviewed:yes',
    'format': 'tab',
    'columns': 'id,entry_name,genes'
}
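
For a whole list of gene names, these parameters could be wrapped roughly as follows. This is only a sketch, written in Python 3 (`urllib.parse` and `urllib.request` stand in for the `urllib`/`urllib2` calls in the question). `build_params`, `parse_entry_names` and `fetch_entry_names` are made-up helper names, the sample row is hand-written rather than captured output, and I am assuming that the first row of the tab output is a header and that the URL still answers in this form.

```python
import urllib.parse
import urllib.request

# URL taken from the answer; it may no longer be served in this form.
UNIPROT_URL = "https://www.uniprot.org/uniprot/"


def build_params(gene):
    """Return the suggested query parameters for one gene name."""
    return {
        "query": "gene_exact:%s AND organism:homo_sapiens AND reviewed:yes" % gene,
        "format": "tab",
        "columns": "id,entry_name,genes",
    }


def parse_entry_names(text):
    """Return the entry_name column (second field) of each data row."""
    rows = [line.split("\t") for line in text.splitlines() if line.strip()]
    # Skip the assumed header row and any row too short to hold an entry name.
    return [row[1] for row in rows[1:] if len(row) > 1]


def fetch_entry_names(gene):
    """Query the service for one gene name; needs network access."""
    url = UNIPROT_URL + "?" + urllib.parse.urlencode(build_params(gene))
    with urllib.request.urlopen(url) as response:
        return parse_entry_names(response.read().decode("utf-8"))


# Hand-written sample in the assumed tab layout, not captured output.
sample = "Entry\tEntry name\tGene names\nP28482\tMK01_HUMAN\tMAPK1\n"
print(parse_entry_names(sample))
```

Because the query already restricts results to reviewed human entries, the `"_HUMAN" in element[1]` check from the question should no longer be needed. Looping `fetch_entry_names` over the kinase list would then give one lookup per gene.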


Thank you so much - it worked. Now I can sleep peacefully after hours of staring at this :)


@vkkodali I get an error like this:

    url = "https://www.uniprot.org/uniprot/"
    ^
IndentationError: unexpected indent


Python seems to be complaining about indentation. If you have copy/pasted the code from above, you should make sure the indentation is correct. I don't think you need to indent the entire code block after the import statement.
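
If it helps, here is a small sketch (Python 3) that reproduces the error on a pasted snippet and shows that it goes away once the stray leading spaces are removed. The `compiles` helper and the `pasted` string are made up for illustration, and stripping every line like this is only safe for flat code with no nested blocks.

```python
# A pasted snippet where the second line picked up stray leading spaces.
pasted = 'import urllib\n    url = "https://www.uniprot.org/uniprot/"\n'


def compiles(source):
    """Return True if Python can compile the source, False on bad indentation."""
    try:
        compile(source, "<pasted>", "exec")
        return True
    except IndentationError:
        return False


# Remove leading whitespace from every line (flat, top-level code only).
fixed = "\n".join(line.lstrip() for line in pasted.splitlines()) + "\n"

print(compiles(pasted))  # False: unexpected indent
print(compiles(fixed))   # True
```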


@vkkodali I just corrected the indentation. Now it runs, but it does not give any output.


Without more detailed information from you, I cannot be of much help. What have you tried? How did you run the code? Did you run it as a script, or in the Python interpreter? Did you use the example posted here or your own example? I am not sure what you mean by 'does not give any output'. What were you expecting? The code, as written in the first post, does not output anything. It populates a list called id_list. Did you see anything in that list?
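
To see what ended up in the list, something along these lines could go at the end of the script. It is a sketch in Python 3: `format_ids` is a made-up helper, and the `id_list` here is a stand-in holding the values reported in the first post, not the result of a fresh query.

```python
# Stand-in for the list built by the script in the first post.
id_list = ['MK01_HUMAN', 'Q1HBJ4_HUMAN', 'Q499G7_HUMAN']


def format_ids(ids):
    """Return one entry name per line, or a notice when the list is empty."""
    if not ids:
        return "id_list is empty: no entry names were collected"
    return "\n".join(ids)


print(format_ids(id_list))
```

An empty list would point to the request or the filtering step rather than to the printing.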