Question: UniProtKB - mapping gene name to ID (*_HUMAN ) using python2
9 weeks ago
jg0 wrote:


I have a large list of kinase (gene) names extracted from the UniProtKB modified residue section. (e.g. MAPK1, CDK1, SRC, ATM etc.)

I am trying to convert these names to their entry names (ID) to get:





The problem is that for every gene name I get many IDs, and I only want the official one (see below), which always seems to be the only reviewed one.

I tried adding 'columns': 'reviewed' and 'organism': 'human' to the params below, but it had no effect. I am basically lost!

An example using MAPK1 kinase:

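    import urllib, urllib2

    url = ''

    params = {
        'columns': 'reviewed',
    }

    data = urllib.urlencode(params)
    request = urllib2.Request(url, data)
    contact = ""
    request.add_header('User-Agent', 'Python %s' % contact)
    response = urllib2.urlopen(request)
    header = response.readline()  # skip the column-header line

    # Assumption: each remaining line is tab-separated; split it into fields.
    new_entries = [line.rstrip('\n').split('\t') for line in response]

    id_list = []
    for element in new_entries:
        # Keep only rows whose second field is a *_HUMAN entry name.
        if len(element) > 1 and "_HUMAN" in element[1]:
            id_list.append(element[1])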

The final id_list is: ['MK01_HUMAN', 'Q1HBJ4_HUMAN', 'Q499G7_HUMAN']

I am only interested in extracting the 'main' identifier; MK01_HUMAN. Please can anyone help?

9 weeks ago
United States
vkkodali960 wrote:

Try changing your url to and params to:

    params = {
        'query': 'gene_exact:mapk1 AND organism:homo_sapiens AND reviewed:yes',
        'format': 'tab',
        'columns': 'id,entry_name,genes'
    }
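
For a whole list of gene names, the same parameters can be built per gene and the tab-separated reply reduced to its entry names. The following is only a rough sketch: `build_params` and `parse_entry_names` are hypothetical helper names, the sample text is hand-written to imitate the assumed column layout (`id`, `entry_name`, `genes`), and no request is actually sent.

```python
def build_params(gene_name):
    """Build the query parameters suggested above for one gene name."""
    query = ('gene_exact:%s AND organism:homo_sapiens AND reviewed:yes'
             % gene_name.lower())  # lower-cased only to mirror the example above
    return {
        'query': query,
        'format': 'tab',
        'columns': 'id,entry_name,genes',
    }


def parse_entry_names(tab_text):
    """Return the second column (entry name) of every data row ending in _HUMAN."""
    entry_names = []
    for line in tab_text.splitlines()[1:]:  # first line is assumed to be a header
        fields = line.split('\t')
        if len(fields) > 1 and fields[1].endswith('_HUMAN'):
            entry_names.append(fields[1])
    return entry_names


# Hand-written sample in the assumed layout, not a captured server response.
sample = ("Entry\tEntry name\tGene names\n"
          "P28482\tMK01_HUMAN\tMAPK1 ERK2 PRKM1 PRKM2\n")

print(build_params('MAPK1')['query'])
print(parse_entry_names(sample))
```

In the original script, the dictionary returned by `build_params` would take the place of `params`, and the text read from `response` would be passed to `parse_entry_names`.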

Thank you so much - it worked. Now I can sleep peacefully after hours of staring at this :)


@vkkodali I get an error like:

        url = ""
        ^
    IndentationError: unexpected indent


Python seems to be complaining about indentation. If you have copy/pasted the code from above, you should make sure the indentation is correct. I don't think you need to indent the entire code block after the import statement.
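
As a small illustration of this kind of error, the sketch below compiles two tiny snippets without running them: one where the line after the import is indented, as can happen when copying from an indented code block, and one where every statement starts in the first column. The helper name `indentation_problem` and both snippets are made up for this example.

```python
def indentation_problem(source):
    """Return the compiler's message if `source` is badly indented, else None."""
    try:
        compile(source, "<pasted>", "exec")  # compile only, nothing is executed
    except IndentationError as err:
        return err.msg
    return None


# The second statement is indented although it does not belong to any block.
pasted = "import urllib\n    url = ''\n"

# The same statements, all starting in the first column.
flat = "import urllib\nurl = ''\n"

print(indentation_problem(pasted))
print(indentation_problem(flat))
```

Only top-level statements need to start in the first column; the bodies of `for` and `if` statements still have to stay indented.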


@vkkodali I just corrected the indentation; now it runs, but it does not give any output.


Without more detailed information from you, I cannot be of much help. What have you tried? How did you run the code? Did you run it as a script? Or did you run it at the python interpreter? Did you use the example posted here or did you use your own example? I am not sure what you mean by 'does not give any output'. What were you expecting? The code, as written in the first post, does not output anything. It populates a list called id_list. Did you see anything in that list?
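
Since the script only fills `id_list`, something has to print the list before anything appears when it is run as a script. A minimal sketch, assuming `id_list` has already been populated; the helper name `report` and the example value are only for illustration.

```python
def report(id_list):
    """Return a printable summary of the collected entry names."""
    if not id_list:
        return "id_list is empty (no matching entries were collected)"
    return "\n".join(id_list)


id_list = ['MK01_HUMAN']  # example value; in the script this comes from the loop
print(report(id_list))
```

If the empty-list message is printed, the problem lies in the request or in the filtering rather than in the missing output.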

