Question: Get Uniprot entry name from PDB ID and chain (solved)
Asked 3.8 years ago by albert.castella.teruel (Spain)

Hello everyone,

I'm working with a file that contains a large number of PDB identifiers, and for each identifier I have two or more chains. I want to get a list with the UniProt entry name corresponding to each chain.

I'm looking for an automatic way to iterate over the file in Python, so that I can incorporate it into a script I already have.

If someone can help, I would be very grateful.

Albert

python • modified 3.8 years ago • written 3.8 years ago by albert.castella.teruel
Answer by zlira (Ukraine/L'viv), 3.8 years ago:

PDB has an endpoint for this: http://www.rcsb.org/pdb/software/rest.do (see the section on "Third-party annotations and PDB to UniProtKB mapping").

Example mapping for chain A of 4hhb: http://www.rcsb.org/pdb/rest/das/pdb_uniprot_mapping/alignment?query=4hhb.A

You can make a request to that endpoint and parse the UniProt accession ID of a chain out of the XML that is returned. If you need any additional information (such as the protein name on UniProt), you can use the accession ID to look it up.

Here is some sample code:

import requests
from xml.etree.ElementTree import fromstring

pdb_id = '4hhb.A'
pdb_mapping_url = 'http://www.rcsb.org/pdb/rest/das/pdb_uniprot_mapping/alignment'
uniprot_url = 'http://www.uniprot.org/uniprot/{}.xml'

def get_uniprot_accession_id(response_xml):
    # Scan the children of the root's first child for the one whose
    # dbSource is UniProt and return its accession ID.
    root = fromstring(response_xml)
    return next(
        el for el in root[0]
        if el.attrib.get('dbSource') == 'UniProt'
    ).attrib['dbAccessionId']

def get_uniprot_protein_name(uniprot_id):
    # Fetch the UniProt XML entry and read the recommended full name.
    uniprot_response = requests.get(
        uniprot_url.format(uniprot_id)
    ).text
    return fromstring(uniprot_response).find(
        './/{http://uniprot.org/uniprot}recommendedName/{http://uniprot.org/uniprot}fullName'
    ).text

def map_pdb_to_uniprot(pdb_id):
    # Query the mapping endpoint for one 'pdbid.chain' string.
    pdb_mapping_response = requests.get(
        pdb_mapping_url, params={'query': pdb_id}
    ).text
    uniprot_id = get_uniprot_accession_id(pdb_mapping_response)
    uniprot_name = get_uniprot_protein_name(uniprot_id)
    return {
        'pdb_id': pdb_id,
        'uniprot_id': uniprot_id,
        'uniprot_name': uniprot_name
    }

print(map_pdb_to_uniprot(pdb_id))

Result:

{'pdb_id': '4hhb.A', 'uniprot_id': 'P69905', 'uniprot_name': 'Hemoglobin subunit alpha'} 
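
To run the mapping over a whole file, as the question asks, a loop along these lines may work. This is only a sketch: the input layout (one PDB ID per line followed by whitespace-separated chain letters, e.g. `4hhb A B`) and the helper names `read_queries` and `map_all` are assumptions, not something given in this thread. The `mapper` argument stands in for a function such as map_pdb_to_uniprot above.

```python
import io

def read_queries(handle):
    """Yield 'pdbid.chain' strings from lines like '4hhb A B' (assumed layout)."""
    for line in handle:
        fields = line.split()
        if not fields or fields[0].startswith('#'):
            continue  # skip blank lines and comment lines
        pdb_id, chains = fields[0], fields[1:]
        for chain in chains:
            yield '{}.{}'.format(pdb_id, chain)

def map_all(handle, mapper):
    """Apply mapper (e.g. map_pdb_to_uniprot) to every query read from handle."""
    return [mapper(query) for query in read_queries(handle)]

# Demo with an in-memory file and a stand-in mapper, so nothing hits the network.
demo = io.StringIO('4hhb A B\n\n# a comment\n2vlj E\n')
print(map_all(demo, lambda query: {'pdb_id': query}))
```

With a real file, `open('chains.txt')` (a hypothetical file name) would replace the `io.StringIO` object.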
modified 3.8 years ago • written 3.8 years ago by zlira

Wow, that's amazing, but it's not quite what I want. The UniProt entry name I'm looking for has this format: HBA_HUMAN

EDIT: Thanks for the help.

EDIT2: I solved the problem like this:

def get_uniprot_protein_name(uniprot_id):
    # Read the entry name (e.g. HBA_HUMAN) instead of the full protein name.
    uniprot_response = requests.get(
        uniprot_url.format(uniprot_id)
    ).text
    return fromstring(uniprot_response).find(
        './/{http://uniprot.org/uniprot}entry/{http://uniprot.org/uniprot}name'
    ).text
modified 3.8 years ago • written 3.8 years ago by albert.castella.teruel

You can get that name by changing the path passed to find() in get_uniprot_protein_name from:

'.//{http://uniprot.org/uniprot}recommendedName/{http://uniprot.org/uniprot}fullName'

to: 

'.//{http://uniprot.org/uniprot}name'
modified 3.8 years ago • written 3.8 years ago by zlira

Yep, that is more or less what I did. I just want to thank you for your help (in this and other posts).

written 3.8 years ago by albert.castella.teruel

Hi, I'm having a problem with the script right now.

The problem is the following:

Given a PDB code plus a chain as input, for instance 2VLJ.E, the chain appears on the PDB website, but in some cases it is not linked to any UniProt entry name.

I would like to know how to ignore the cases where a chain has no hits, or how to raise a warning without killing the program.

written 3.7 years ago by albert.castella.teruel
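
One possible way to skip unmapped chains without stopping the program is to return None when no UniProt element is found and emit a warning through the standard `warnings` module. The sketch below assumes the mapping response keeps the `dbSource` / `dbAccessionId` attribute layout used in the sample code above; the function names `find_uniprot_accession` and `accession_or_warn` and the inline XML strings are made up for illustration.

```python
import warnings
from xml.etree.ElementTree import fromstring, ParseError

def find_uniprot_accession(response_xml):
    """Return the UniProt accession in a mapping response, or None if absent."""
    root = fromstring(response_xml)
    for el in root.iter():
        if el.attrib.get('dbSource') == 'UniProt':
            return el.attrib.get('dbAccessionId')
    return None

def accession_or_warn(query, response_xml):
    """Warn, rather than raise, when a chain has no usable UniProt mapping."""
    try:
        accession = find_uniprot_accession(response_xml)
    except ParseError as exc:  # empty or malformed XML
        warnings.warn('{}: could not parse response ({})'.format(query, exc))
        return None
    if accession is None:
        warnings.warn('{}: no UniProt mapping found'.format(query))
    return accession

# Illustrative responses (assumed shape, not copied from the real service).
mapped = ('<dasalignment><alignment>'
          '<alignObject dbSource="PDB" dbAccessionId="4hhb.A"/>'
          '<alignObject dbSource="UniProt" dbAccessionId="P69905"/>'
          '</alignment></dasalignment>')
unmapped = '<dasalignment></dasalignment>'

print(accession_or_warn('4hhb.A', mapped))
print(accession_or_warn('2VLJ.E', unmapped))
```

The caller can then test for None and move on to the next chain, so one missing mapping does not end the whole run.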

Hi! Thank you for such nice code for getting UniProt names. Could you please edit the function to get the organism of a given chain?

written 3 months ago by Hydrogen Bond
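
The organism can probably be read from the same UniProt XML with a different path. The sketch below assumes the entry carries an `organism` element with a `name` child of type `scientific`, as in the inline sample; the function name `get_organism` and the sample string are illustrative and not part of the original answer.

```python
from xml.etree.ElementTree import fromstring

NS = '{http://uniprot.org/uniprot}'

def get_organism(uniprot_xml):
    """Return the scientific organism name from a UniProt XML entry, or None."""
    el = fromstring(uniprot_xml).find(
        ".//{0}entry/{0}organism/{0}name[@type='scientific']".format(NS)
    )
    return el.text if el is not None else None

# Trimmed-down sample entry (assumed shape) so the example runs offline.
sample = (
    '<uniprot xmlns="http://uniprot.org/uniprot"><entry>'
    '<name>HBA_HUMAN</name>'
    '<organism><name type="scientific">Homo sapiens</name>'
    '<name type="common">Human</name></organism>'
    '</entry></uniprot>'
)
print(get_organism(sample))  # prints: Homo sapiens
```

In the script above, the text returned by `requests.get(uniprot_url.format(uniprot_id)).text` would take the place of `sample`.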