Question: UniProt references to SwissProt references?
mafernandez (Madrid, Spain), 2.4 years ago, wrote:

Hello there

I have some identifiers from a search against the Swiss-Prot database that have the following structure (example 1): SYDND_PSEFS

And I want them to be in the UniProt format, just like the following (example 2):

I4L7P1_9PSED

But I cannot do this with the Retrieve/ID mapping tool, since I do not know the name assigned to the Swiss-Prot database. If going the other way (from UniProt to Swiss-Prot) is possible, I am also very interested in how to do it.

Thanks a lot

mobiusklein (United States), 2.4 years ago, wrote:

UniProt's HTTP API is very accommodating about translating identifiers.

A request to http://www.uniprot.org/uniprot/SYDND_PSEFS is redirected to http://www.uniprot.org/uniprot/C3JYT1. If you're comfortable with Python, you could use the following approach:

from lxml import etree

uri_template = "http://www.uniprot.org/uniprot/{0}.xml"
nsmap = {"up": "http://uniprot.org/uniprot"}

your_ids = []  # load your id list here, e.g. ["SYDND_PSEFS"]
translated = []

for swiss_id in your_ids:
    # Fetch and parse the XML record for this identifier
    tree = etree.parse(uri_template.format(swiss_id))
    # Collect every full name listed under the protein element
    names = [el.text for el in tree.findall(
        ".//up:protein/*/up:fullName", nsmap)]
    recommended_name_tag = tree.find(
        ".//up:protein/*/up:recommendedName", nsmap)
    if recommended_name_tag is not None:
        # .text is None when the element has no leading text, so guard it
        own_text = (recommended_name_tag.text or "").strip()
        if own_text:
            recommended_name = own_text
        else:
            # Otherwise join the text of the child elements
            recommended_name = ' '.join(
                c.text or "" for c in recommended_name_tag).strip()
    else:
        # Fall back to the first full name, if any was found
        try:
            recommended_name = names[0]
        except IndexError:
            recommended_name = ""
    # up:entry/up:name holds the entry's name field
    gene_name_tag = tree.find(".//up:entry/up:name", nsmap)
    if gene_name_tag is not None:
        gene_name = gene_name_tag.text
    else:
        gene_name = ""

    translated.append((names, recommended_name, gene_name))

This collects all the names that UniProt has for each identifier and stores them in the list translated. You can then iterate over your_ids and translated in parallel with zip and decide which identifier to retain.
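
The pairing step with zip might look like the sketch below. It assumes each item in translated is a (names, recommended_name, gene_name) tuple as built in the loop. The helper pair_ids and the sample values are placeholders for illustration, not real UniProt output.

```python
# Placeholder inputs: in practice these come from the loop above.
your_ids = ["SYDND_PSEFS"]
translated = [
    (["placeholder full name"], "placeholder full name", "SYDND_PSEFS"),
]


def pair_ids(ids, records):
    """Map each input id to its (names, recommended_name, gene_name) tuple."""
    # zip stops silently at the shorter input, so check the lengths first
    if len(ids) != len(records):
        raise ValueError("ids and records must have the same length")
    return dict(zip(ids, records))


paired = pair_ids(your_ids, translated)

for swiss_id, (names, recommended_name, gene_name) in paired.items():
    # Print one tab-separated row per identifier
    print(swiss_id, recommended_name or "<no name found>", gene_name, sep="\t")
```

Checking the lengths first matters because a failed request inside the loop would otherwise shift every later pair by one.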

Elisabeth Gasteiger (Geneva), 2.4 years ago, wrote:

First of all, a short note on terminology.

The UniProt Knowledgebase (UniProtKB) consists of two sections: UniProtKB/Swiss-Prot for reviewed entries and UniProtKB/TrEMBL for unreviewed entries (see http://www.uniprot.org/help/uniprotkb_sections, http://www.uniprot.org/help/entry_status).

Since Swiss-Prot is part of UniProtKB, it does not make sense to map from Swiss-Prot to UniProtKB. If an entry is in UniProtKB/Swiss-Prot, it has been reviewed, while a UniProtKB/TrEMBL entry is not reviewed, but in both cases, entries have a UniProtKB identifier (accession number and entry name).

However, if your goal is to map from entry name to accession number, you can indeed use the ID mapping tool at http://www.uniprot.org/uploadlists, map from UniProtKB to UniProtKB, and then download the results in "List" format. Or you can use our REST API to map the identifiers one at a time, with a URL of the form

http://www.uniprot.org/uniprot/?query=SYDND_PSEFS&format=list
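
For a handful of entry names, the one-at-a-time approach could be scripted roughly as follows. This is only a sketch using the Python standard library: the function names build_query_url and fetch_accessions are made up for illustration, and the parsing assumes the "list" format returns plain text with one accession per line.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the URL quoted above
BASE_URL = "http://www.uniprot.org/uniprot/"


def build_query_url(entry_name):
    """Build the list-format query URL for one entry name."""
    return BASE_URL + "?" + urlencode({"query": entry_name, "format": "list"})


def fetch_accessions(entry_name, timeout=30):
    """Request the query URL and return the non-empty lines of the response.

    Assumption: the service answers with plain text, one accession per line.
    """
    with urlopen(build_query_url(entry_name), timeout=timeout) as response:
        text = response.read().decode("utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


if __name__ == "__main__":
    # Only builds and prints the URL; call fetch_accessions to hit the network
    print(build_query_url("SYDND_PSEFS"))
```

For long lists, the ID mapping tool is probably the better choice, since this sketch sends one request per identifier.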
