Question: Mapping Uniprot Proteome identifiers to a taxonomic tree
3.4 years ago
European Union
Patrick20 wrote:

I'm trying to map the presence of three different proteins across the bacterial kingdom. At first I thought of using Pfam for this purpose, but two of the proteins end up in very messy DUFs (Domains of Unknown Function). So I ran three BLAST queries against the Representative Proteomes and applied an e-value threshold of 10^-5.
For each protein, this results in a BLAST table that contains both a UniProt Proteome identifier and an NCBI Taxonomy identifier.
Now I want to map these BLAST results back to either a UniProt taxonomy or the NCBI taxonomy. For the NCBI taxonomy there is a tool available, but does a similar taxonomy tool exist for UniProt as well? Preferably I would supply it with a list of UniProt Proteome identifiers and it would return a tree in, for example, PHYLIP format. This could then be visualized using iTOL. I found a neat translation between UniProt and the NCBI taxonomy, but I was wondering whether a UniProt taxonomy exists as well.
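
For readers who want to reproduce the e-value cutoff described above, here is a minimal sketch. It assumes the BLAST results were saved in the standard 12-column tabular layout (as produced by NCBI BLAST+ with -outfmt 6), where the e-value is the 11th column; the actual export used in this post may be laid out differently. The function name filter_hits and the two sample rows are hypothetical and only serve as illustration.

```python
def filter_hits(lines, max_evalue=1e-5):
    """Keep tabular BLAST rows whose e-value is at or below max_evalue.

    Assumes the 12-column tabular layout, with the e-value in column 11
    (index 10). Adjust the index if your table is laid out differently.
    """
    kept = []
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if float(fields[10]) <= max_evalue:
            kept.append(fields)
    return kept


# Hypothetical rows, made up for illustration only.
sample_rows = [
    "query1\tsubjA\t45.2\t200\t100\t3\t1\t200\t5\t204\t2e-30\t120",
    "query1\tsubjB\t28.0\t90\t60\t2\t10\t99\t15\t104\t0.003\t35.1",
]

significant = filter_hits(sample_rows)
print([hit[1] for hit in significant])
```

With real data, the list of lines would come from the saved BLAST table, for example by passing an open file handle to filter_hits.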



3.4 years ago
Elisabeth Gasteiger wrote:

In reply to your last question:

The taxonomy database that is maintained by the UniProt group is based on the NCBI taxonomy database, which is supplemented with data specific to the UniProt Knowledgebase (UniProtKB). While the NCBI taxonomy is updated daily to be in sync with GenBank/EMBL-Bank/DDBJ, the UniProt taxonomy is updated only at UniProt releases to be in sync with UniProtKB. It may therefore happen that for the time period of a UniProt release, you can find new taxa at the NCBI that are not yet in UniProt (and vice versa for deleted taxa).

For more details see


Regarding a mapping of proteome identifiers to tax_ids, you can query the "Proteomes" section of the UniProt website and then download the results in tab-delimited format, e.g. for non-redundant bacterial proteomes (limited to the first 10 hits):[2]%22%20redundant:no&fil=&limit=10&force=no&preview=true&format=tab&columns=id,organism-id

Proteome ID    Organism ID
UP000000579    71421
UP000000558    83334
UP000000625    83333
UP000002524    243230
UP000000798    224324
UP000012042    1001583
UP000001258    272558
UP000001807    224326
UP000001570    224308
UP000002519    83334
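
Once such a table is downloaded, the lookup from proteome identifiers to tax_ids can be done in a few lines. The following is a minimal sketch, assuming the download is tab-delimited with the two column headers shown above. The rows are copied from the table above; the function name load_proteome_to_taxid and the list blast_proteomes (including the made-up identifier UP000999999) are hypothetical and only illustrate the lookup.

```python
import csv
import io

# Rows copied from the table above. A real download would be read from a file
# instead of an in-memory string.
MAPPING_TSV = """Proteome ID\tOrganism ID
UP000000579\t71421
UP000000558\t83334
UP000000625\t83333
UP000002524\t243230
UP000000798\t224324
UP000012042\t1001583
UP000001258\t272558
UP000001807\t224326
UP000001570\t224308
UP000002519\t83334
"""


def load_proteome_to_taxid(handle):
    """Read a tab-delimited proteome table into a {proteome ID: tax_id} dict."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["Proteome ID"]: int(row["Organism ID"]) for row in reader}


proteome_to_taxid = load_proteome_to_taxid(io.StringIO(MAPPING_TSV))

# Hypothetical proteome IDs taken from BLAST hits; the last one is made up
# to show how identifiers missing from the table are handled.
blast_proteomes = ["UP000000625", "UP000000558", "UP000002519", "UP000999999"]

# Several proteomes can share one tax_id (here 83334), so collect a set.
taxids = sorted({proteome_to_taxid[p] for p in blast_proteomes if p in proteome_to_taxid})
unmapped = [p for p in blast_proteomes if p not in proteome_to_taxid]

print(taxids)    # [83333, 83334]
print(unmapped)  # ['UP000999999']
```

The resulting list of tax_ids can then be passed to whichever NCBI taxonomy tool is used to build the tree.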

