Question: What is the UniProt RDF entity for "UniProt accession"?
4 weeks ago by
mk180 wrote:

All proteins on UniProt have a unique accession number. For example, "O15169" is the accession for human Axin 1.

Other RDF stores that refer to proteins on UniProt use this accession (e.g., the Pathway Commons reference).

This document describes the RDF schema for UniProt.

Where is the UniProt accession in this RDF schema?

4 weeks ago by
me720 wrote:

In the UniProt RDF model, the accession is only in the IRI of the form${ACCESSION}.

To go from an accession string in Pathway Commons to an IRI, one can use a SPARQL snippet like:

VALUES ?acc { "P05067" }
BIND(IRI(CONCAT("", ?acc)) AS ?entry)
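If you need the same mapping outside of SPARQL, here is a minimal Python sketch of the idea, including the reverse direction (IRI back to accession string). The namespace `ENTRY_NS` below is a hypothetical placeholder, not the real UniProt namespace, and the helper names are made up for illustration. Substitute the entry namespace used by the UniProt RDF you are querying.

```python
# Hypothetical placeholder namespace (an assumption for illustration only);
# replace it with the entry namespace of the RDF data you are working with.
ENTRY_NS = "http://example.org/uniprot/"


def accession_to_iri(accession: str, ns: str = ENTRY_NS) -> str:
    """Build an entry IRI by appending the accession to the namespace."""
    return ns + accession.strip()


def iri_to_accession(iri: str, ns: str = ENTRY_NS) -> str:
    """Recover the accession by stripping the namespace prefix from the IRI."""
    if not iri.startswith(ns):
        raise ValueError(f"{iri!r} is not in namespace {ns!r}")
    return iri[len(ns):]


entry = accession_to_iri("P05067")
print(entry)
print(iri_to_accession(entry))
```

The reverse helper refuses IRIs from other namespaces, so an identifier from an unrelated resource is not silently treated as a protein accession.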

There are two reasons that we don't have the primary accession as a string in our RDF or SPARQL endpoint.

  1. Avoiding false joins: a UniProt accession string might also be used elsewhere to identify something completely different. Without the IRI part, such false joins can lead to wrong results.
  2. Adding a string for each identifier would add hundreds of millions of extra triples and strings to the database, which would negatively impact performance and storage.
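To make the false-join point concrete, here is a small Python sketch with made-up data. Both namespaces and both records are hypothetical. It only shows that joining on the bare string matches unrelated things, while joining on the full IRI does not.

```python
# Hypothetical records: the same bare string is used by two unrelated resources.
proteins_by_string = {"P05067": "protein entry"}
other_by_string = {"P05067": "unrelated catalogue item"}

# Hypothetical placeholder namespaces (assumptions for illustration only).
PROTEIN_NS = "http://example.org/uniprot/"
OTHER_NS = "http://example.org/catalogue/"

# Qualify each identifier with the namespace of the resource it belongs to.
proteins_by_iri = {PROTEIN_NS + k: v for k, v in proteins_by_string.items()}
other_by_iri = {OTHER_NS + k: v for k, v in other_by_string.items()}

# Join on the bare string: the two unrelated records match (a false join).
string_join = proteins_by_string.keys() & other_by_string.keys()

# Join on the IRI: the namespaces differ, so nothing matches.
iri_join = proteins_by_iri.keys() & other_by_iri.keys()

print(sorted(string_join))
print(sorted(iri_join))
```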

thanks for the thorough answer

written 4 weeks ago by mk180