Question: Retrieving transmembrane annotations from the UniProt SPARQL endpoint
Asked 7 days ago by Andrew Su (San Diego, CA):

From the UniProt page for Traf3ip3, I can see that this protein is a single-pass type IV membrane protein, with a note that this annotation was "manual assertion inferred by curator". (The API also reports the specific evidence code ECO:0000305.) Can I get this information from the UniProt SPARQL endpoint? I don't see it in the UniProt RDF diagram, but I hear that this information is available.

Tags: sparql, uniprot
Answered 7 days ago by me (Switzerland):

First, find the IRI for "single-pass type IV membrane protein" in the subcellular location vocabulary used in UniProt. In this case it is "http://purl.uniprot.org/locations/9908".

Because subcellular location annotations are structured, we need to find exactly where the topology is attached:

?subcellularLocationAnnotation up:locatedIn  ?locatedIn .
?locatedIn                     up:topology   location:9908 .

Then you need to find the evidence for it. In UniProt RDF we have decided to attach evidence using RDF reification and attributions:

[]            rdf:subject     ?locatedIn ;
              rdf:predicate   up:topology ;
              rdf:object      location:9908 ;
              up:attribution  ?attribution .
?attribution  up:evidence     <http://purl.obolibrary.org/obo/ECO_0000305> .

Putting this together, we end up with the full query:

PREFIX up: <http://purl.uniprot.org/core/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX location: <http://purl.uniprot.org/locations/>
SELECT *
WHERE {
    ?entry     up:annotation/up:locatedIn  ?locatedIn .
    ?locatedIn up:topology    location:9908 .
    []         rdf:subject    ?locatedIn ;
               rdf:predicate  up:topology ;
               rdf:object     location:9908 ;
               up:attribution/up:evidence  <http://purl.obolibrary.org/obo/ECO_0000305> .
}
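
To see which evidence code backs each annotation, rather than filtering on ECO:0000305, the same pattern should work with the evidence left as a variable. This is only a sketch that assumes the graph layout used in the full query above; the ?evidence variable name and the LIMIT are additions for illustration.

```sparql
PREFIX up: <http://purl.uniprot.org/core/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX location: <http://purl.uniprot.org/locations/>
SELECT ?entry ?evidence
WHERE {
    ?entry     up:annotation/up:locatedIn  ?locatedIn .
    ?locatedIn up:topology    location:9908 .
    # Reified statement describing the topology triple above.
    []         rdf:subject    ?locatedIn ;
               rdf:predicate  up:topology ;
               rdf:object     location:9908 ;
               # Evidence is left unbound so every evidence code is returned.
               up:attribution/up:evidence  ?evidence .
}
LIMIT 10
```

The LIMIT only keeps a first exploratory run small and can be dropped once the results look right.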

In this case location:9908 has no more specific child terms, but for terms that do it is useful to expand the query to include them, e.g. if you were looking for "Single-pass membrane protein":

{
    BIND(location:9904 AS ?location)
} UNION {
    ?location rdfs:subClassOf location:9904 .
}
?locatedIn up:topology ?location .
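
For reference, here is one way the fragment above might slot into the full query. It is a sketch rather than a tested query: it assumes the reified statement must point at the same ?location that the topology triple matches, and it declares the standard rdfs prefix, which the earlier query did not need.

```sparql
PREFIX up: <http://purl.uniprot.org/core/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX location: <http://purl.uniprot.org/locations/>
SELECT ?entry ?location
WHERE {
    # Match the parent term itself or any of its direct child terms.
    {
        BIND(location:9904 AS ?location)
    } UNION {
        ?location rdfs:subClassOf location:9904 .
    }
    ?entry     up:annotation/up:locatedIn  ?locatedIn .
    ?locatedIn up:topology    ?location .
    # Reified statement carrying the evidence for that topology triple.
    []         rdf:subject    ?locatedIn ;
               rdf:predicate  up:topology ;
               rdf:object     ?location ;
               up:attribution/up:evidence  <http://purl.obolibrary.org/obo/ECO_0000305> .
}
```

The UNION only reaches direct children. If the vocabulary nests deeper, the SPARQL 1.1 property path rdfs:subClassOf* (zero or more steps, so it also matches the parent term itself) could replace the whole UNION block, though I have not checked how it performs on the endpoint.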