Evidence of protein topology via UniProt SPARQL
7.3 years ago
chrs.bock ▴ 20

Hi there!

I am trying to get data through UniProt's SPARQL endpoint, and I am wondering whether I can access the evidence for a specific region (e.g. "By similarity" in the Description column of the image).

I tried using ?annotation a up:Transmembrane_Annotation . but I only get the comment and a range. If there is no way to do this, would it be possible to do something similar with their REST API?

Thanks, Christian

Screenshot

UniProt SPARQL Transmembrane Proteins Data Science • 2.0k views

These urls may be useful:

https://www.google.ru/#newwindow=1&q=uniprot+sparql+query+and+topology

See this search and others like it; many of the results discuss topology.

SPARQL playground

http://sparql-playground.nextprot.org/

(see http://sparql-playground.nextprot.org/help/doc/about to use it)

and this one

neXtProt simple search system (use tags)

See https://snorql.nextprot.org/help/doc/introduction

These are Python applications:

bioservices documentation: https://pythonhosted.org/bioservices/_modules/bioservices/uniprot.html

The same module's source on GitHub: https://github.com/cokelaer/bioservices/blob/master/src/bioservices/uniprot.py


neXtProt is not UniProt, and it covers only human proteins. The UniProt endpoint is at http://sparql.uniprot.org

7.3 years ago
me ▴ 750

In UniProt's RDF model, evidence is (for better or worse) attached to annotations through reification, so the query needs a pattern like this:

  [] a rdf:Statement ;
     rdf:subject ?protein ; 
     rdf:predicate up:annotation ;
     rdf:object ?annotation ;
     up:attribution ?attribution .

The ?attribution node has a up:evidence triple and, where available, a up:source triple. up:evidence gives the ECO (Evidence and Conclusion Ontology) code, while up:source links to a PubMed citation or another database record.
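To relate the ECO codes back to the labels shown on the website (such as "By similarity"), a small lookup can help. This is only a sketch: it assumes the up:evidence values come back as ECO term IRIs of the form http://purl.obolibrary.org/obo/ECO_0000250, and the function names and the short label table are illustrative and not part of any UniProt library.

```python
# Assumption: up:evidence values are ECO term IRIs such as
# http://purl.obolibrary.org/obo/ECO_0000250

# A few ECO codes commonly seen on manually curated entries, with the label
# the UniProt website displays for them. Illustrative, not exhaustive.
ECO_LABELS = {
    "ECO:0000250": "By similarity",
    "ECO:0000255": "Sequence analysis",
    "ECO:0000305": "Curated",
}


def eco_curie(iri):
    """Turn an ECO term IRI into its short form, e.g. 'ECO:0000250'."""
    local_name = iri.rstrip("/").rsplit("/", 1)[-1]
    return local_name.replace("_", ":", 1)


def describe_evidence(iri):
    """Return (ECO code, label), or a fallback label for unlisted codes."""
    code = eco_curie(iri)
    return code, ECO_LABELS.get(code, "not in this table, look it up in ECO")
```

For example, describe_evidence("http://purl.obolibrary.org/obo/ECO_0000250") gives the code ECO:0000250 together with the "By similarity" label.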

A full query (here limited to P05067) returns one row per annotation, with the evidence codes and sources that apply to it. The results can be downloaded as CSV or JSON for further use.

PREFIX uniprotkb:<http://purl.uniprot.org/uniprot/> 
PREFIX rdf:<http://www.w3.org/1999/02/22-rdf-syntax-ns#> 
PREFIX rdfs:<http://www.w3.org/2000/01/rdf-schema#> 
PREFIX faldo:<http://biohackathon.org/resource/faldo#> 
PREFIX up:<http://purl.uniprot.org/core/> 
SELECT 
    ?protein 
    (GROUP_CONCAT(?evidence; separator=',')  as ?evidences)
    (GROUP_CONCAT(?source; separator =',') as ?sources)
    ?begin 
    ?end
    ?text
WHERE
{
    BIND(uniprotkb:P05067 as ?protein)
    ?protein a up:Protein ;
               up:annotation ?annotation .
    ?annotation a up:Transmembrane_Annotation ;
                  up:range ?range ;
                  rdfs:comment ?text .
    ?range faldo:begin/faldo:position ?begin ;
           faldo:end/faldo:position ?end .
    [] a rdf:Statement ;
             rdf:subject ?protein ; 
             rdf:predicate up:annotation ;
             rdf:object ?annotation ;
             up:attribution ?attribution .
    ?attribution up:evidence ?evidence .
    OPTIONAL {
                ?attribution up:source ?source .
    }
} GROUP BY ?annotation ?protein ?begin ?end ?text
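If you would rather fetch the JSON from a script than download it by hand, something along these lines should work. It is a minimal sketch using only the Python standard library; it assumes the endpoint accepts standard SPARQL protocol POST requests at https://sparql.uniprot.org/sparql and returns the standard SPARQL JSON results format. The function names run_query and rows_from_results are made up for this example.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the UniProt endpoint speaks the standard SPARQL protocol here.
ENDPOINT = "https://sparql.uniprot.org/sparql"


def run_query(query, endpoint=ENDPOINT):
    """POST a SPARQL query and return the parsed JSON results document."""
    data = urllib.parse.urlencode({"query": query}).encode("utf-8")
    request = urllib.request.Request(
        endpoint,
        data=data,
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def rows_from_results(results):
    """Flatten SPARQL JSON bindings into one plain dict per annotation.

    The GROUP_CONCAT columns (?evidences, ?sources) are split back into
    lists; an empty concatenation becomes an empty list.
    """
    rows = []
    for binding in results["results"]["bindings"]:
        row = {name: cell["value"] for name, cell in binding.items()}
        for key in ("evidences", "sources"):
            row[key] = [value for value in row.get(key, "").split(",") if value]
        rows.append(row)
    return rows
```

Calling rows_from_results(run_query(query_text)), with the query above as query_text, should give a list of dicts keyed by protein, begin, end, text, evidences and sources. Splitting on commas is safe here only because the concatenated values are IRIs, which do not normally contain commas.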

The basic idea of how provenance is modelled in UniProt RDF is described in this abstract, but it predates the use of the Evidence and Conclusion Ontology.

The UniProt website's REST service is not well suited to retrieving evidence: you have to fetch the whole record, in whichever format you prefer, and extract the evidence from it.

As always, write to help@uniprot.org for the fastest help.
