Question: How Do I Import RDF Data Into R?
7.4 years ago, Egon Willighagen (Maastricht) wrote:

What approach are you using to import Resource Description Framework (RDF) data into R? There is minimal support in the R package Rredland, but that seems rather spartan. There was an interesting Rswub, but that was lost in time. I also noted Rsparql, but the project does not seem to have delivered anything yet. And, of course, I can do something manually... What are your best practices for using RDF data from, for example, Bio2RDF?

modified 3 months ago by Biostar • written 7.4 years ago by Egon Willighagen

Your first link points to the Swedish version of Wikipedia. For the English version: http://en.wikipedia.org/wiki/Resource_Description_Framework

written 7.4 years ago by David Quigley

Nice, we all speak Swedish RDF :D http://www.youtube.com/watch?v=9OfsABOGw3c&feature=related

written 7.4 years ago by Michael Dondrup

Sorry, you lost me... Swedish RDF?

written 7.4 years ago by Egon Willighagen

Oh, crap... OK, fixing... stupid, we're-so-smart-we-know-where-you-live websites... :(

written 7.4 years ago by Egon Willighagen

Ah! Sorry about that; fixed now.

written 7.4 years ago by Egon Willighagen
6.8 years ago, Egon Willighagen (Maastricht) wrote:

I started a package for just this purpose yesterday. It is available from CRAN, though functionality is still a bit limited today:

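library(rrdf)
m1 = load.rdf("one.rdf")      # read the first RDF file into an in-memory model
m2 = load.rdf("two.rdf")      # read the second RDF file
m3 = combine.rdf(m1, m2)      # merge both models into one
summarize.rdf(m3)             # print a short summary of the combined model
sparql.rdf(m3, "SELECT ?s ?p { ?s ?p ?o }")   # run a SPARQL query on the local model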

It wraps Jena and uses rJava to interface with it.

There is in fact also a Bioconductor package called Rredland.

Because the rrdf package now also supports SPARQL queries against remote databases, you can also do (following this BioStar answer):

library(rrdf)

endpoint = "http://rdf.farmbio.uu.se/chembl/sparql"

query = "
SELECT ?organism ?instance
WHERE {
  ?instance a <http://rdf.farmbio.uu.se/chembl/onto/#Target> ;
    <http://rdf.farmbio.uu.se/chembl/onto/#organism> ?organism .
}
";

data = sparql.remote(endpoint, query)
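
For readers wondering what a remote SPARQL query looks like on the wire: the SPARQL protocol allows a query to be sent as an HTTP GET request with the URL-encoded query in a `query` parameter. The sketch below only builds such a URL in base R and makes no network call. `build_sparql_url` is a hypothetical helper, not part of rrdf, and this is not a claim about how `sparql.remote` works internally (the package wraps Jena); the endpoint is the one from the example above and may no longer be online.

```r
endpoint <- "http://rdf.farmbio.uu.se/chembl/sparql"

query <- "
SELECT ?organism ?instance
WHERE {
  ?instance a <http://rdf.farmbio.uu.se/chembl/onto/#Target> ;
    <http://rdf.farmbio.uu.se/chembl/onto/#organism> ?organism .
}
"

# Hypothetical helper (an assumption for illustration, not an rrdf function):
# percent-encode the query and append it as the "query" parameter.
build_sparql_url <- function(endpoint, query) {
  paste0(endpoint, "?query=", utils::URLencode(query, reserved = TRUE))
}

request_url <- build_sparql_url(endpoint, query)

# Decoding the parameter again gives back the original query text.
decoded <- utils::URLdecode(sub("^.*\\?query=", "", request_url))
```

A URL built this way could be handed to any HTTP client; the result format (XML, JSON, CSV) depends on what the endpoint supports.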

As of version 1.4 you can also use one of the SPARQL variables as the source of the row names. For example, to get a single column with the protein names as row names, you do:

query = "
SELECT ?organism ?title
WHERE {
  ?instance a <http://rdf.farmbio.uu.se/chembl/onto/#Target> ;
    <http://purl.org/dc/elements/1.1/title> ?title ;
    <http://rdf.farmbio.uu.se/chembl/onto/#organism> ?organism .
}
";

data = sparql.remote(endpoint, query, rowvarname="title")

This results in an R matrix like:

                                                      organism                       
Maltase-glucoamylase                                  "Homo sapiens"                 
Sulfonylurea receptor 2                               "Homo sapiens"                 
Voltage-gated T-type calcium channel alpha-1H subunit "Homo sapiens"                 
Dihydrofolate reductase                               "Escherichia coli (strain K12)"
Tyrosine-protein kinase ABL                           "Homo sapiens"                 
DNA-directed RNA polymerase beta chain                "Escherichia coli (strain K12)"
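
Because the result is an ordinary character matrix with row names, the usual base R indexing applies. The sketch below rebuilds the six rows printed above by hand (so it runs without rrdf or a live endpoint) and shows a lookup by row name, a count per organism, and a filter; the variable names are made up for illustration.

```r
# Hand-built stand-in for the query result; values copied from the matrix above.
targets <- matrix(
  c("Homo sapiens", "Homo sapiens", "Homo sapiens",
    "Escherichia coli (strain K12)", "Homo sapiens",
    "Escherichia coli (strain K12)"),
  ncol = 1,
  dimnames = list(
    c("Maltase-glucoamylase",
      "Sulfonylurea receptor 2",
      "Voltage-gated T-type calcium channel alpha-1H subunit",
      "Dihydrofolate reductase",
      "Tyrosine-protein kinase ABL",
      "DNA-directed RNA polymerase beta chain"),
    "organism"
  )
)

# Look up the organism of one protein by its row name.
abl_organism <- targets["Tyrosine-protein kinase ABL", "organism"]

# Count how many targets each organism has.
organism_counts <- table(targets[, "organism"])

# Names of all targets annotated as human.
human_targets <- rownames(targets)[targets[, "organism"] == "Homo sapiens"]
```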
modified 6.4 years ago • written 6.8 years ago by Egon Willighagen

I now also created a vignette: http://chem-bla-ics.blogspot.fr/2012/11/triples-stores-and-sparql-in-r.html

written 5.1 years ago by Egon Willighagen
7.4 years ago, Michael Dondrup (Bergen, Norway) wrote:

The following hints are all far from perfect and will require some experimenting on your side, but here's my best guess (I only have worst practices for language interfaces, none for reading data from BioRDF):

  • The Redland C library has many language bindings (Perl, Python, Ruby). If these bindings are more complete than Rredland, you could use, e.g., the Perl binding with RSPerl or the Python binding with RPy.
  • There are Java libraries out there, see the StackExchange answer. They can be interfaced using e.g. SJava or (less nicely) JRI.
  • Pimping the Rredland package to add the functionality you need (maybe the cleanest option, but it takes a lot of your time).

I would maybe go for the SJava solution first, because there are at least four Java libraries to choose from. I have had some mixed experiences with language bindings, but in the end RSPerl and SJava worked with Perl and Java for me, and I heard that RPy works nicely too. So it should be possible in principle™ to access the libraries too. Whatever solution you come up with will likely be appreciated by the BioC community.

modified 7.4 years ago • written 7.4 years ago by Michael Dondrup

Done, see my own answer.

written 6.7 years ago by Egon Willighagen