Question

How to run a protein-protein interaction analysis in R using the STRING database

0

4.0 years ago

tpm ▴ 30

I want to link a dataset in R to the STRING database in order to find protein-protein interactions. I would greatly appreciate any help or ideas on how to link the dataset, i.e. which code I should use in this case. Below is the code I used and the error I got when I tried to map my dataset to STRING. The proteins I want to build a network for are in this shareable link: https://drive.google.com/file/d/1aJisbhWyqUFcx_wIBMxcDtw5fMIE-z5d/view?usp=sharing

install.packages("STRINGdb")
library(BiocManager)
library("STRINGdb")
string_db <- STRINGdb$new( version="11", species=469008, score_threshold=00, input_directory="")
library(readxl)
p <- read_excel("1.xlsx")
View(p)
p_mapped <- string_db$map( p, "gene", removeUnmappedRows = TRUE )
#Error in tempMatr[i, ] : incorrect number of dimensions

R stringdb • 1.4k views

ADD COMMENT • link updated 4.0 years ago by Jean-Karim Heriche 27k • written 4.0 years ago by tpm ▴ 30

score 2 · Answer 1 · 2020-05-06

2

4.0 years ago

Jean-Karim Heriche 27k

First, you don't need install.packages("STRINGdb"): STRINGdb is a Bioconductor package, so it is installed with BiocManager::install("STRINGdb"). Second, the error comes from your use of read_excel(), which returns a tibble, whereas string_db$map() expects a data.frame. Your code should run fine if you convert p to a data frame, or if you read the file with the read.xlsx() function from the xlsx package instead. This should probably be reported as a bug to the package author, since the function should check its input more rigorously (this may not be easy, because tibbles also claim to be data frames but don't actually behave like them in all cases).
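A minimal sketch of the conversion described above, under stated assumptions: to_plain_df is a hypothetical helper name, the "gene" column and the file 1.xlsx are taken from the question, and the guarded part only runs when readxl and STRINGdb are installed and the file is present. Newer STRINGdb releases may also require a more recent version string than "11".

```r
# Hypothetical helper: drop the tibble classes ("tbl_df", "tbl") that
# readxl::read_excel() adds, leaving a plain base-R data.frame.
to_plain_df <- function(x) {
  as.data.frame(x)
}

# Guarded so the sketch does nothing when the packages or the file are missing.
if (requireNamespace("readxl", quietly = TRUE) &&
    requireNamespace("STRINGdb", quietly = TRUE) &&
    file.exists("1.xlsx")) {
  library(STRINGdb)
  string_db <- STRINGdb$new(version = "11", species = 469008,
                            score_threshold = 0, input_directory = "")
  p  <- readxl::read_excel("1.xlsx")  # a tibble
  pr <- to_plain_df(p)                # a plain data.frame
  pr_mapped <- string_db$map(pr, "gene", removeUnmappedRows = TRUE)
}
```

Checking class(pr) before calling map() is a quick way to confirm the conversion: it should print only "data.frame".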

ADD COMMENT • link 4.0 years ago by Jean-Karim Heriche 27k

0

Thank you for the response. I am still a newbie in R, but I'm trying to learn. I converted p to a data frame. I still got an error, and I suspect it's the tibble issue you mentioned. If there are any suggestions or modifications, I would greatly appreciate them :)

install.packages("BiocManager")
BiocManager::install()
BiocManager::install(c("GenomicFeatures", "AnnotationDbi"))
library(BiocManager)
library("STRINGdb")
string_db <- STRINGdb$new( version="11", species=469008, score_threshold=0, input_directory="")
install.packages("xlsx")
library(readxl)
p <- read_excel("1.xlsx")
pr = data.frame(p)   # convert the tibble returned by read_excel() to a plain data frame
View(pr)
pr
pr_mapped <- string_db$map( pr, "gene", removeUnmappedRows = TRUE )
#Warning:  we couldn't map to STRING 100% of your identifiers

ADD REPLY • link 4.0 years ago by tpm ▴ 30

0

Now the code is working. However, the warning is telling you that none of your identifiers could be mapped (100% unmapped). This is because you're not using identifiers that STRING knows about. Check which types of identifiers/names STRING knows about for your organism and use one of these types.
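To see which identifiers failed and what STRING does recognise for the organism, something along these lines may help. It is only a sketch: unmapped_ids is a hypothetical helper, it assumes that map() adds a STRING_id column that is NA for unmapped rows when removeUnmappedRows = FALSE, and string_db and pr are the objects created in the code above.

```r
# Hypothetical helper: identifiers that did not receive a STRING_id.
unmapped_ids <- function(mapped, id_col) {
  unique(mapped[[id_col]][is.na(mapped$STRING_id)])
}

# Only runs in a session where string_db and pr already exist.
if (exists("string_db") && exists("pr")) {
  # Keep the unmapped rows this time so they can be inspected.
  pr_all <- string_db$map(pr, "gene", removeUnmappedRows = FALSE)
  print(unmapped_ids(pr_all, "gene"))

  # Tables of what STRING knows for this species; compare them
  # with the values in your "gene" column.
  print(head(string_db$get_proteins()))  # protein table for the species
  print(head(string_db$get_aliases()))   # alias table for the species
}
```

If the names in the spreadsheet turn up in the alias table, map() should be able to resolve them; if they don't, they would first need converting to an identifier type that does.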

ADD REPLY • link 4.0 years ago by Jean-Karim Heriche 27k