biomaRt error while converting ensembl gene Id to uniprot ID
3.6 years ago
manaswwm ▴ 490

Hello everyone,

I have been using biomaRt for some time and it has worked well. Today, however, converting some Ensembl gene IDs to uniprotswissprot IDs fails. The code and the error are as follows:

# load the library (the package name is case-sensitive)
library(biomaRt)
# I work with A. thaliana, so I create the mart object as follows
thaliana_mart = useMart(biomart = "plants_mart", host = "plants.ensembl.org",
                        dataset = "athaliana_eg_gene")
# sample query that should retrieve a uniprotswissprot ID
getBM(mart = thaliana_mart, attributes = "uniprotswissprot", values = "AT4G21050", filters = "ensembl_gene_id")

Here I expect to get the uniprotswissprot ID Q9SUA9, which corresponds to AT4G21050. Instead, I get the following error:

Error in result_create(conn@ptr, statement) : no such table: metadata

One more thing I should mention: I make requests in bulk. This snippet is part of a larger script that processes 25 IDs at a time and runs several other functions. To stay under the limit of 55000 requests per hour on Ensembl's REST API, I make the script sleep for 1800 seconds (half an hour) after each batch of 25 IDs.
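The batching described above can be sketched in base R as follows. This is a minimal illustration, not the poster's actual script: `make_batches` and the placeholder IDs are hypothetical, and only the batch size of 25 comes from the post. Because `getBM()` accepts a vector for `values`, each batch could go out as a single query rather than as 25 separate ones.

```r
# Hypothetical helper: split a vector of gene IDs into batches of fixed size.
make_batches <- function(ids, batch_size = 25) {
  unname(split(ids, ceiling(seq_along(ids) / batch_size)))
}

# Placeholder IDs for illustration only (not real query input).
gene_ids <- sprintf("GENE%03d", 1:60)
batches <- make_batches(gene_ids)

lengths(batches)  # 25 25 10

# Each batch could then be passed as one vector to getBM(), e.g.
# getBM(mart = thaliana_mart,
#       attributes = c("ensembl_gene_id", "uniprotswissprot"),
#       filters = "ensembl_gene_id", values = batches[[1]])
```

Requesting `ensembl_gene_id` alongside `uniprotswissprot` keeps the gene-to-protein pairing visible when several IDs are queried at once.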

Can anyone please help me?

biomaRt R ensembl uniprot • 2.2k views
3.5 years ago

Have you tried the UniProt ID mapping service at https://www.uniprot.org/uploadlists ? It maps gene names to and from UniProtKB ACs, lets you specify an optional organism name, and can be used programmatically if necessary (https://www.uniprot.org/help/api%5Fidmapping).

3.6 years ago

If you have many IDs to match, it may be more efficient to retrieve the entire table and then do post-filtering on that. For example:

library(biomaRt)
thaliana_mart = useMart(
  host = 'plants.ensembl.org',
  'plants_mart',
  dataset = 'athaliana_eg_gene')

annotTable <- getBM(mart = thaliana_mart,
  attributes = c('ensembl_gene_id', 'uniprotswissprot', 'description'))

head(annotTable)
  ensembl_gene_id uniprotswissprot
1       AT5G16970           Q39172
2       AT4G32100                 
3       AT2G43120                 
4       AT2G43120           Q9ZW82
5       AT1G30814                 
6       AT3G18710           Q9LSA6
                                                                                   description
1  NADPH-dependent oxidoreductase 2-alkenal reductase [Source:UniProtKB/Swiss-Prot;Acc:Q39172]
2 Beta-1,3-N-Acetylglucosaminyltransferase family protein [Source:UniProtKB/TrEMBL;Acc:F4JTI5]
3                             RmlC-like cupins superfamily protein [Source:TAIR;Acc:AT2G43120]
4                             RmlC-like cupins superfamily protein [Source:TAIR;Acc:AT2G43120]
5                                             unknown protein; Ha. [Source:TAIR;Acc:AT1G30814]
6                  RING-type E3 ubiquitin transferase [Source:UniProtKB/TrEMBL;Acc:A0A178VJJ8]

dim(annotTable)
[1] 33270     3
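
Post-filtering such a table might look like the sketch below. It uses a small hand-built data frame holding the six rows printed by `head()` above instead of a live query, and `ids_of_interest` is a hypothetical input. It assumes missing accessions come back as empty strings, as the printed output suggests. Rows with an empty `uniprotswissprot` field are dropped, since a gene can appear both with and without a Swiss-Prot accession (AT2G43120 above).

```r
# Stand-in for annotTable, copied from the head() output above.
annotTable <- data.frame(
  ensembl_gene_id  = c("AT5G16970", "AT4G32100", "AT2G43120",
                       "AT2G43120", "AT1G30814", "AT3G18710"),
  uniprotswissprot = c("Q39172", "", "", "Q9ZW82", "", "Q9LSA6"),
  stringsAsFactors = FALSE
)

# Hypothetical list of genes to look up.
ids_of_interest <- c("AT2G43120", "AT3G18710", "AT1G30814")

# Keep requested genes that have a non-empty Swiss-Prot accession.
keep <- annotTable$ensembl_gene_id %in% ids_of_interest &
  nzchar(annotTable$uniprotswissprot)
mapped <- annotTable[keep, c("ensembl_gene_id", "uniprotswissprot")]

# Requested genes with no Swiss-Prot accession in the table.
unmapped <- setdiff(ids_of_interest, mapped$ensembl_gene_id)
```

Here `mapped` holds the pairs AT2G43120/Q9ZW82 and AT3G18710/Q9LSA6, and `unmapped` contains AT1G30814.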

Hi Kevin, thanks for your answer. I agree this would be more efficient than sending requests one at a time. My point, however, is that I cannot send any requests at all at the moment. Even when I run your code I get the same error as in the question, and I am not sure what the error message means. Here it is again:

Error in result_create(conn@ptr, statement) : no such table: metadata

Edit: I forgot to add that I have tried uninstalling and reinstalling the biomaRt package, but the error persists.


I see. That specific error is thrown from an SQL command. On which system and R version are you running biomaRt?

Also, what would be the output of biomartCacheInfo() if you ran it?


I am running this on a Debian system. To make things easier to understand, here is the output of sessionInfo():

R version 3.6.1 (2019-07-05)
Platform: x86_64-conda_cos6-linux-gnu (64-bit)
Running under: Debian GNU/Linux 10 (buster)

Matrix products: default
BLAS/LAPACK: /home/mjoshi/Desktop/work/miniconda3/envs/myTest/lib/R/lib/libRblas.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8   
 [6] LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] biomaRt_2.42.0

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.1           compiler_3.6.1       pillar_1.3.1         dbplyr_1.4.0         prettyunits_1.0.2    tools_3.6.1          progress_1.2.0      
 [8] digest_0.6.18        bit_1.1-14           RSQLite_2.1.1        memoise_1.1.0        BiocFileCache_1.10.0 tibble_2.1.1         pkgconfig_2.0.2     
[15] rlang_0.3.4          DBI_1.0.0            rstudioapi_0.10      yaml_2.2.0           curl_3.3             parallel_3.6.1       stringr_1.4.0       
[22] httr_1.4.0           dplyr_0.8.0.1        rappdirs_0.3.1       S4Vectors_0.24.0     askpass_1.0          IRanges_2.20.0       hms_0.4.2           
[29] tidyselect_0.2.5     stats4_3.6.1         bit64_0.9-7          glue_1.3.1           Biobase_2.46.0       R6_2.4.0             AnnotationDbi_1.48.0
[36] XML_3.98-1.19        purrr_0.3.2          blob_1.1.1           magrittr_1.5         BiocGenerics_0.32.0  assertthat_0.2.1     stringi_1.4.3       
[43] openssl_1.3          crayon_1.3.4

And if I run biomartCacheInfo(), I still get the same error:

Error in result_create(conn@ptr, statement) : no such table: metadata

Is there something that I did wrong? I also tried making a fresh install of my entire R environment but the error still persists.
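
Since `biomartCacheInfo()` fails with the same message, one possibility (not confirmed anywhere in this thread) is that the on-disk cache biomaRt keeps through BiocFileCache is damaged, which would also be consistent with reinstalling the package not helping. A minimal sketch of removing a cache directory so that it can be rebuilt on the next query is shown below. `reset_cache_dir` is a hypothetical helper, and the cache location is an assumption: it is expected to be given by the `BIOMART_CACHE` environment variable when that is set, and otherwise to sit in the per-user cache directory. The demonstration runs on a temporary directory so nothing real is deleted.

```r
# Hypothetical helper: remove a cache directory and report whether it is gone.
reset_cache_dir <- function(path) {
  if (dir.exists(path)) {
    unlink(path, recursive = TRUE)
  }
  !dir.exists(path)
}

# Demonstration on a throwaway directory standing in for the real cache.
fake_cache <- file.path(tempdir(), "biomaRt_cache_demo")
dir.create(fake_cache, showWarnings = FALSE)
# Placeholder file standing in for the cache database.
file.create(file.path(fake_cache, "BiocFileCache.sqlite"))

removed <- reset_cache_dir(fake_cache)

# Assumed real-world use (check the path before running):
# reset_cache_dir(Sys.getenv("BIOMART_CACHE"))
```

Deleting the whole directory rather than individual files avoids going through the damaged database at all, at the cost of losing any cached query results.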

3.5 years ago
manaswwm ▴ 490

Thanks for your suggestion, Elisabeth!

I realized only much later that I could also do this. What I ended up doing was using UniProt's REST API service to convert my Ensembl gene ID to a UniProt protein ID: https://www.ebi.ac.uk/proteins/api/doc/#!/proteins/getByCrossReference (here I set dbtype to EnsemblPlants and reviewed to TRUE to get only UniProtKB/Swiss-Prot IDs). An example request URL would be https://www.ebi.ac.uk/proteins/api/proteins/EnsemblPlants:AT4G21050?offset=0&size=100&reviewed=TRUE
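
Building such request URLs for several genes can be sketched as below. `proteins_api_url` is a hypothetical helper; the base URL, the `EnsemblPlants` database type and the query parameters are taken from the example URL above. Sending the request and parsing the response are left out.

```r
# Hypothetical helper: build an EBI Proteins API cross-reference URL.
proteins_api_url <- function(gene_id, dbtype = "EnsemblPlants",
                             reviewed = TRUE, offset = 0, size = 100) {
  sprintf(
    "https://www.ebi.ac.uk/proteins/api/proteins/%s:%s?offset=%d&size=%d&reviewed=%s",
    dbtype, gene_id, as.integer(offset), as.integer(size),
    if (reviewed) "TRUE" else "FALSE"
  )
}

# One URL per gene ID; the result is a character vector named by gene ID.
urls <- vapply(c("AT4G21050", "AT5G16970"), proteins_api_url, character(1))
```

For AT4G21050 with the defaults, the helper reproduces the example URL given in the post.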


Yes, this is the EBI Proteins REST API. Other ways to access UniProt programmatically, including the website API, are listed here: https://www.uniprot.org/help/programmatic_access
