Question: Error accessing Ensembl BioMart via the biomaRt R package
17 months ago, bioExplorer wrote:

Objective

I have a list of ~58K Ensembl gene IDs for H. sapiens, for which I need to extract the gene names, descriptions and other annotations from BioMart.


The online route fails, mainly because the list I am uploading is so large, so I thought I would try the biomaRt R package instead. I am trying to access Ensembl BioMart with the following commands:

source("http://bioconductor.org/biocLite.R")
biocLite("biomaRt")
library(biomaRt)
listEnsembl()

Error encountered

> listEnsembl()
Request to BioMart web service failed. Verify if you are still connected to the internet.  Alternatively the BioMart web service is temporarily down.  Check http://www.biomart.org and verify if this website is available.
Error: XML content does not seem to be XML:

From the error, it appears to be a problem with the URL http://www.biomart.org, which I could in fact access without any issue.

What I can see is a downtime notice here. Is it something related to this?

Can anybody suggest anything else?


What version of R and biomaRt are you using? I suspect you might have an old version. While the www.biomart.org website still exists, it ceased to be the central repository for BioMart instances quite a while ago. All the defaults in the biomaRt package should now point to www.ensembl.org.

You can check the version using the command sessionInfo(). Here's mine, along with the output I get when running listEnsembl():

> sessionInfo()
R version 3.4.1 (2017-06-30)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Linux Mint 18.1

Matrix products: default
BLAS: /home/msmith/Applications/R/R-3.4.1/lib/libRblas.so
LAPACK: /home/msmith/Applications/R/R-3.4.1/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=de_DE.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=de_DE.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C        
[11] LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] biomaRt_2.33.5

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.12         AnnotationDbi_1.39.3 magrittr_1.5        
 [4] BiocGenerics_0.23.1  progress_1.1.2       IRanges_2.11.16     
 [7] bit_1.1-12           R6_2.2.2             rlang_0.1.2         
[10] stringr_1.2.0        blob_1.1.0           tools_3.4.1         
[13] parallel_3.4.1       Biobase_2.37.2       DBI_0.7             
[16] bit64_0.9-7          digest_0.6.12        assertthat_0.2.0    
[19] tibble_1.3.4         S4Vectors_0.15.8     bitops_1.0-6        
[22] RCurl_1.95-4.8       memoise_1.1.0        RSQLite_2.0         
[25] stringi_1.1.5        compiler_3.4.1       prettyunits_1.0.2   
[28] stats4_3.4.1         XML_3.98-1.9        

> listEnsembl()
             biomart               version
1            ensembl      Ensembl Genes 90
2 ENSEMBL_MART_MOUSE      Mouse strains 90
3                snp  Ensembl Variation 90
4         regulation Ensembl Regulation 90
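
If you only want the package version rather than the full sessionInfo() output, a comparison along these lines may be enough. This is a minimal sketch: is_older_than() is a hypothetical helper, not part of biomaRt, and 2.33.5 is simply the version shown in my session above, not a documented minimum.

```r
# Hypothetical helper: TRUE if the installed version of `pkg`
# is older than the version string given in `reference`.
is_older_than <- function(pkg, reference) {
  utils::packageVersion(pkg) < package_version(reference)
}

# R itself reports its version like this:
R.version.string

# With biomaRt installed, compare against the version shown above
# (an arbitrary reference point, not an official requirement):
# is_older_than("biomaRt", "2.33.5")
```

If this returns TRUE, upgrading is probably the first thing to try.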
written 17 months ago by Mike Smith

> sessionInfo()
R version 3.3.1 (2016-06-21)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: CentOS release 6.6 (Final)

locale:
 [1] LC_CTYPE=en_US.utf8       LC_NUMERIC=C             
 [3] LC_TIME=en_US.utf8        LC_COLLATE=en_US.utf8    
 [5] LC_MONETARY=en_US.utf8    LC_MESSAGES=en_US.utf8   
 [7] LC_PAPER=en_US.utf8       LC_NAME=C                
 [9] LC_ADDRESS=C              LC_TELEPHONE=C           
[11] LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C      

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] biomaRt_2.30.0       BiocInstaller_1.24.0

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.9          IRanges_2.8.1        XML_3.98-1.5        
 [4] digest_0.6.12        bitops_1.0-6         DBI_0.5-1           
 [7] stats4_3.3.1         RSQLite_1.1-2        S4Vectors_0.12.1    
[10] tools_3.3.1          Biobase_2.34.0       RCurl_1.95-4.8      
[13] parallel_3.3.1       BiocGenerics_0.20.0  AnnotationDbi_1.36.1
[16] memoise_1.0.0
written 17 months ago by bioExplorer

Working fine for me:

> library(biomaRt)
> listEnsembl()
             biomart               version
1            ensembl      Ensembl Genes 90
2 ENSEMBL_MART_MOUSE      Mouse strains 90
3                snp  Ensembl Variation 90
4         regulation Ensembl Regulation 90
>
written 17 months ago by Emily_Ensembl

Check to see if an overzealous intrusion prevention device (or a firewall admin) has disabled your access since it seems to be working for others.

written 17 months ago by genomax
17 months ago, Mike Smith (EMBL Heidelberg / de.NBI) wrote:

It looks like you're using a fairly old version of both R and biomaRt at the moment. I've made quite a few changes to the package over the past year, particularly regarding connectivity and error messages, so I'd suggest upgrading. You can keep the same version of R and install the latest biomaRt using the following command:

BiocInstaller::biocLite('grimbough/biomaRt')

I would then try re-running listEnsembl() with the verbose flag. This will print the actual URL it is trying to access, which you can then try in a web browser. It should be an XML file starting with <MartRegistry>.

listEnsembl(verbose = TRUE)
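
If a browser is not handy, the same check can be roughly approximated from R. The sketch below assumes you paste in the URL printed by the verbose output yourself; looks_like_registry() is a hypothetical helper, not a biomaRt function, and it only tests whether the response begins with a <MartRegistry> element (optionally preceded by an XML declaration).

```r
# Hypothetical helper: does the response body begin with <MartRegistry>,
# optionally preceded by whitespace and an XML declaration?
looks_like_registry <- function(body) {
  grepl("^\\s*(<\\?xml[^>]*\\?>\\s*)?<MartRegistry", body, perl = TRUE)
}

# Offline illustration with two made-up response bodies:
looks_like_registry("<MartRegistry>\n</MartRegistry>")
looks_like_registry("<html><body>Service unavailable</body></html>")

# With network access, using the URL printed by listEnsembl(verbose = TRUE):
# url  <- "PASTE_THE_PRINTED_URL_HERE"   # placeholder, not a real address
# body <- paste(readLines(url, warn = FALSE), collapse = "\n")
# looks_like_registry(body)
```

An HTML page (for example a proxy or firewall notice) in place of the XML would be consistent with the "XML content does not seem to be XML" error.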

You can also try accessing one of the mirror sites, then report back here with any output, e.g.

listEnsembl(verbose = TRUE, mirror = "asia")
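
Once a connection works, the original goal (names and descriptions for ~58K gene IDs) could be approached roughly as follows. This is a sketch under assumptions rather than a tested recipe: connect_ensembl(), split_ids() and annotate_genes() are hypothetical helpers, the batch size of 500 is arbitrary, and the attribute names are the ones I would expect for the human gene dataset (listAttributes() on the mart will show what is really available). Which mirrors exist changes over time; only "asia" is taken from the command above.

```r
# Split a character vector of IDs into chunks of at most `size` elements.
split_ids <- function(ids, size = 500) {
  unname(split(ids, ceiling(seq_along(ids) / size)))
}

# Try the default site first (mirror = NULL), then each named mirror,
# and return the first Mart object that connects.
connect_ensembl <- function(mirrors = list(NULL, "asia")) {
  for (m in mirrors) {
    label <- if (is.null(m)) "default" else m
    mart <- tryCatch(
      biomaRt::useEnsembl(biomart = "ensembl",
                          dataset = "hsapiens_gene_ensembl",
                          mirror = m),
      error = function(e) {
        message("Site '", label, "' failed: ", conditionMessage(e))
        NULL
      }
    )
    if (!is.null(mart)) return(mart)
  }
  stop("Could not connect to any Ensembl site")
}

# Query the annotations batch by batch and bind the results together.
annotate_genes <- function(ids, mart, size = 500) {
  results <- lapply(split_ids(ids, size), function(chunk) {
    biomaRt::getBM(
      attributes = c("ensembl_gene_id", "external_gene_name", "description"),
      filters    = "ensembl_gene_id",
      values     = chunk,
      mart       = mart
    )
  })
  do.call(rbind, results)
}

# Usage (needs network access and biomaRt installed):
# mart  <- connect_ensembl()
# annot <- annotate_genes(my_ensembl_ids, mart)  # my_ensembl_ids: your ID vector
```

Batching by hand is a conservative choice; it keeps each request small, so a single failed query does not cost you the whole run.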