Question: Why does my biomaRt query return inconsistent dataset lists?
adam.faranda wrote (7 months ago):

I have been using the biomaRt library to retrieve Ensembl gene IDs for mouse genes. This morning, I got an unusual error message when running a previously validated script:

mart <- useMart(biomart="ensembl", dataset="mmusculus_gene_ensembl")
Error in useDataset(mart = mart, dataset = dataset, verbose = verbose) : 
  The given dataset:  mmusculus_gene_ensembl , is not valid.  Correct dataset names can be obtained with the listDatasets function.

When I used the "listDatasets" function to check whether "mmusculus_gene_ensembl" is correct, I noticed that the query was returning a different number of results each time I ran it. Sometimes, "mmusculus_gene_ensembl" appears in this result set and other times it does not:

 > nrow(listDatasets(mart, verbose=T))
Attempting web service request:
http://www.ensembl.org:80/biomart/martservice?type=datasets&requestid=biomaRt&mart=ENSEMBL_MART_ENSEMBL
[1] 51
> nrow(listDatasets(mart, verbose=T))
Attempting web service request:
http://www.ensembl.org:80/biomart/martservice?type=datasets&requestid=biomaRt&mart=ENSEMBL_MART_ENSEMBL
[1] 116
> nrow(listDatasets(mart, verbose=T))
Attempting web service request:
http://www.ensembl.org:80/biomart/martservice?type=datasets&requestid=biomaRt&mart=ENSEMBL_MART_ENSEMBL
[1] 27
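
The repeated check above can be wrapped in a small helper so the variation is recorded per attempt. This is only a sketch: `dataset_presence` and its `fetch` argument are hypothetical names introduced here, and `fetch` is assumed to be any function that returns a data frame with a `dataset` column, as `listDatasets` does.

```r
# Hypothetical helper: call `fetch` n times and record, for each attempt,
# how many datasets came back and whether the one of interest was among them.
dataset_presence <- function(dataset, fetch, n = 5) {
  results <- lapply(seq_len(n), function(i) {
    listing <- fetch()
    data.frame(
      attempt    = i,
      n_datasets = nrow(listing),
      present    = dataset %in% listing$dataset
    )
  })
  do.call(rbind, results)
}

# Example use (needs network access and the biomaRt package, so it is
# left commented out here):
# mart <- biomaRt::useMart("ensembl")
# dataset_presence("mmusculus_gene_ensembl",
#                  function() biomaRt::listDatasets(mart))
```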

This behavior has been consistent all day. My R session info is below:

R version 3.3.3 (2017-03-06)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X Mavericks 10.9.5

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] biomaRt_2.30.0

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.0           IRanges_2.8.2        XML_3.98-1.17        digest_0.6.18        bitops_1.0-6         DBI_1.0.0            stats4_3.3.3         RSQLite_2.1.1       
 [9] blob_1.1.1           S4Vectors_0.12.2     tools_3.3.3          bit64_0.9-7          Biobase_2.34.0       RCurl_1.95-4.11      bit_1.1-14           parallel_3.3.3      
[17] BiocGenerics_0.20.0  AnnotationDbi_1.36.2 memoise_1.1.0
Mike Smith (EMBL Heidelberg / de.NBI) wrote (7 months ago):

There was an issue with biomaRt that manifested when Ensembl release 91 introduced datasets containing apostrophes, e.g. "Ma's Night Monkey", and it leads to the error you are seeing. See https://support.bioconductor.org/p/104025/#104043 or the answer "biomaRt mmusculus_gene_ensembl dataset" for more details.
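
To illustrate how a single apostrophe can disrupt parsing of a tab-separated listing, here is a minimal sketch using base R's `read.table`. It is an assumption that this resembles what happened inside the old biomaRt code; the rows below are invented placeholders, not real Ensembl output. By default `read.table` treats `'` as a quote character, so an unmatched apostrophe can swallow the lines that follow it, while `quote = ""` turns quoting off.

```r
# Invented tab-separated listing; names and assembly labels are placeholders.
response <- paste(
  "anancymaae_gene_ensembl\tMa's night monkey genes\tassembly_A",
  "mmusculus_gene_ensembl\tMouse genes\tassembly_B",
  "hsapiens_gene_ensembl\tHuman genes\tassembly_C",
  sep = "\n"
)

# Default quoting: the apostrophe in "Ma's" is read as an opening quote.
# Warnings and errors are suppressed because the point is only to compare
# the row count with the quote-free parse below.
naive <- suppressWarnings(tryCatch(
  read.table(text = response, sep = "\t", header = FALSE,
             stringsAsFactors = FALSE),
  error = function(e) NULL
))

# Quoting disabled: each input line becomes exactly one row.
safe <- read.table(text = response, sep = "\t", header = FALSE,
                   quote = "", stringsAsFactors = FALSE)

if (!is.null(naive)) print(nrow(naive))
print(nrow(safe))
print("mmusculus_gene_ensembl" %in% safe$V1)
```

With quoting disabled, all three rows are recovered and the mouse dataset is found.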

You are currently using old versions of both R and biomaRt. I would suggest updating both; in particular, you will need biomaRt version 2.34.1 or newer to handle this correctly.
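
A quick way to check whether an installation meets that minimum is to compare version numbers with `package_version`. The helper name `needs_update` is made up for this sketch, and the install command in the comment assumes a recent enough R for the BiocManager package.

```r
# Minimum biomaRt version named in the answer above.
minimum_biomart <- package_version("2.34.1")

# Hypothetical helper: TRUE if the given version is older than the minimum.
needs_update <- function(installed, minimum = minimum_biomart) {
  package_version(installed) < minimum
}

print(needs_update("2.30.0"))  # the version from the session info above

# Check the locally installed copy, if there is one.
if (requireNamespace("biomaRt", quietly = TRUE)) {
  print(needs_update(as.character(packageVersion("biomaRt"))))
}

# To update (after upgrading R itself), something along these lines:
# install.packages("BiocManager")
# BiocManager::install("biomaRt")
```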
