biomaRt web service error when using data from Phytozome
4.1 years ago
elcortegano ▴ 200

I'm trying to take a look at the Phytozome database through biomaRt, but as soon as I query a list of known loci, the call fails with an error message about the BioMart web service. The problem seems specific to the Phytozome mart, because the same kind of query works when I switch the mart to Ensembl. However, I don't see how that connects to the error given. Do you have any hints for fixing this?

This is my code:

library(biomaRt)
mart <- useMart(biomart = "phytozome_mart", dataset = "phytozome", host = "phytozome.jgi.doe.gov")
getBM(attributes = c("organism_name", "gene_name1"), filters = "gene_name_filter", values = "g400", mart = mart)
getBM(attributes = "organism_name", mart = mart)

Which returns error message:

NULL
Error in .processResults(postRes, mart = mart, sep = sep, fullXmlQuery = fullXmlQuery,  : 
  The query to the BioMart webservice returned an invalid result.
The number of columns in the result table does not equal the number of attributes in the query.
Please report this on the support site at http://support.bioconductor.org

However, as mentioned above, I don't get this error if I try a similar approach with other databases, e.g. using Ensembl with pig data:

mart <- useMart(biomart = "ensembl", dataset = "sscrofa_gene_ensembl")
# head() is called directly so that no pipe package needs to be loaded
head(getBM(attributes = c("ensembl_gene_id"), mart = mart))

Should I really contact biomaRt support for this issue? Thank you

R software error • 2.3k views

I would indeed post this over at the Bioc support forum because there you will get an answer from the maintainers.


I got a different error:

> getBM(attributes = c("organism_name", "gene_name1"), filters = "gene_name_filter", values = "g400", mart = mart)
Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec,  : 
line 1 did not have 3 elements

I'm using biomaRt_2.40.5

Make sure you're using the latest version (2.44)


I just updated to biomaRt 2.44, but the same error remains. I wonder why we get different errors.


The two error messages are caused by the same issue, but the wording was changed somewhere between version 2.40 and 2.44.

A generic "Error in scan().." is less helpful to diagnose an issue than a message saying the problem is the number of columns in the result table, but really they're the same.

That said, the old error message mentions wanting 3 items, but you're only asking for 2 attributes to be returned. This makes me think there really is something weird going on with the table returned by the server. Sometimes values with spaces or special characters aren't handled correctly. Getting an error here is normally out of the control of the user, hence the request to report it. I'll take a look & report back.
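
To illustrate why both messages point at the same underlying problem, here is a minimal sketch that treats a response body as tab-separated text and compares the number of fields on each line with the number of requested attributes. This is not biomaRt's internal code: `check_result_columns()` and the two sample response bodies are hypothetical, made up only for illustration.

```r
# Hypothetical helper, not part of biomaRt: check that every non-empty line
# of a tab-separated response has one field per requested attribute.
check_result_columns <- function(body, attributes) {
  lines <- strsplit(body, "\n", fixed = TRUE)[[1]]
  lines <- lines[nzchar(lines)]
  # Number of fields on a line = number of tab characters + 1
  n_tabs <- lengths(regmatches(lines, gregexpr("\t", lines, fixed = TRUE)))
  all(n_tabs + 1L == length(attributes))
}

attrs <- c("organism_name", "gene_name1")

# Made-up example bodies: a well-formed table row and a non-tabular page
good_body <- "Smoellendorffii\t82092\n"
html_body <- "<!DOCTYPE html>\n<html><head><title>Phytozome</title></head></html>\n"

check_result_columns(good_body, attrs)  # TRUE: two fields, two attributes
check_result_columns(html_body, attrs)  # FALSE: the lines are not a two-column table
```

Whatever the server sends back, a body that is not a table of the expected width trips a check of this kind, which would explain a failure in either `scan()` or `.processResults()` depending on the package version.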


Are you able to list the available marts and datasets?

> listMarts(host = "phytozome.jgi.doe.gov")
                    biomart                  version
1            phytozome_mart V12 Genomes and Families
2 phytozome_diversity__mart     V12 Genome Diversity
3    phytozome_mart_archive           Genome Archive

> mart <- useMart(biomart = "phytozome_mart", dataset = "phytozome", host = "phytozome.jgi.doe.gov")

> listDatasets(mart)
             dataset           description version
1          phytozome  Phytozome 12 Genomes        
2 phytozome_clusters Phytozome 12 Families        

Yes, I get the same output


I get the same error as @Asaf so I assume the next command must have an error in it.

After upgrading to R 4.0.1 and biomaRt 2.45, I get a new error:

Error in .processResults(postRes, mart = mart, sep = sep, fullXmlQuery = fullXmlQuery,  : 
  The query to the BioMart webservice returned an invalid result.
The number of columns in the result table does not equal the number of attributes in the query.
Please report this on the support site at http://support.bioconductor.org
4.1 years ago
Mike Smith ★ 2.1k

The solution here is to include https:// in the host argument for Phytozome. Without it, the query is redirected to the main Phytozome page rather than the BioMart service, so R receives the home page HTML instead of a results table, and parsing the result fails.

library(biomaRt)
mart <- useMart(biomart = "phytozome_mart", 
                dataset = "phytozome", 
                host = "https://phytozome.jgi.doe.gov")

getBM(attributes = c("organism_name", "gene_name1"), 
      filters = "gene_name_filter", 
      values = "g400", 
      mart = mart)

#> [1] organism_name gene_name1   
#> <0 rows> (or 0-length row.names)

The value "g400" doesn't seem like a valid gene name and we get an empty result, but if we use one that does exist we get:

getBM(attributes = c("organism_name", "gene_name1"), 
      filters = "gene_name_filter", 
      values = "82092", 
      mart = mart)

#>     organism_name gene_name1
#> 1 Smoellendorffii      82092

I would not recommend running a query with nothing supplied to the filters and values arguments, as in your second getBM() call. BioMart is not designed as a bulk data retrieval service, and such a query will probably fail, possibly without you realising it because results are silently omitted. It is better to register with Phytozome and download the data directly.
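
If you want to protect a script against accidentally sending an unfiltered query, a small guard along these lines could be called before getBM(). This is only a sketch: `require_filtered_query()` is a hypothetical helper, not a biomaRt function, and it assumes that an empty string (the getBM() default for both arguments) means "no filter".

```r
# Hypothetical guard, not part of biomaRt: stop before running a query
# that has no filters or no values, i.e. one that asks for every record.
require_filtered_query <- function(filters, values) {
  no_filters <- length(filters) == 0 || all(!nzchar(filters))
  no_values  <- length(values) == 0
  if (no_filters || no_values) {
    stop("Refusing to run an unfiltered BioMart query; supply filters and values.")
  }
  invisible(TRUE)
}

# Passes silently for a filtered query
require_filtered_query(filters = "gene_name_filter", values = "82092")

# Would stop with an error for the unfiltered case:
# require_filtered_query(filters = "", values = "")
```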

I'll update the biomaRt vignette to highlight that https:// needs to be set for Phytozome.
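
Until then, one way to avoid forgetting the prefix is to normalise the host string before passing it to useMart(). The helper below is a minimal sketch; `ensure_https()` is a hypothetical name, not something biomaRt provides, and it leaves a host untouched if it already starts with http:// or https://.

```r
# Hypothetical helper, not part of biomaRt: prepend "https://" when the
# host string has no URL scheme; hosts that already have one are unchanged.
ensure_https <- function(host) {
  if (grepl("^https?://", host)) host else paste0("https://", host)
}

ensure_https("phytozome.jgi.doe.gov")
#> [1] "https://phytozome.jgi.doe.gov"

# Usage with biomaRt (needs network access, so left commented out):
# mart <- useMart(biomart = "phytozome_mart",
#                 dataset = "phytozome",
#                 host = ensure_https("phytozome.jgi.doe.gov"))
```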


Thank you for the solution, and for the advice!
