Question: Retrieving Phytozome data using the R/Bioconductor package biomaRt
3.0 years ago, Fabio Marroni wrote:

Hi, I am trying to use biomaRt to retrieve data for P. trichocarpa from Phytozome. This is non-standard, because the usual route is via Ensembl. However, the Ensembl database has V2.2 of the poplar annotation, and I would like to work with V3.0 (which is available, e.g., on Phytozome).

I issue the following commands (adapted from what I used to do for retrieving data from Ensembl) and obtain the following output:

library(biomaRt)
# host and mydataset are assumed to be defined earlier (not shown here)
# Select the dataset to use for the analysis
myusemart <- useDataset(as.character(mydataset), mart = useMart("phytozome_mart", host = host))
# List the attributes available for that dataset
allattributes <- listAttributes(myusemart)
# Query a single attribute
resultTable <- getBM(attributes = "organism_name", mart = myusemart)
> allattributes[1:10,]
                name       description
1      organism_name     Organism Name
2        organism_id       Proteome ID
3         gene_name1         Gene Name
4   gene_description       Description
5          chr_name1   Chromosome Name
6  gene_chrom_strand            Strand
7   gene_chrom_start   Gene Start (bp)
8     gene_chrom_end     Gene End (bp)
9      transcript_id PAC Transcript ID
10  transcript_name1   Transcript Name

> resultTable
2 <html><head>
3 302 Found
4 </head><body>
The document has moved here.
7 </body></html>

So the list of attributes is retrieved correctly, but when I run a query with getBM, the main querying function of biomaRt, I get back an HTML "302 Found" page instead of data, so something is clearly wrong. Has anyone succeeded in querying Phytozome using biomaRt? Any help will be greatly appreciated!
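
A result like the one above can also be caught programmatically before it is used further. The following is only a minimal sketch: `looks_like_html` is a hypothetical helper (it is not part of biomaRt), and the two data frames are mock tables standing in for the outcomes seen in this thread rather than real query results.

```r
# Hypothetical helper (not part of biomaRt): returns TRUE when a result
# table appears to contain an HTML page instead of tabular data.
looks_like_html <- function(result) {
  # Flatten every cell to character so the check works for any column layout.
  cells <- unlist(lapply(result, as.character), use.names = FALSE)
  any(grepl("<html|</body>|<!DOCTYPE", cells, ignore.case = TRUE))
}

# Mock tables (assumptions for illustration only).
bad  <- data.frame(organism_name = c("<html><head>", "302 Found", "</body></html>"))
good <- data.frame(organism_name = c("Smoellendorffii", "Cpapaya", "Rcommunis"))

looks_like_html(bad)   # TRUE
looks_like_html(good)  # FALSE
```

Calling such a check right after getBM would turn a silent HTML response into an explicit error, for example via `stopifnot(!looks_like_html(resultTable))`.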

3.0 years ago, Mike Smith (EMBL Heidelberg / de.NBI) wrote:

The short answer is that, for now, I think you have to bypass some of the biomaRt functions and create a Mart object yourself. Give this a try:


phytozomeMart <- new("Mart", 
                 biomart = "phytozome_mart",
                 vschema = "zome_mart", 
                 host = "")

The rest of your code should work using this object now, e.g.:

mysets <- listDatasets(phytozomeMart)
mydataset <- mysets$dataset[mysets$dataset == "phytozome"]
myusemart <- useDataset(as.character(mydataset), mart = phytozomeMart)

allattributes <- listAttributes(mart = myusemart)
resultTable <- getBM(attributes = "organism_name", mart = myusemart)

Checking the content:

> resultTable[1:5, ,drop = FALSE]
1 Smoellendorffii
2         Cpapaya
3       Rcommunis
4        Csativus
5       Vvinifera

As for why this is necessary: it looks like this instance of BioMart automatically redirects you to an HTTPS server when you run the query, so you need to access port 443. Although you can provide a port argument to useMart, it is currently overridden if a port is specified in the registry on the server. For the mart, that registry is located at, and references port 80 throughout. Since biomaRt expects that document to be up to date, it uses the values there and fails. This is obviously annoying, so I'll have a think about how best to deal with this situation.
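
The override described here can be pictured with a toy model. This is not biomaRt's actual code: `resolve_port` is a hypothetical function written only to illustrate the precedence rule, in which a port listed in the server registry replaces whatever the user passed.

```r
# Toy model only, NOT biomaRt's implementation. It mimics the behaviour
# described above: a port found in the server registry wins over the
# port supplied by the user.
resolve_port <- function(user_port, registry_port = NULL) {
  if (!is.null(registry_port)) {
    return(registry_port)  # registry value takes precedence
  }
  user_port                # no registry entry, so the user's value is used
}

resolve_port(user_port = 443, registry_port = 80)  # 80: the user's 443 is ignored
resolve_port(user_port = 443)                      # 443: nothing overrides it
```

Under this model, passing port 443 has no effect as long as the registry lists port 80, which matches the failure seen in the question.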


Thank you very much. I am very happy with your patch!

Reply written 3.0 years ago by Fabio Marroni