How to receive gene ontology p-value and number of genes from input in biomaRt
7.2 years ago
nashtf ▴ 20

I've been using ToppFun for gene ontology analysis. It works well and is fast, but it's a black box as to how it gets its results. I'm looking for an open-source R solution and found biomaRt. My main qualm is that it isn't intuitive how to find the information I need. I have a list of genes to use as a query, and as output I would like the gene ontologies that contain these genes, the number of genes from the input list that are in each ontology, and the p-value. Below is what the ToppGene output looks like and the information it gives, which is great. Having the p-value is key, because I can filter on significance. I also want to re-create the sparse matrix that shows which input genes are present in each ontology.

ToppGene Output
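For context on the p-value requested above: biomaRt only retrieves annotations, so any p-value has to come from an enrichment test run on top of them. A common choice is the one-sided hypergeometric test (equivalent to a one-sided Fisher's exact test); whether ToppGene uses exactly this is not stated here. The sketch below is a minimal illustration in base R, and every count in it is made up.

```r
# All counts below are hypothetical and only illustrate the calculation.
universe_size <- 20000  # annotated genes in the background
term_size     <- 150    # background genes annotated to the GO term
input_size    <- 200    # genes in the input list
overlap       <- 8      # input genes annotated to the GO term

# P(X >= overlap) under the hypergeometric distribution.
# phyper() gives P(X > q) with lower.tail = FALSE, hence overlap - 1.
p_hyper <- phyper(overlap - 1, term_size, universe_size - term_size,
                  input_size, lower.tail = FALSE)

# The same test written as a one-sided Fisher's exact test on a 2x2 table
# (rows: in input / not in input; columns: in term / not in term).
contingency <- matrix(c(overlap,
                        term_size - overlap,
                        input_size - overlap,
                        universe_size - term_size - input_size + overlap),
                      nrow = 2)
p_fisher <- fisher.test(contingency, alternative = "greater")$p.value
```

When many ontologies are tested at once, the raw p-values would normally be corrected for multiple testing, for example with `p.adjust(p_values, method = "BH")`.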

Currently I get a huge list in which each entry matches one gene to one ontology, so each gene has multiple entries, one for each ontology it belongs to. Is there a way to collapse this output, or to write a better query? I would like to create a matrix with each gene as a column and each ontology as a row, with 0/1 values indicating whether the gene is a member of the ontology, so I can do counts and cluster comparisons.

The query can also take a while. I'm sure it's possible, but is it easy to download a mart or the Ensembl database to use locally?

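library(biomaRt)
mart <- useMart(biomart="ENSEMBL_MART_ENSEMBL", dataset="mmusculus_gene_ensembl")
result <- getBM(attributes=c("illumina_mousewg_6_v2", "go_id", "name_1006"),
                filters="illumina_mousewg_6_v2",  # assumed: filter on the same probe ID type as the values
                values=c("ILMN_2651144", "ILMN_1251419", "ILMN_1214841", "ILMN_1214071",
                         "ILMN_2930552", "ILMN_1377919", "ILMN_2618176", "ILMN_2526739"),  # probe list is cut off here in the original post
                mart=mart)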

Ensembl now has a virtual machine available for download; otherwise, the databases are available for download on the FTP site.
You could post-process the output to get the format you like.
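One possible way to do that post-processing in base R is sketched below. It uses a small made-up data frame in place of a real `getBM()` result (the `gene` column and all IDs are hypothetical), and it assumes genes without a GO annotation show up with an empty `go_id`.

```r
# Toy stand-in for a getBM() result: one row per (gene, GO term) pair.
# Column names and IDs are hypothetical.
result <- data.frame(
  gene  = c("geneA", "geneA", "geneB", "geneC", "geneC", "geneC", "geneD"),
  go_id = c("GO:0000001", "GO:0000002", "GO:0000001",
            "GO:0000001", "GO:0000002", "GO:0000003", ""),
  stringsAsFactors = FALSE
)

# Drop unannotated rows (assumed to have an empty go_id) and exact duplicates.
result <- unique(result[result$go_id != "", ])

# Cross-tabulate: ontologies as rows, genes as columns, then force 0/1.
membership <- unclass(table(result$go_id, result$gene))
membership[membership > 0] <- 1

# Number of input genes annotated to each ontology.
genes_per_term <- rowSums(membership)
```

`membership` can then be passed to clustering functions such as `dist()` and `hclust()`, and `genes_per_term` gives the per-ontology counts asked for in the question.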


The VM is for the Perl API and contains a working instance of that API that still accesses the main database. It does not contain a local instance of the database, nor a configured instance of biomaRt. The tables that form the gene Mart database can be found here, so you could install these locally and access them.

