Question: Retrieve all genes under a mammalian phenotype ontology term
eric.kern13 (United States) wrote, 3.2 years ago:

I want to retrieve all genes corresponding to a given mammalian phenotype ontology term (for example, MP:0005375), preferably within R. Are there tools to do this? Can I do it within BioMart? Or is the best bet to build something around the APIs here or here?

Related: similar question for GO terms

modified 3.2 years ago by Mike Smith • written 3.2 years ago by eric.kern13

You may want to add biomart to the tags.

written 3.2 years ago by Santosh Anand

I tried to; it didn't work. I'll try again.

written 3.2 years ago by eric.kern13
Mike Smith (EMBL Heidelberg / de.NBI) wrote, 3.2 years ago:

I'm not sure you can do this using Ensembl's BioMart. There you can filter using specific phenotype ontology terms, but only leaf terms, not something as high-level as MP:0005375, which is Adipose Tissue Phenotype and has many sub-terms. I don't think you can query using the phenotype ID itself. I don't know whether any other data store that holds this data provides a BioMart interface, but I can't see one for the two you linked to.

One suggestion is to use the httr package and query MouseMine directly. Here's a fairly crude example, where we query for your phenotype ID, and return the primary ID, gene symbol, and the NCBI Entrez Gene ID.

Load the libraries we'll need, and then create a search query XML string:

library(httr)      # POST(), content()
library(jsonlite)  # fromJSON()
library(tibble)    # as_tibble()

phenotypeID <- "MP:0005375"

query <- paste0('<query model="genomic" view="Gene.primaryIdentifier Gene.symbol Gene.ncbiGeneNumber" >
                  <constraint path="Gene.ontologyAnnotations.ontologyTerm.identifier" op="=" code="A" value="',
                  phenotypeID, '" />
                </query>')
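
To reuse the query for other phenotype terms, the string construction above could be wrapped in a small function. This is only a sketch: `build_mp_query()` is a hypothetical helper name, not part of httr or MouseMine, and it assumes the same view columns and constraint path as the example. The ID check is a simple sanity check that assumes MP identifiers look like `MP:` followed by seven digits.

```r
# Hypothetical helper: builds the query XML for one MP term.
build_mp_query <- function(phenotype_id,
                           view = c("Gene.primaryIdentifier",
                                    "Gene.symbol",
                                    "Gene.ncbiGeneNumber")) {
  stopifnot(is.character(phenotype_id), length(phenotype_id) == 1)
  # Assumption: MP identifiers are "MP:" followed by seven digits.
  if (!grepl("^MP:[0-9]{7}$", phenotype_id)) {
    stop("Not a valid-looking MP identifier: ", phenotype_id)
  }
  # Fill the view columns and the term ID into the XML template.
  sprintf(
    paste0('<query model="genomic" view="%s">',
           '<constraint path="Gene.ontologyAnnotations.ontologyTerm.identifier" ',
           'op="=" code="A" value="%s" />',
           '</query>'),
    paste(view, collapse = " "),
    phenotype_id
  )
}

query <- build_mp_query("MP:0005375")
cat(query, "\n")
```

Rejecting malformed IDs before sending anything makes a typo fail early instead of silently returning an empty result.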

Then we can submit the query:

# First argument: URL of the MouseMine query service
postRes <- POST('',
                body = list(query = query, format = 'json'))

Now process the result to give us a tibble with one row per gene:

# Parse the JSON body: `results` holds the rows, `columnHeaders` the column names
jsonToTxt <- fromJSON(content(postRes, as = "text"))
genes <- as_tibble(jsonToTxt$results)
colnames(genes) <- jsonToTxt$columnHeaders

Here's the output:

> genes
# A tibble: 69 × 3
   `Gene > Primary Identifier` `Gene > Symbol` `Gene > NCBI Gene Number`
                         <chr>           <chr>                     <chr>
1                   MGI:101884           Ppard                     19015
2                   MGI:101900           Mmp14                     17387
3                   MGI:102797           Acsl1                     14081
4                   MGI:102858           Fosl2                     14284
5                   MGI:103014            Il15                     16168
6                   MGI:104993            Lepr                     16847
7                   MGI:105304           Il6ra                     16194
8                   MGI:105374           Npy4r                     19065
9                   MGI:106387         Arfgef3                    215821
10                  MGI:107571            Cav2                     12390
# ... with 59 more rows
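
The column headers returned by the query contain spaces and `>` characters, and the NCBI IDs come back as character strings, which can be awkward downstream. Here is a minimal sketch of tidying the table in base R. The small data frame stands in for `genes` (its rows are copied from the output above), and the new column names are just one possible choice.

```r
# Stand-in for the `genes` table, using the first rows of the output above.
genes <- data.frame(
  c("MGI:101884", "MGI:101900", "MGI:102797"),
  c("Ppard", "Mmp14", "Acsl1"),
  c("19015", "17387", "14081"),
  stringsAsFactors = FALSE
)
colnames(genes) <- c("Gene > Primary Identifier",
                     "Gene > Symbol",
                     "Gene > NCBI Gene Number")

# Replace the display headers with short syntactic names (illustrative choice).
colnames(genes) <- c("mgi_id", "symbol", "entrez_id")

# Store the Entrez IDs as integers rather than strings.
genes$entrez_id <- as.integer(genes$entrez_id)

# Drop exact duplicate rows, in case a gene is returned more than once.
genes <- unique(genes)

str(genes)
```

With syntactic names, columns can be accessed as `genes$symbol` without backticks.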
written 3.2 years ago by Mike Smith

Works like a charm. Thank you very much!

written 3.2 years ago by eric.kern13