Linking Ensembl gene IDs to GO terms?
7.2 years ago
user ▴ 870

Is there a table, downloadable from FTP or accessible programmatically, that links the Ensembl gene IDs of a given genome assembly (like 'hg18' or 'mm9') to their GO terms, i.e. IDs of the form "GO:..."? Is there a UCSC table that does this? I did not see any such table in: http://hgdownload.cse.ucsc.edu/goldenPath/mm9/database/

ucsc gene-ontology go gene-ids ensembl • 9.8k views
7.2 years ago
seidel 8.2k

You can create a table fairly easily using R and the biomaRt package. The code below builds a table from Ensembl, which you can export or write to disk, and also converts the result to a list-like format, which is a convenient R data structure for looking up terms by gene ID:

library(biomaRt)
# select mart and data set
bm <- useMart("ensembl")
bm <- useDataset("mmusculus_gene_ensembl", mart=bm)

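# Get Ensembl gene ids, gene names and GO terms
# (the gene-name attribute is 'external_gene_name' in current Ensembl releases;
#  older releases called it 'external_gene_id')
EG2GO <- getBM(mart=bm, attributes=c('ensembl_gene_id','external_gene_name','go_id'))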

# examine result
head(EG2GO,15)

# Remove blank entries
EG2GO <- EG2GO[EG2GO$go_id != '',]

# convert from table format to list format
geneID2GO <- by(EG2GO$go_id,
                EG2GO$ensembl_gene_id,
                function(x) as.character(x))

# examine result
head(geneID2GO)

# terms can be accessed using gene ids in various ways
> geneID2GO$ENSMUSG00000098488
[1] "GO:0009395" "GO:0008152" "GO:0005829" "GO:0030659" "GO:0004620"
[6] "GO:0004623" "GO:0046872" "GO:0005515"
> geneID2GO[['ENSMUSG00000098488']]
[1] "GO:0009395" "GO:0008152" "GO:0005829" "GO:0030659" "GO:0004620"
[6] "GO:0004623" "GO:0046872" "GO:0005515"
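
If you want the mapping on disk, or as a plain named list rather than a `by` object, something like the following should work. This is only a sketch: `toy` is a small hand-made stand-in for `EG2GO` so that it runs without a network connection, `ENSMUSG_PLACEHOLDER` is a made-up ID, and the output file name is an arbitrary choice.

```r
# Toy stand-in for the EG2GO table returned by getBM() (illustrative rows only;
# "ENSMUSG_PLACEHOLDER" is a made-up id, not a real gene)
toy <- data.frame(
  ensembl_gene_id = c("ENSMUSG00000098488", "ENSMUSG00000098488", "ENSMUSG_PLACEHOLDER"),
  go_id           = c("GO:0009395", "GO:0008152", "GO:0005515"),
  stringsAsFactors = FALSE
)

# Write the table to a tab-delimited file (the path is an arbitrary example)
out_file <- file.path(tempdir(), "ensembl2go.tsv")
write.table(toy, out_file, sep = "\t", quote = FALSE, row.names = FALSE)

# split() returns a plain named list: one character vector of GO ids per gene
gene2go <- split(toy$go_id, toy$ensembl_gene_id)
gene2go[["ENSMUSG00000098488"]]
```

With the real data you would pass `EG2GO` instead of `toy`; the rest stays the same.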

Or, if you prefer, using pointy-clicky BioMart. See the help video here.


As I say every few months: the answer to almost every "how to convert ID X to ID Y" question is BioMart, or UCSC tables.


On BioMart, when you return a table with the GO accession number, each gene is only associated with a single GO term. Shouldn't there be many GO terms for most genes? Which one does BioMart choose?
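
A likely explanation, offered as an assumption rather than a statement about BioMart's internals: the result is in long format, with one row per gene/GO-term pair, so a gene with several terms shows up on several rows instead of one. The sketch below uses a hand-made table (`toy`, with hypothetical gene IDs `GENE_A` and `GENE_B`) to count terms per gene and to collapse them into one row per gene:

```r
# Long-format toy table: one row per gene/GO-term pair (hypothetical gene ids)
toy <- data.frame(
  ensembl_gene_id = c("GENE_A", "GENE_A", "GENE_A", "GENE_B"),
  go_id           = c("GO:0009395", "GO:0008152", "GO:0005829", "GO:0005515"),
  stringsAsFactors = FALSE
)

# Number of GO terms attached to each gene
terms_per_gene <- table(toy$ensembl_gene_id)

# Collapse to one row per gene, with the GO ids joined by ";"
collapsed <- aggregate(go_id ~ ensembl_gene_id, data = toy,
                       FUN = paste, collapse = ";")
collapsed
```

If every gene in your own table really does have a count of 1 in `terms_per_gene`, the single-term behaviour is coming from somewhere else, for example a filter or a de-duplication step applied after export.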

7.2 years ago
Chris Fields ★ 2.2k

The UCSC Table Browser should have this, though it may take a little digging to pull all the relevant information together (it's not exactly user-friendly unless you understand SQL). I typically go with BioMart myself, which may have the UCSC IDs as well.
