Question: Linking Ensembl gene IDs to GO terms?
user850 (United States) wrote, 6.2 years ago:

Is there a table, downloadable by FTP or accessible programmatically, that links the Ensembl gene IDs for a given genome (like 'hg18' or 'mm9') to their GO terms, i.e. IDs of the form "GO:..."? Is there a UCSC table that does this? I did not see any such table in: http://hgdownload.cse.ucsc.edu/goldenPath/mm9/database/

seidel (United States) wrote, 6.2 years ago:

You can create a table fairly easily using R and the biomaRt package. The code below builds a table from Ensembl, which you can export or write to disk, and also converts the result to a list-like format, which is a convenient R data structure:

library(biomaRt)
# select mart and data set
bm <- useMart("ensembl")
bm <- useDataset("mmusculus_gene_ensembl", mart=bm)

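# Get ensembl gene ids, gene names and GO terms
# (the gene name attribute is 'external_gene_name' in current Ensembl releases;
#  older releases called it 'external_gene_id')
EG2GO <- getBM(mart=bm, attributes=c('ensembl_gene_id','external_gene_name','go_id'))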

# examine result
head(EG2GO,15)

# Remove blank entries
EG2GO <- EG2GO[EG2GO$go_id != '',]

# convert from table format to list format
geneID2GO <- by(EG2GO$go_id,
                EG2GO$ensembl_gene_id,
                function(x) as.character(x))

# examine result
head(geneID2GO)

# terms can be accessed using gene ids in various ways
> geneID2GO$ENSMUSG00000098488
[1] "GO:0009395" "GO:0008152" "GO:0005829" "GO:0030659" "GO:0004620"
[6] "GO:0004623" "GO:0046872" "GO:0005515"
> geneID2GO[['ENSMUSG00000098488']]
[1] "GO:0009395" "GO:0008152" "GO:0005829" "GO:0030659" "GO:0004620"
[6] "GO:0004623" "GO:0046872" "GO:0005515"
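
The "export or write to disk" step is not shown above, so here is a minimal sketch of it. It runs on a small made-up data frame that stands in for the getBM() result (the gene IDs, GO terms and output file name are illustrative assumptions, not real annotations), and it uses split() as an alternative to by() for getting a plain named list:

```r
# Toy stand-in for the data frame returned by getBM(); values are illustrative only
EG2GO <- data.frame(
  ensembl_gene_id = c("ENSMUSG00000000001", "ENSMUSG00000000001",
                      "ENSMUSG00000000002", "ENSMUSG00000000003"),
  go_id = c("GO:0005515", "GO:0046872", "GO:0005829", ""),
  stringsAsFactors = FALSE
)

# Drop rows without a GO annotation, as in the code above
EG2GO <- EG2GO[EG2GO$go_id != "", ]

# split() gives a plain named list: one character vector of GO ids per gene
geneID2GO <- split(EG2GO$go_id, EG2GO$ensembl_gene_id)

# Write the table (one row per gene/GO pair) as a tab-separated file;
# the file name is hypothetical
out_file <- file.path(tempdir(), "ensembl_gene_to_go.tsv")
write.table(EG2GO, file = out_file, sep = "\t", quote = FALSE, row.names = FALSE)
```

The resulting list can be indexed the same way as the by() result, e.g. geneID2GO[["ENSMUSG00000000001"]].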
Emily_Ensembl commented, 6.2 years ago:

Or, if you prefer, using pointy-clicky BioMart. See the help video here.
Neilfws commented, 6.2 years ago:

As I say every few months: the answer to almost every "how to convert ID X to ID Y" question is BioMart, or UCSC tables.

MaxF commented, 6 weeks ago:

On BioMart, when you return a table with the GO accession number, each gene is only associated with a single GO term. Shouldn't there be many GO terms for most genes? Which one does BioMart choose?
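
One possible explanation, assuming the export is in the usual long format: each row holds one gene/GO pair, so a gene with several GO terms appears on several rows and no single term is being chosen. The sketch below uses made-up gene IDs and GO terms to show how to count terms per gene and collapse the table to one row per gene:

```r
# Toy long-format table: one row per gene/GO pair (illustrative values only)
go_long <- data.frame(
  ensembl_gene_id = c("geneA", "geneA", "geneA", "geneB"),
  go_id = c("GO:0005515", "GO:0046872", "GO:0005829", "GO:0008152"),
  stringsAsFactors = FALSE
)

# How many GO terms does each gene have?
terms_per_gene <- table(go_long$ensembl_gene_id)

# Collapse to one row per gene, joining the GO ids with ';'
go_wide <- aggregate(go_id ~ ensembl_gene_id, data = go_long,
                     FUN = function(x) paste(unique(x), collapse = ";"))
print(go_wide)
```

If table() on your own export reports counts above 1 for most genes, the terms are all there, just spread over multiple rows.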
Chris Fields (University of Illinois Urbana-Champaign) wrote, 6.2 years ago:

The UCSC Table Browser should have this, though it may require a little digging to get all the relevant info together (it's not exactly user-friendly unless you understand SQL). I typically go with BioMart myself, which may have the UCSC IDs as well.
