I've read in an older thread that to retrieve all of the gene names associated with a GO ID you use the biomaRt package, e.g.:
    library(biomaRt)
    ensembl = useMart("ensembl", dataset = "hsapiens_gene_ensembl")
    gene.data <- getBM(attributes = c('hgnc_symbol', 'ensembl_transcript_id', 'go_id'),
                       filters = 'go_id', values = 'GO:0072599', mart = ensembl)
However, I'm not sure this is actually a correct answer since it returns 1 gene annotation compared to the 109 reported on www.ebi.ac.uk. Is there a more nuanced interpretation of what this one gene is? Is it only genes directly related to the term and no child terms? If so, is it appropriate to retrieve all child terms for the purpose of functional enrichment analysis, or to just use the 1 gene directly related to the term?
It looks like it's just the genes that are directly annotated to the term, not those annotated to its child terms. To get the genes for the child terms as well, I had to use:
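For anyone landing here later, one way to pull in the child terms as well is sketched below. This is not necessarily the code used in this reply; it assumes the Bioconductor annotation package org.Hs.eg.db (with AnnotationDbi) is installed, and relies on its `GOALL` key type, which covers a term together with all of its descendants.

```r
# Sketch only: assumes org.Hs.eg.db and AnnotationDbi are installed from Bioconductor.
library(org.Hs.eg.db)

# keytype "GOALL" matches genes annotated to the term itself OR to any of its
# descendant terms; keytype "GO" would match direct annotations only.
res <- AnnotationDbi::select(org.Hs.eg.db,
                             keys    = "GO:0072599",
                             keytype = "GOALL",
                             columns = "ENTREZID")

# One gene can appear several times (different evidence codes), so deduplicate.
entrez_ids <- unique(res$ENTREZID)
length(entrez_ids)
```

The result is a vector of Entrez Gene IDs stored as character strings. The count does not have to match the number shown on the EBI site exactly, since the two sources can be based on different annotation releases.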
When I run this code
I get back a list of just numbers, not Ensembl IDs. Any idea what these are, or how to convert them to Ensembl IDs?
I'm 5 years late, but these are probably Entrez Gene IDs, which you can then convert to gene symbols or Ensembl IDs.
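If they are indeed Entrez Gene IDs, a conversion could look roughly like this. It is a minimal sketch assuming org.Hs.eg.db is installed; the two IDs are illustrative placeholders (TP53 and BRCA1), not output from the query above.

```r
# Sketch only: assumes org.Hs.eg.db and AnnotationDbi are installed from Bioconductor.
library(org.Hs.eg.db)

# Illustrative Entrez Gene IDs (TP53 and BRCA1); substitute your own vector.
# The keys must be character strings, not numerics.
entrez_ids <- c("7157", "672")

# mapIds() returns a named vector with one value per key; multiVals = "first"
# keeps only the first match when an Entrez ID maps to several targets.
symbols <- AnnotationDbi::mapIds(org.Hs.eg.db, keys = entrez_ids,
                                 keytype = "ENTREZID", column = "SYMBOL",
                                 multiVals = "first")
ensembl <- AnnotationDbi::mapIds(org.Hs.eg.db, keys = entrez_ids,
                                 keytype = "ENTREZID", column = "ENSEMBL",
                                 multiVals = "first")

# Collect everything in one table for inspection.
id_table <- data.frame(entrez  = entrez_ids,
                       symbol  = unname(symbols),
                       ensembl = unname(ensembl))
id_table
```

Note that `multiVals = "first"` silently drops additional matches; use `multiVals = "list"` if you need to see every Ensembl ID that an Entrez ID maps to.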