4galaxy77, 2.8 years ago
I have annotated some genes using biomaRt, requesting the go_id attribute for each gene. This returns multiple go_ids per gene, e.g.:
   ensembl_gene_id      go_id
 1: ENSG00000261657 GO:0006810
 2: ENSG00000261657 GO:0005739
 3: ENSG00000261657 GO:0005634
 4: ENSG00000261657 GO:0016021
 5: ENSG00000261657
 6: ENSG00000144741 GO:0006810
 7: ENSG00000144741 GO:0005739
 8: ENSG00000144741 GO:0005634
 9: ENSG00000144741 GO:0016021
10: ENSG00000144741 GO:1901962
11: ENSG00000144741 GO:0015805
12: ENSG00000144741 GO:0000095
13: ENSG00000144741 GO:0005743
14: ENSG00000144741
If I take one of these go_ids, e.g. GO:0005743, and look it up in the GO term hierarchy, I can see where it sits in the ontology.
What I am most interested in is getting the top-level term (e.g. cellular_component) for each go_id.
How can I do this in R?
There's nothing like "go_domain"; all I can find is:
[1] "go_id" "go_linkage_type" "goslim_goa_accession" "goslim_goa_description"

Try something like
searchAttributes(mart, pattern='domain')
(I don't have biomaRt at hand just now)
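
If searching the attributes doesn't turn anything up, here is a rough sketch of two ways this might be done. The first assumes your Ensembl mart exposes an attribute called namespace_1003 (described as the GO domain in the marts I have seen; please confirm with listAttributes(mart) before relying on it). The second skips biomaRt and looks up each go_id in the Bioconductor GO.db package, whose ONTOLOGY column holds BP, MF or CC. The function names annotate_with_domain and go_domain are made up for illustration and are not part of any package.

```r
# Full names of the three top-level GO terms, keyed by GO.db's ONTOLOGY codes.
domain_names <- c(BP = "biological_process",
                  MF = "molecular_function",
                  CC = "cellular_component")

# Option 1 (hypothetical helper): ask biomaRt for the GO domain directly.
# Assumes `mart` is an Ensembl gene mart and that it has the attribute
# "namespace_1003"; check listAttributes(mart) first.
annotate_with_domain <- function(mart, genes) {
  biomaRt::getBM(attributes = c("ensembl_gene_id", "go_id", "namespace_1003"),
                 filters    = "ensembl_gene_id",
                 values     = genes,
                 mart       = mart)
}

# Option 2 (hypothetical helper): look up the ontology of each GO ID in GO.db.
go_domain <- function(go_ids) {
  # Drop the blank go_id rows that biomaRt returns, and any duplicates.
  go_ids <- unique(go_ids[!is.na(go_ids) & go_ids != ""])
  res <- AnnotationDbi::select(GO.db::GO.db,
                               keys    = go_ids,
                               columns = "ONTOLOGY",
                               keytype = "GOID")
  # Translate BP/MF/CC into the full top-level term name.
  res$domain <- unname(domain_names[res$ONTOLOGY])
  res
}

# Only run the lookup if GO.db is installed.
if (requireNamespace("GO.db", quietly = TRUE)) {
  print(go_domain(c("GO:0005743", "GO:0006810", "")))
}
```

The result of go_domain() can then be merged back onto the biomaRt table by go_id, e.g. with merge(annot, res, by.x = "go_id", by.y = "GOID", all.x = TRUE), where annot stands for your getBM() output.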