7.4 years ago
bvernot
Hello all,
I'm running a GO enrichment analysis with topGO, and I get a set of significant GO categories, each with some number of associated genes. When I then try to find the genes belonging to those categories using annFUN.org, many of the categories are missing from its output. Could this be related to pruning of the ontology? Any thoughts?
I've included simple code that reproduces this problem:
library(data.table)
library(org.Hs.eg.db)
library(topGO)
## create a fake set of 10 genes, of which one is significant
tmp.sig.genes <- data.table(
  ens_id = c('ENSG00000198752', 'ENSG00000145242', 'ENSG00000127526', 'ENSG00000111110', 'ENSG00000197043',
             'ENSG00000186642', 'ENSG00000151952', 'ENSG00000055163', 'ENSG00000154917', 'ENSG00000251664'),
  sig = c(TRUE, rep(FALSE, 9)))
allGenesCat <- factor(as.integer(tmp.sig.genes$sig))
names(allGenesCat) <- tmp.sig.genes$ens_id
# run topGO, get significant GO categories
suppressMessages(tgd <- new( "topGOdata", ontology='BP', allGenes = allGenesCat, nodeSize=5,
annot=annFUN.org, mapping="org.Hs.eg.db", ID = "ensembl" ))
resultTopGO.elim <- runTest(tgd, algorithm = "elim", statistic = "Fisher" )
tgd.table = data.table(GenTable( tgd, Fisher.elim = resultTopGO.elim))
## look at our "significant" results
head(tgd.table,1)
#         GO.ID                      Term Annotated Significant Expected Fisher.elim
# 1: GO:0019538 protein metabolic process         5           1      0.5         0.5
# but the first GO term doesn't come up when I query with annFUN.org
annFUN.org('BP', mapping="org.Hs.eg.db", ID = "ensembl", feasibleGenes = tmp.sig.genes$ens_id)[['GO:0019538']]
# NULL
# similarly, that term isn't present for the only significant gene
inverseList(annFUN.org('BP', mapping="org.Hs.eg.db", ID = "ensembl", feasibleGenes = tmp.sig.genes$ens_id))[['ENSG00000198752']]
# [1] "GO:0006468" "GO:0007010" "GO:0007163" "GO:0007165" "GO:0016477" "GO:0031032" "GO:0031532" "GO:0035556"
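One possible explanation, offered as an assumption rather than a confirmed diagnosis: annFUN.org returns only the terms a gene is annotated to directly, whereas a topGOdata object counts a gene toward every ancestor of those terms as well. If so, GO:0019538 could appear in the results table only because the gene is annotated to one or more of its descendants. The self-contained sketch below illustrates that difference on a toy ontology; the term IDs, gene names, and the helper functions `ancestors` and `propagate` are hypothetical and are not part of topGO.

```r
# Toy ontology fragment (hypothetical IDs): each term maps to its direct parents.
parents <- list(
  "GO:child"  = "GO:middle",
  "GO:middle" = "GO:root",
  "GO:root"   = character(0)
)

# Direct annotations only, analogous to what a direct gene-to-term mapping returns.
direct <- list(
  "GO:child"  = c("geneA", "geneB"),
  "GO:middle" = "geneC"
)

# Collect a term together with all of its ancestors by walking the parent links.
ancestors <- function(term) {
  seen <- character(0)
  todo <- term
  while (length(todo) > 0) {
    current <- todo[1]
    todo <- todo[-1]
    if (!(current %in% seen)) {
      seen <- c(seen, current)
      todo <- c(todo, parents[[current]])
    }
  }
  seen
}

# Propagate annotations upward: a gene annotated to a term also counts
# toward every ancestor of that term.
propagate <- function(direct) {
  out <- list()
  for (term in names(direct)) {
    for (anc in ancestors(term)) {
      out[[anc]] <- union(out[[anc]], direct[[term]])
    }
  }
  out
}

propagated <- propagate(direct)
direct[["GO:root"]]      # NULL: no gene is annotated directly to the root term
propagated[["GO:root"]]  # genes that reach the root term through its descendants
```

If this is what is happening in the reproduction above, then topGO's `genesInTerm(tgd, "GO:0019538")` accessor, which reads the gene sets stored in the topGOdata object, may be a better way to recover the genes behind a significant term than querying annFUN.org directly.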