I am trying to run a GO enrichment analysis with topGO on a rice dataset, following Reference 1 and Reference 2. According to the manual and these references, it is a good idea to provide p-values so that the genes can be ranked.
However, I am working with a list of DEGs that were identified in common across different conditions (not from one specific comparison). In that case, how can I rank the genes by p-value or FDR? I am not sure how to input the DEGs and build a ranking vector.
Also, the Fisher test and the Kolmogorov-Smirnov (KS) test produce different GO enrichment results. Which one should I use?
I am using the following code:
    library(topGO)  # provides the topGOdata class, annFUN.gene2GO, runTest and GenTable

    # Fetch gene-to-GO annotations for rice from Ensembl Plants
    mart <- biomaRt::useMart(biomart = "plants_mart",
                             dataset = "osativa_eg_gene",
                             host = 'plants.ensembl.org')
    get_go <- biomaRt::getBM(attributes = c("ensembl_gene_id", "go_id"), mart = mart)
    get_go <- get_go[get_go$go_id != '',]  # drop genes without a GO term

    # Build the gene -> GO term mapping and the gene universe
    geneID2GO <- by(get_go$go_id, get_go$ensembl_gene_id, function(x) as.character(x))
    all.genes <- sort(unique(as.character(get_go$ensembl_gene_id)))

    #Input list ?????
    # int.genes is not defined yet; this is the part I am unsure about

    go.obj <- new("topGOdata", ontology = 'BP',
                  allGenes = int.genes,
                  annot = annFUN.gene2GO,
                  gene2GO = geneID2GO,
                  nodeSize = 10)

    #Fisher test
    results <- runTest(go.obj, algorithm = "elim", statistic = "fisher")
    results.tab <- GenTable(object = go.obj, elimFisher = results)

    #Kolmogorov-Smirnov (K-S) test
    results.ks <- runTest(go.obj, algorithm = "classic", statistic = "ks")
    goEnrichment <- GenTable(go.obj, KS = results.ks, orderBy = "KS", topNodes = 20)
    # GenTable returns p-values as character, so convert before filtering
    goEnrichment$KS <- as.numeric(goEnrichment$KS)
    goEnrichment <- goEnrichment[goEnrichment$KS < 0.05,]
    goEnrichment <- goEnrichment[,c("GO.ID","Term","KS")]

    # Remove the trailing "..." from truncated term names and build plot labels
    goEnrichment$Term <- gsub(" [a-z]*\\.\\.\\.$", "", goEnrichment$Term)
    goEnrichment$Term <- gsub("\\.\\.\\.$", "", goEnrichment$Term)
    goEnrichment$Term <- paste(goEnrichment$GO.ID, goEnrichment$Term, sep = ", ")
    goEnrichment$Term <- factor(goEnrichment$Term, levels = rev(goEnrichment$Term))
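To make the missing input step concrete, here is a minimal sketch of the shape I understand topGO accepts for `allGenes` when there are no scores: a named factor with levels 0/1 over the gene universe, where 1 marks a gene of interest. The gene IDs and the `deg.ids` vector below are hypothetical placeholders, not my real data, and the small `all.genes` stands in for the universe built from biomaRt above.

```r
# Hypothetical placeholders: swap in the real annotated universe and DEG list.
all.genes <- c("Os01g0100100", "Os01g0100200", "Os01g0100300", "Os01g0100400")
deg.ids   <- c("Os01g0100200", "Os01g0100400")

# Named factor with levels 0/1: 1 = gene is in the DEG list, 0 = background.
int.genes <- factor(as.integer(all.genes %in% deg.ids), levels = c(0, 1))
names(int.genes) <- all.genes

# Quick sanity check of how many genes fall in each group.
print(table(int.genes))
```

As far as I can tell, a 0/1 factor like this only suits the Fisher statistic, because it carries no ranking information. The KS statistic instead needs `allGenes` to be a named numeric vector of scores (for example p-values) together with a `geneSel` function, which is exactly what I do not have for a list of shared DEGs.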