This problem concerns the R/Bioconductor packages topGO and GOstats. I want to perform a hypergeometric test for over-representation against GO and KEGG. I have two text files available locally: Back.txt (the background, containing only Entrez IDs, more than 2000 of them, from an Illumina HT12 v3 microarray) and **genes.txt** (the differentially expressed genes from the Illumina HT12 v3 microarray). The result should be an R data.frame with the following fields: GO_term_id, Go_term_name, p_value and the number of associated genes from my gene list (the genes.txt file). I would also like to visualize the result as a GO graph or KEGG pathway.
I have run hyperGTest and obtained a result, but I am not sure whether it is correct for my purpose, i.e. over-representation against GO and KEGG. How can I check my result against other databases for consistency? I tried DAVID but couldn't figure it out.
I have the following code:

    library(topGO)
    library(GOstats)

    # Background file: Entrez IDs only, no header row
    universe <- as.character(read.table("Back.txt", sep = ",")$V1)

    # Selected genes; header: Probes_id, entrez_gene_id, symbols, P.Value, F.C
    tbl <- read.table("genes.txt", sep = ",", header = TRUE)
    selected <- as.character(tbl[[2]])  # second column holds entrez_gene_id

    param <- new("GOHyperGParams",
                 geneIds = selected,
                 universeGeneIds = universe,
                 annotation = "org.Hs.eg.db",
                 ontology = "BP",
                 pvalueCutoff = 0.1,
                 conditional = FALSE,
                 testDirection = "over")
    hyp <- hyperGTest(param)
    summary(hyp)

As a trial I called summary(hyp), which returns the columns GOBPID, Pvalue, OddsRatio, ExpCount, Count, Size and Term.
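A quick sanity check for any one row, sketched here with base R only: for a non-conditional over-representation test, the reported Pvalue should equal the upper tail of the hypergeometric distribution computed from Count, Size, and the numbers of mapped selected and universe genes. All four numbers below are made-up placeholders, not values from the real data. With the real object they would presumably come from the summary(hyp) row together with geneMappedCount(hyp) and universeMappedCount(hyp).

```r
# Placeholder numbers for illustration only (NOT from the real data)
universe_size <- 2000  # universe genes with at least one BP annotation
selected_size <- 100   # selected genes that fall inside that universe
term_size     <- 50    # "Size": universe genes annotated to the GO term
term_count    <- 8     # "Count": selected genes annotated to the GO term

# Over-representation p-value: P(X >= term_count) when drawing
# selected_size genes from a universe holding term_size "hits"
p_over <- phyper(term_count - 1,
                 term_size,
                 universe_size - term_size,
                 selected_size,
                 lower.tail = FALSE)

# "ExpCount": expected number of selected genes in the term by chance
exp_count <- selected_size * term_size / universe_size

print(p_over)
print(exp_count)
```

If the hand-computed value matches the Pvalue column for a few terms, the test direction and universe are being used as intended.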
But I want the result as a data.frame with the following fields: GO_term_id, Go_term_name, p_value and the number of associated genes from my gene list (the genes.txt file).
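Since the summary already carries the ID, term name, p-value and Count, the target layout is probably just a matter of selecting and renaming columns. A minimal sketch follows, using a toy data.frame that mimics the column names of the summary. The p-values, odds ratios and counts in it are invented placeholders, and the column name n_genes is a hypothetical choice for the fourth field.

```r
# Toy stand-in for summary(hyp); all numeric values are invented placeholders
res <- data.frame(
  GOBPID    = c("GO:0006955", "GO:0006954"),
  Pvalue    = c(0.001, 0.02),
  OddsRatio = c(3.1, 2.2),
  ExpCount  = c(2.5, 4.0),
  Count     = c(8L, 9L),
  Size      = c(50L, 80L),
  Term      = c("immune response", "inflammatory response"),
  stringsAsFactors = FALSE
)
# With the real result this would instead be: res <- summary(hyp)

# Keep only the four wanted fields, under the requested names
out <- data.frame(
  GO_term_id   = res$GOBPID,
  Go_term_name = res$Term,
  p_value      = res$Pvalue,
  n_genes      = res$Count,  # selected genes annotated to the term
  stringsAsFactors = FALSE
)

print(out)
```

Count here is the number of genes from the selected list annotated to each term, which is the fourth field being asked for.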
I have provided a link to a sample of genes.txt for convenience; Back.txt contains only Entrez IDs.
*Note: This is a follow-up question.