I have been analysing RNA-seq data and want to do gene set enrichment analysis with the clusterProfiler package. I used DESeq2 to identify differentially expressed genes, setting lfcThreshold = 1 when calling results(). I then created a named vector of log2 fold changes with Entrez IDs as names. I thought that the gseGO function from clusterProfiler does the same thing as the GSEA function from clusterProfiler. Am I wrong? I ran gseGO on my sorted log2 fold change vector, and then ran GSEA on the same vector with TERM2GENE set to a gene set collection downloaded from the Broad Institute's website (c5: GO).
Basically, this is what I did:
gseaGO1 <- gseGO(geneList = foldchanges, OrgDb = org.Hs.eg.db, ont = 'All', nPerm = 1000, minGSSize = 10, pvalueCutoff = 0.05, verbose = FALSE)
c5 <- read.gmt("c5.all.v7.0.entrez.gmt")
gseaGO2 <- GSEA(foldchanges, TERM2GENE = c5, minGSSize = 10, nPerm = 1000, pvalueCutoff = 0.05, verbose = FALSE)
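For reference, here is a minimal sketch of how a vector like foldchanges can be built. It uses a toy data frame rather than my real DESeq2 results, and the column name entrez and all values are hypothetical; the point is only that the vector is numeric, named by Entrez ID, and sorted in decreasing order.

```r
# Toy stand-in for a DESeq2 results table.
# The 'entrez' column name and all values are hypothetical, for illustration only.
res_df <- data.frame(
  entrez = c("7157", "1956", "672", "7157", NA),
  log2FoldChange = c(2.1, -1.4, 0.3, 1.0, 0.8),
  stringsAsFactors = FALSE
)

# Drop rows that lack an Entrez ID or a fold change.
res_df <- res_df[!is.na(res_df$entrez) & !is.na(res_df$log2FoldChange), ]

# Keep one entry per Entrez ID (here simply the first occurrence),
# so that the names of the ranked vector are unique.
res_df <- res_df[!duplicated(res_df$entrez), ]

# Build the named numeric vector and sort it in decreasing order.
foldchanges <- res_df$log2FoldChange
names(foldchanges) <- res_df$entrez
foldchanges <- sort(foldchanges, decreasing = TRUE)
```

Keeping the first occurrence of a duplicated ID is just one simple choice for the sketch; other ways of collapsing duplicates are possible.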
The results are very similar but not the same. Some GO sets appear in the results of both gseaGO1 and gseaGO2, and as far as I can see they have the same enrichment score but different NES, p-value and adjusted p-value (the differences are VERY small, however).
So my questions are: are the gseGO and GSEA functions from the clusterProfiler package the same (in a mathematical sense)? Additionally, I specified c5.all.v7.0.entrez.gmt as the gene set database for the GSEA function, but which gene set database does gseGO use?
Even though I am new to this analysis, I have read a lot about it, but this still isn't clear to me. Thank you very much for your time and help.