Apologies for the stupid question! I think I am doing something wrong, but I do not understand what. I would like to run an ORA analysis on a bulk RNA-seq dataset, so I tried both clusterProfiler and genekitr. However, although I get the same terms, the adjusted p-values and q-values differ (with clusterProfiler practically none of the terms has an adjusted p-value or q-value <= 0.01, whereas with genekitr a few do). Why is that? Is there something wrong with my code?
    # we want the log2 fold change
    original_gene_list <- d$log2FC # on the unfiltered dataset
    # name the vector
    names(original_gene_list) <- d$ENSEMBL
    # omit any NA values
    gene_list <- na.omit(original_gene_list)
    # sort the list in decreasing order (required for clusterProfiler)
    gene_list = sort(gene_list, decreasing = TRUE)
    # Extract significant results (padj < 0.05)
    sig_genes_df = subset(d, p_value <= 0.05)
    # From significant results, we want to filter on log2fold change
    genes <- sig_genes_df$log2FC
    # Name the vector
    names(genes) <- sig_genes_df$ENSEMBL
    # omit NA values
    genes <- na.omit(genes)
    # filter on min log2fold change (log2FoldChange > 1.5)
    genes <- names(genes)[abs(genes) > 1.5]
    go_enrich <- enrichGO(gene = genes,
                          universe = names(gene_list),
                          OrgDb = org.Hs.eg.db,
                          keyType = "ENSEMBL",
                          readable = T,
                          ont = "BP",
                          pvalueCutoff = 0.05,
                          qvalueCutoff = 0.01)
For genekitr I used this code (section 1.7):
    # 1st step: get input IDs
    id <- c(dpg6$Associated.Gene.Name) # DEGs
    # 2nd step: get gene set
    gs2 <- geneset::getGO(org = "human", ont = "bp") # biological process
    # analysis
    ego2 <- genORA(id,
                   geneset = gs2,
                   universe = names (d$ENSEMBL), # background, aka unfiltered dataset
                   p_cutoff = 0.05,
                   q_cutoff = 0.01) # bp
What am I doing wrong?
Thank you very much for your help!
Thank you! The universe and the genes are the same; I just used different names because the scripts were written at different times. However, I re-ran both scripts using the same genes/names and the result is the same as before (different p- and q-values). How do I choose between the two methods? I don't want to choose genekitr just because it gives me more statistically significant terms that match my theory if it is not the right approach!
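One way to confirm that both calls really receive the same inputs is to inspect the vectors directly. The sketch below uses base R only and a made-up table `d_toy` (a hypothetical stand-in for `d`, not the real data). It relies on the assumption that `d` is an ordinary data frame, in which case a column has no names and `names(d$ENSEMBL)` evaluates to `NULL` rather than to the gene IDs.

```r
# Toy stand-in for the unfiltered results table (hypothetical data).
d_toy <- data.frame(
  ENSEMBL = c("ENSG00000141510", "ENSG00000012048", "ENSG00000139618"),
  SYMBOL  = c("TP53", "BRCA1", "BRCA2"),
  log2FC  = c(2.1, -1.8, 0.3),
  stringsAsFactors = FALSE
)

# A plain data-frame column carries no names attribute,
# so names() on it returns NULL instead of the IDs.
universe_from_names <- names(d_toy$ENSEMBL)
print(is.null(universe_from_names))

# Taking the column values themselves gives the intended background.
universe_ids <- unique(d_toy$ENSEMBL)

# The foreground should use the same ID type and lie inside the universe.
fg_ensembl <- d_toy$ENSEMBL[abs(d_toy$log2FC) > 1.5]
print(all(fg_ensembl %in% universe_ids))

# Gene symbols will not match a universe made of Ensembl IDs.
fg_symbol <- d_toy$SYMBOL[abs(d_toy$log2FC) > 1.5]
print(any(fg_symbol %in% universe_ids))
```

If the same checks on the real objects show a `NULL` universe or a foreground that is not contained in the background, the two tools are not being compared on equal inputs.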
Hi, I'm the author of genekitr. Thanks for your feedback. Regarding your question: firstly, both methods, including genORA, are based on the enricher function for the statistical calculations. As @chaco001 said, the main difference lies in the term annotation used as input, which of course is not limited to GO. clusterProfiler mainly adopts the OrgDb approach (for example, the function uses org.Hs.eg.db to obtain the gene sets), while genekitr integrates Panther db (v17.0) and
I love using these tools! They are both easy to use.
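To illustrate the point above about annotation sources, here is a minimal base R sketch of the hypergeometric test commonly used for ORA. All numbers are invented for illustration, and `ora_p` is a hypothetical helper, not a function from either package. It shows that the same foreground overlap gives a different p-value when the term size and background size differ, and that the BH-adjusted value also depends on how many terms are tested.

```r
# One-sided hypergeometric (over-representation) p-value.
# k: foreground genes annotated to the term
# n: foreground size
# M: background genes annotated to the term
# N: background size
ora_p <- function(k, n, M, N) {
  phyper(k - 1, M, N - M, n, lower.tail = FALSE)
}

# Same overlap (8 of 100 foreground genes), two hypothetical annotation sources.
k <- 8
n <- 100
p_source_a <- ora_p(k, n, M = 200, N = 15000)
p_source_b <- ora_p(k, n, M = 120, N = 12000)
print(c(source_a = p_source_a, source_b = p_source_b))

# BH adjustment scales with the number of terms tested, so a source
# with more terms can yield a larger adjusted p-value for the same raw p.
adj_few_terms  <- p.adjust(c(p_source_a, rep(0.5, 99)),  method = "BH")[1]
adj_many_terms <- p.adjust(c(p_source_a, rep(0.5, 999)), method = "BH")[1]
print(c(few = adj_few_terms, many = adj_many_terms))
```

Under these assumptions, neither result is "wrong"; they answer the same question against different annotation sets, so the choice is better made on which annotation source and version suits the study than on which gives more significant terms.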