I found there are many functions to perform pathway analysis, such as fgsea(), gseGO(), gseKEGG(), and enrichGO(), which left me quite confused about which result I should focus on. Should we focus on the pathways that are shared between the results of these functions? Getting a correct background gene set is important. However, how can we find the background gene set for our experiment? I downloaded the background genes from MSigDB (https://www.gsea-msigdb.org/gsea/msigdb/human/collections.jsp#C3). Only fgsea() mentions background genes, but they are not customized for each experiment. I appreciate your help!
Pathway enrichment analysis can be split into two categories:
1. Functional Class Scoring Methods
2. Overrepresentation Analysis
"GSEA" is #2 above whereas some of the others above are #1.
Regarding the background gene set, I'll give an example for you to consider.
If you do differential RNA-Seq analysis, you might only take the genes that are "actually expressed".
This could be all genes where the average normalized expression across all samples is above some threshold "x".
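A minimal sketch of that filter, assuming the normalized expression values are available as a mapping from gene to per-sample values. The gene names, the values, the helper `expressed_genes`, and the threshold of 10 are all hypothetical; pick the threshold "x" to suit your own data.

```python
from statistics import mean

def expressed_genes(norm_counts, threshold):
    """Keep genes whose mean normalized expression across samples exceeds threshold."""
    return sorted(g for g, values in norm_counts.items() if mean(values) > threshold)

# Hypothetical normalized expression, one value per sample.
norm_counts = {
    "GENE_A": [120.0, 98.5, 110.2],
    "GENE_B": [0.0, 1.2, 0.4],
    "GENE_C": [15.3, 22.1, 18.0],
    "GENE_D": [3.0, 0.0, 2.5],
}

background = expressed_genes(norm_counts, threshold=10.0)

# The DE gene list should be a subset of the background, so filter it the same way.
de_genes = [g for g in ["GENE_A", "GENE_B"] if g in background]
print(background, de_genes)
```

In R, a vector built this way is what you would pass to the `universe` argument of enrichGO(). For the ranked-list methods, the ranked list itself plays the role of the background, so the same filter would be applied before ranking.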
Thank you for your reply! Do you have code that I and others can use? I am not sure how to customize the gmt file from my differentially expressed genes. Of the functions I used, I see that only enrichGO() uses a cutoff, so I think only enrichGO() is in #2.
Chris, please look into @Hamid Ghaedi's well-documented post on ORA and GSEA, or this documentation. Hope this helps.
Thanks luffy! Great material, but it seems they didn't mention how to customize the background genes depending on our data.
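One possible way to tailor a downloaded gmt file to your own data is to drop every gene that is not in your background and write the result back out. Note that the gmt file holds the gene sets, not your DE genes, so it is the background (for example, the expressed genes) that you intersect with. The sketch below assumes the usual tab-separated layout (set name, description, then genes); the set names, gene names, and helper functions are hypothetical.

```python
def parse_gmt(text):
    """Parse GMT text: one gene set per line as name<TAB>description<TAB>gene1<TAB>gene2..."""
    gene_sets = {}
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        name, _description, *genes = fields
        gene_sets[name] = [g for g in genes if g]
    return gene_sets

def restrict_to_background(gene_sets, background, min_size=1):
    """Keep only background genes in each set; drop sets left with fewer than min_size genes."""
    background = set(background)
    restricted = {}
    for name, genes in gene_sets.items():
        kept = [g for g in genes if g in background]
        if len(kept) >= min_size:
            restricted[name] = kept
    return restricted

def to_gmt(gene_sets, description="restricted"):
    """Serialize gene sets back to GMT text."""
    return "\n".join("\t".join([name, description, *genes]) for name, genes in gene_sets.items())

# Hypothetical GMT content and background.
gmt_text = "SET_ONE\tdesc\tGENE_A\tGENE_B\tGENE_C\nSET_TWO\tdesc\tGENE_B\tGENE_D\n"
background = ["GENE_A", "GENE_C"]

restricted = restrict_to_background(parse_gmt(gmt_text), background)
gmt_out = to_gmt(restricted)
print(gmt_out)
```

Raising `min_size` drops gene sets that become too small to test after filtering.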