Performing GSEA using MSigDB gene sets in R
2.8 years ago
wilsonav ▴ 30

I am trying to perform a gene set enrichment analysis in R using the gene sets available from MSigDB and a list of gene names from my own data set.

I am able to use the msigdbr library to import the gene collections from MSigDB into R, but I am unsure which function to use to compute the overlaps between the genes in my list and the MSigDB gene sets and to obtain FDR-adjusted p-values. Are there any online tutorials or example code for this?

Thank you

R GSEA msigdb • 6.7k views
2.8 years ago
igor 12k

You can try the fgsea package, which is probably most similar to the original GSEA. It can be run in a single command:

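library(fgsea); data(examplePathways); data(exampleRanks)  # load the package and its bundled example data
fgseaRes <- fgsea(pathways = examplePathways, stats = exampleRanks)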

Check the vignette for more details: https://www.bioconductor.org/packages/devel/bioc/vignettes/fgsea/inst/doc/fgsea-tutorial.html

There is also an example in the msigdbr vignette: https://cran.r-project.org/web/packages/msigdbr/vignettes/msigdbr-intro.html
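If what you want is literally the overlap between a gene list and each gene set, with FDR-adjusted p-values, that is an over-representation test rather than GSEA. Below is a minimal base-R sketch using a hypergeometric test and Benjamini-Hochberg correction. The gene sets, gene list, and background are toy placeholders, and `overlap_test` is a hypothetical helper, not part of any package. With msigdbr, a list of this shape can be built with `split()` on the gene symbol and gene set name columns, as in the msigdbr vignette.

```r
# Toy gene sets standing in for an MSigDB collection (hypothetical names).
# With msigdbr: gene_sets <- split(msigdbr_df$gene_symbol, msigdbr_df$gs_name)
gene_sets <- list(
  SET_A = c("TP53", "MDM2", "CDKN1A", "BAX", "ATM"),
  SET_B = c("MYC", "MAX", "CCND1", "CDK4"),
  SET_C = c("GAPDH", "ACTB", "TUBB", "RPLP0", "TP53")
)

# Genes of interest from your own data (hypothetical)
my_genes <- c("TP53", "MDM2", "CDKN1A", "BAX", "MYC")

# Background of all genes that could have been selected. Here it is padded
# with made-up genes; in practice use the genes measured in your experiment.
universe <- unique(c(unlist(gene_sets), my_genes, paste0("GENE", 1:100)))

# Hypothetical helper: one-sided hypergeometric test for a single gene set
overlap_test <- function(set, genes, universe) {
  set   <- intersect(set, universe)
  genes <- intersect(genes, universe)
  k     <- length(intersect(set, genes))  # size of the overlap
  # P(overlap >= k) when drawing length(genes) genes from the universe
  p <- phyper(k - 1, length(set), length(universe) - length(set),
              length(genes), lower.tail = FALSE)
  data.frame(set_size = length(set), overlap = k, p_value = p)
}

res <- do.call(rbind, lapply(gene_sets, overlap_test,
                             genes = my_genes, universe = universe))
res$fdr <- p.adjust(res$p_value, method = "BH")  # Benjamini-Hochberg FDR
res <- res[order(res$p_value), ]
print(res)
```

The choice of background matters a lot here: using all genes in the genome instead of the genes actually detected in your experiment will generally make the p-values look smaller than they should.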

It's worth noting that fgsea is similar to GSEA-Preranked rather than to the original GSEA method published in the GSEA articles that used sample permutation.
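To make the "pre-ranked" input concrete: instead of an expression matrix plus phenotype labels (which sample permutation needs), the method takes one statistic per gene. A minimal base-R sketch of building such a vector follows. The data frame and its column names (`gene`, `stat`) are made up for illustration, and keeping the entry with the largest absolute statistic for duplicated genes is just one possible convention.

```r
# Hypothetical differential-expression results; column names are assumptions
de <- data.frame(
  gene = c("TP53", "MYC", "GAPDH", "MYC", "ACTB"),
  stat = c(4.2, -3.1, 0.3, -2.5, NA)
)

# Drop genes without a statistic
de <- de[!is.na(de$stat), ]

# Keep one entry per gene: the one with the largest absolute statistic
de <- de[order(abs(de$stat), decreasing = TRUE), ]
de <- de[!duplicated(de$gene), ]

# Named numeric vector, sorted from most up- to most down-regulated
ranks <- setNames(de$stat, de$gene)
ranks <- sort(ranks, decreasing = TRUE)
print(ranks)
```

A named vector like `ranks` is the kind of object passed as the `stats` argument of `fgsea()`.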

Yes. Until very recently, the recommendation from GSEA developers was to use the pre-ranked GSEA for RNA-seq data, so that has been the default one in my mind since most transcriptomic data is RNA-seq.

OK. It's an important issue because GSEA-preranked doesn't give proper FDR control (and hasn't been claimed to do so by the GSEA developers as far as I know).
