22 months ago by
wilsonav30 wrote:

I am trying to perform a gene set enrichment analysis in R using the gene sets available from MSigDB and a list of gene names from my own data set.

I am able to use the msigdbr package to import the gene collections from MSigDB into R, but I am unsure which function to use to compute the overlaps between the genes in my list and the MSigDB gene sets and to obtain FDR-adjusted p-values. Are there any tutorials or example code online for this method?

Thank you

modified 22 months ago by igor11k • written 22 months ago by wilsonav30
22 months ago by
igor11k wrote:

You can try the fgsea package, which is probably the most similar to the original GSEA. Once the package is loaded, it can be run in a single command:

library(fgsea)  # examplePathways and exampleRanks are example data from the package
fgseaRes <- fgsea(pathways = examplePathways, stats = exampleRanks)
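
For your own data, the two inputs have to be shaped first. A minimal sketch, assuming a long table with gs_name and gene_symbol columns like the one msigdbr returns; the gene sets and scores below are made up for illustration, and the fgsea call is left commented out so the snippet runs without the package:

```r
# Toy stand-in for an msigdbr-style long table (one row per gene-set/gene pair).
# The set names and memberships are hypothetical.
gene_sets_df <- data.frame(
  gs_name     = c("SET_A", "SET_A", "SET_A", "SET_B", "SET_B"),
  gene_symbol = c("TP53", "MYC", "EGFR", "MYC", "KRAS"),
  stringsAsFactors = FALSE
)

# `pathways` is a named list: one character vector of genes per gene set.
pathways <- split(gene_sets_df$gene_symbol, gene_sets_df$gs_name)

# `stats` is a named numeric vector: one score per gene (for example a
# t-statistic or log fold change), named with the same identifiers as the sets.
# Hypothetical scores:
ranks <- c(TP53 = 2.1, MYC = -0.4, EGFR = 1.3, KRAS = -1.8, BRCA1 = 0.2)
ranks <- sort(ranks, decreasing = TRUE)

# With fgsea installed, the call would then be:
# fgseaRes <- fgsea::fgsea(pathways = pathways, stats = ranks)
```

Note that this needs a score for every gene tested, not just a list of significant gene names.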

Check the fgsea vignette for more details.

There is also an example in the msigdbr vignette.
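
Not taken from either vignette: if you only have a list of gene names and literally want the overlap with each gene set, that is an over-representation test rather than GSEA. A minimal base-R sketch, assuming a hypergeometric test with Benjamini-Hochberg adjustment; the universe, gene list, gene sets, and the overlap_test helper are all hypothetical, and the result depends heavily on choosing a sensible background universe:

```r
# Hypothetical inputs: background universe, the user's gene list, and gene sets.
universe  <- paste0("G", 1:100)
my_genes  <- paste0("G", 1:10)
gene_sets <- list(
  SET_A = paste0("G", 1:8),          # all 8 genes are in my_genes
  SET_B = paste0("G", 51:70),        # no shared genes
  SET_C = paste0("G", c(5, 60:68))   # one shared gene
)

# Illustrative helper: one-sided hypergeometric test for a single gene set.
overlap_test <- function(set, genes, universe) {
  set   <- intersect(set, universe)    # restrict both lists to the universe
  genes <- intersect(genes, universe)
  k <- length(intersect(set, genes))   # observed overlap
  # P(overlap >= k) when length(genes) genes are drawn from the universe
  p <- phyper(k - 1, length(set), length(universe) - length(set),
              length(genes), lower.tail = FALSE)
  c(overlap = k, set_size = length(set), p_value = p)
}

res <- as.data.frame(t(sapply(gene_sets, overlap_test,
                              genes = my_genes, universe = universe)))

# Benjamini-Hochberg adjusted p-values across all tested gene sets.
res$fdr <- p.adjust(res$p_value, method = "BH")
res[order(res$p_value), ]
```

This treats genes as independent draws, which real expression data generally are not, so the adjusted p-values are best read as a rough guide.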

modified 5 months ago • written 22 months ago by igor11k

It's worth noting that fgsea is similar to GSEA-Preranked rather than to the original GSEA method published in the GSEA articles that used sample permutation.

modified 6 weeks ago • written 6 weeks ago by Gordon Smyth1.8k

Yes. Until very recently, the recommendation from GSEA developers was to use the pre-ranked GSEA for RNA-seq data, so that has been the default one in my mind since most transcriptomic data is RNA-seq.

written 6 weeks ago by igor11k

OK. It's an important issue because GSEA-preranked doesn't give proper FDR control (and hasn't been claimed to do so by the GSEA developers as far as I know).

written 6 weeks ago by Gordon Smyth1.8k
