I am trying to use the GSEA GUI from the Broad Institute to do gene set analysis on RNA-seq data. I have read many posts and gone through the GSEA website about the DESeq2 -> GSEA workflow, and here is what I understood from it.
If I have used the DESeq2 package to get a list of DE genes and would like to run the GSEAPreranked tool, I should:
- get a table with genes in the rows and log2FoldChange, p-value, and adjusted p-value in the columns
- order the gene list by the metric -log10(p-value)*sign(log2FoldChange) and create a rank file (.rnk) in R
- load this file into the GSEA software and run GSEAPreranked after filling in the required and basic fields (making sure the enrichment statistic is "classic")
Am I on the right track in understanding this workflow?
Since you already work in R, I recommend using an R implementation of pre-ranked GSEA: either the fgsea package or the clusterProfiler/DOSE interface. It's the same method but much faster.
To answer your question: I normally use the stat column of the DESeq2 results, but your metric should also work fine.
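For reference, the signed metric from the question, -log10(p-value)*sign(log2FoldChange), can be sketched in a few lines. This is only an illustration written in plain Python rather than R; the function names (`signed_log_p`, `build_ranking`, `to_rnk`) and the example genes are made up, and the handling of p-values equal to 0 (clamped to the smallest positive float) and of missing values (skipped) are my assumptions, not something GSEA or DESeq2 prescribes.

```python
import math
import sys


def signed_log_p(log2_fc, p_value):
    """Return -log10(p) * sign(log2FC), the ranking metric from the question."""
    # Assumption: clamp p = 0 to the smallest positive float so the score stays finite.
    p = max(p_value, sys.float_info.min)
    sign = (log2_fc > 0) - (log2_fc < 0)  # +1, 0, or -1
    return -math.log10(p) * sign


def build_ranking(rows):
    """rows: iterable of (gene, log2_fc, p_value); returns (gene, score) sorted descending."""
    ranked = []
    for gene, log2_fc, p_value in rows:
        # Assumption: genes with a missing fold change or p-value are dropped.
        if log2_fc is None or p_value is None:
            continue
        ranked.append((gene, signed_log_p(log2_fc, p_value)))
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


def to_rnk(ranked):
    """Format the ranking as tab-separated gene/score lines, as in a two-column .rnk file."""
    return "".join(f"{gene}\t{score:.6g}\n" for gene, score in ranked)


# Hypothetical values standing in for a DESeq2 results table.
example_rows = [
    ("GENE_A", 2.0, 1e-10),
    ("GENE_B", -1.5, 1e-4),
    ("GENE_C", 0.3, 0.5),
    ("GENE_D", 1.0, None),  # missing p-value, skipped
]

ranking = build_ranking(example_rows)
print(to_rnk(ranking), end="")
```

Strongly up-regulated genes end up at the top of the list and strongly down-regulated genes at the bottom. If you rank by the DESeq2 stat column instead, the same sorting and writing steps apply with that column used directly as the score.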