clusterProfiler GSEA for upregulated and downregulated gene sets: conflicting results
2.5 years ago
hibernicah ▴ 30

I'm using the GSEA function from clusterProfiler to perform enrichment analysis on a pre-ranked gene list. The function only accepts genes sorted in decreasing order by a metric of choice, so I've ordered the genes by decreasing DESeq2 test statistic, which reflects both the direction and the significance of the change.

When I reverse the order of the genes (by changing the reference condition during DESeq2 testing, or just by multiplying the DESeq2 statistics by -1), I get substantially different GSEA results. Is this behaviour expected? If so, can I perform separate tests for sets enriched in each condition? (I think the desktop version of GSEA does that by default.)
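To make the question concrete, here is a minimal sketch of a GSEA-style weighted running-sum enrichment score, written in plain Python. It is a simplified illustration, not clusterProfiler's implementation. The gene names, scores, gene set, and the `enrichment_score` helper are all made up for the example. With untied scores, multiplying every score by -1 mirrors the running sum, so the enrichment score only changes sign.

```python
def enrichment_score(ranked, gene_set, p=1.0):
    """Signed maximum deviation of a GSEA-style running sum.

    ranked: dict mapping gene -> ranking metric (hypothetical toy values).
    gene_set: set of gene names to test.
    """
    # Sort genes by decreasing metric, as a pre-ranked analysis expects.
    order = sorted(ranked, key=ranked.get, reverse=True)
    hits = [g in gene_set for g in order]
    # Hits step up in proportion to |metric|^p; misses step down uniformly.
    hit_total = sum(abs(ranked[g]) ** p for g, h in zip(order, hits) if h)
    miss_step = 1.0 / (len(order) - sum(hits))
    running, best = 0.0, 0.0
    for g, h in zip(order, hits):
        running += abs(ranked[g]) ** p / hit_total if h else -miss_step
        if abs(running) > abs(best):
            best = running
    return best


# Toy ranking metric (no tied values) and a toy gene set.
stats = {"g1": 5.2, "g2": 3.1, "g3": 1.4, "g4": 0.3,
         "g5": -0.8, "g6": -2.5, "g7": -4.0}
gene_set = {"g1", "g2", "g4"}

es = enrichment_score(stats, gene_set)
es_flipped = enrichment_score({g: -s for g, s in stats.items()}, gene_set)
print(es, es_flipped)
```

In this sketch the two scores have the same magnitude and opposite signs. If real results differ by more than a sign flip, the difference would have to come from something outside this sketch, for example tied scores or permutation-based p-values; that is an assumption here, not something the example checks.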

GSEA clusterProfiler DESeq2 • 1.5k views
13 months ago
Hannes ▴ 30

What exactly do you mean by DESeq2 test statistic? Are you referring to the p-values or the adjusted p-values (padj)?

As far as I know, clusterProfiler asks you to provide a list of genes ranked by their log2 fold change values (these values will of course differ depending on your comparison groups). This is a different rank order from one sorted by p-value, as large log2 fold changes often come from genes with low mean expression and high variance. As a consequence, the log2 fold change might be large while the p-value is not significant.
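A small sketch of why the two rankings can disagree, in plain Python with made-up numbers. It assumes the ranking statistic is a Wald-type value, i.e. the log2 fold change divided by its standard error. The gene names and values are hypothetical.

```python
# Hypothetical genes: (log2 fold change, standard error of that estimate).
genes = {
    "geneA": (4.0, 3.0),    # large fold change, but very noisy
    "geneB": (1.5, 0.2),    # modest fold change, precisely estimated
    "geneC": (-2.0, 0.4),
    "geneD": (-3.5, 2.5),   # large negative fold change, noisy
}

# Rank by raw log2 fold change, decreasing.
by_lfc = sorted(genes, key=lambda g: genes[g][0], reverse=True)

# Rank by a Wald-type statistic (fold change / standard error), decreasing.
by_stat = sorted(genes, key=lambda g: genes[g][0] / genes[g][1], reverse=True)

print(by_lfc)
print(by_stat)
```

Here the noisy genes sit at the extremes of the fold-change ranking but move toward the middle once the standard error is taken into account, so the choice of metric changes which genes drive the enrichment score.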

I personally like the idea of using apeglm-shrunk log2 fold change values, since large log2 fold changes with a large variance get penalized and shrunk. See Zhu et al. 2019.

Please provide a bit more information on what exactly you are using as input for ranking your genes.


