
I have several tables of results from different DESeq2 runs. The next step would be to do GO enrichment or GSEA enrichment analysis.

For that I would like to create a ranked list of genes for GSEAPreRanked, but I'm not sure which value to take for the ranking. Do I use the log2FC values, the p-values, or even the adjusted p-values?

I have searched different forums, and the opinions varied.

When I use this command `sign(resultsObject$log2FoldChange) * -log10(resultsObject$padj)` I get `Inf` if `padj=0`.

For the GO enrichment I can use the `goseq` package; for the GSEA I wanted to use `fgsea`, which needs a ranked gene list.

Is it better to rank the list by significance (adjusted p-values) or by the magnitude of the expression change (fold change)?

I would appreciate your opinions and/or recommendations.

Thanks, Assa

written 7 weeks ago by Assa Yeroslaviz


I know it's very common, but I am personally a little worried about using p-values as the ranking. You can have very strong changes with high p-values and very subtle changes with low p-values.

There is a nice example here where they use the test statistic as the ranking, which is a nice strategy: https://stephenturner.github.io/deseq-to-fgsea/
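To make the test-statistic idea concrete, here is a minimal sketch in base R. It assumes the DESeq2 results have been converted to a data frame with a `stat` column (the Wald statistic); the toy data frame `res`, its gene names, and its values are made up for illustration and are not from any real run.

```r
# Toy stand-in for as.data.frame(results(dds)); all values are hypothetical.
res <- data.frame(
  gene = c("geneA", "geneB", "geneC", "geneD"),
  stat = c(4.2, -1.3, NA, 0.7),
  stringsAsFactors = FALSE
)

# Drop genes without a test statistic (some genes can have NA here).
res <- res[!is.na(res$stat), ]

# Named numeric vector, sorted from most up- to most down-regulated.
ranks <- sort(setNames(res$stat, res$gene), decreasing = TRUE)
print(ranks)
```

A vector like `ranks` could then be passed as the `stats` argument of `fgsea::fgsea()`, together with a named list of gene sets as `pathways`.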

Thanks for the link. It is a very good example.

I'd recommend against using adjusted p-values; use the unadjusted p-values instead. The default FDR adjustment squashes many genes onto the same adjusted p-value even though their input p-values differ. The distribution of logFC also differs between genes with different average expression levels, which is why I tend to rank on the signed p-values rather than the FCs.
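A signed p-value ranking along these lines could look like the sketch below. It also sidesteps the `Inf` problem from the question by flooring p-values of exactly 0 at the smallest positive double; that floor is my assumption, not something DESeq2 or fgsea prescribes, and the toy data frame `res` is made up for illustration.

```r
# Toy values; in practice these would come from the DESeq2 results table.
res <- data.frame(
  gene           = c("geneA", "geneB", "geneC", "geneD"),
  log2FoldChange = c(3.1, -2.4, 0.2, -0.1),
  pvalue         = c(0, 1e-8, 0.03, 0.6),
  stringsAsFactors = FALSE
)

# Assumption: replace p-values of exactly 0 with the smallest positive
# double so that -log10() stays finite. Other floors are possible.
p_floor <- pmax(res$pvalue, .Machine$double.xmin)

# Signed significance score: direction from the fold change,
# magnitude from the unadjusted p-value.
score <- sign(res$log2FoldChange) * -log10(p_floor)
ranks <- sort(setNames(score, res$gene), decreasing = TRUE)
print(ranks)
```

Note that every gene with a p-value of 0 ends up with the same magnitude under this floor, so those genes are tied with each other within each direction.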

Good point about the same adjusted p-values. On a related note, there will also be a lot of adjusted p-values that are 1. Other than that, the adjusted and unadjusted p-values will correlate, so the rank order will be the same apart from those ties.
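Both points (ties after adjustment, but otherwise the same order) can be checked with base R's `p.adjust()`. This is a small sketch using Benjamini-Hochberg, the default adjustment in DESeq2; the five p-values are made up.

```r
# Five made-up unadjusted p-values, all distinct.
p <- c(0.01, 0.04, 0.041, 0.5, 0.9)

# Benjamini-Hochberg adjustment.
padj <- p.adjust(p, method = "BH")

# Which adjusted values are shared by more than one gene?
tied <- duplicated(padj) | duplicated(padj, fromLast = TRUE)

# Is the order of the raw p-values preserved (allowing ties)?
monotone <- all(diff(padj[order(p)]) >= 0)

print(data.frame(p = p, padj = padj, tied = tied))
print(monotone)
```

Here the second and third p-values differ but receive the same adjusted value, while the overall ordering is unchanged.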
