I was wondering what people's thoughts are on performing a preranked GSEA-type analysis on unsigned / one-sided gene statistics?
- Fold-change, or,
- Sign(fold-change) * -log10(P-value)
In some situations, however, it may be desirable to perform such an analysis on _unsigned_ gene rankings, to check for over-representation of some annotation on a single side of the gene rankings.
For example, given some scoring function that assigns a positive score to each gene based on some assessment of the gene's association with a phenotype of interest, it might be useful to then perform GSEA on the ranked distribution of gene scores.
In cases where I'm working with _signed_ gene metrics, fgsea has been quite useful due to its performance optimizations and its flexibility with respect to the nature of the input gene scores. However, it too expects data
I found one package (AbsGeneFilter, Yoon et al., 2016) that attempts to address this directly. However, the method is again focused on (RNA-Seq) gene expression analysis, and it lacks some of the helpful performance improvements provided by fgsea.
In theory, any of the aforementioned K-S test or permutation-based methods should also be able to detect enrichment of gene sets on an unsigned ranking such as the one described above. However, I was wondering whether anyone else has considered this kind of thing in terms of its implications for the sensitivity / specificity of the results, or whether there are other approaches that might be worth looking into?