GSEA p value dependent on pre-ranked list size?
8 weeks ago
garfield320 ▴ 10

I read that very small or very large gene sets can affect the enrichment score and p-values in a GSEA analysis (e.g. with a 10-gene set vs. a 100-gene set, the former would be more likely to return a significant p-value), and that this is why GSEA normalizes the enrichment score for gene set size.

I'm curious whether the size of the pre-ranked gene list can also affect the p-values of the enrichment scores. For instance, if I detected 1000 genes in my experiment and 100 of them were differentially expressed, vs. detecting 100 genes and 10 of them being differentially expressed, would these different inputs affect the significance of pathways at all?
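One way to probe this empirically might be a small simulation. The sketch below is a simplified stand-in, not the GSEA software's own implementation: it computes a weighted running-sum enrichment score over a ranked list and estimates a p-value by shuffling gene set membership. The function names, the Gaussian scores, the 10% set fraction, and the `shift` effect size are all hypothetical choices made for illustration.

```python
import random


def enrichment_score(scores, in_set, p=1.0):
    """Running-sum enrichment score for a list sorted by descending score.

    Assumes at least one gene in the set, at least one outside it,
    and at least one set member with a non-zero score.
    """
    n_miss = len(scores) - sum(in_set)
    hit_total = sum(abs(s) ** p for s, h in zip(scores, in_set) if h)
    running, best = 0.0, 0.0
    for s, h in zip(scores, in_set):
        if h:
            running += abs(s) ** p / hit_total  # step up at a set member
        else:
            running -= 1.0 / n_miss             # step down otherwise
        if abs(running) > abs(best):
            best = running                      # keep the largest deviation from zero
    return best


def permutation_pvalue(scores, in_set, n_perm=500, seed=0):
    """Empirical p-value from shuffling set membership (gene-label permutation)."""
    rng = random.Random(seed)
    observed = enrichment_score(scores, in_set)
    flags = list(in_set)
    null = []
    for _ in range(n_perm):
        rng.shuffle(flags)
        null.append(enrichment_score(scores, flags))
    # Compare only against null scores with the same sign as the observed score.
    same_sign = [x for x in null if (x >= 0) == (observed >= 0)]
    extreme = sum(1 for x in same_sign if abs(x) >= abs(observed))
    return observed, (extreme + 1) / (len(same_sign) + 1)


def simulate(n_genes, set_fraction=0.1, shift=1.0, seed=1):
    """Hypothetical experiment: a set covering set_fraction of the list,
    whose members have scores shifted upward by `shift`."""
    rng = random.Random(seed)
    n_set = int(n_genes * set_fraction)
    genes = [(rng.gauss(shift if i < n_set else 0.0, 1.0), i < n_set)
             for i in range(n_genes)]
    genes.sort(key=lambda g: g[0], reverse=True)
    return permutation_pvalue([g[0] for g in genes], [g[1] for g in genes])


if __name__ == "__main__":
    for n in (100, 1000):
        es, pval = simulate(n)
        print(f"n_genes={n}: ES={es:.3f}, p={pval:.4f}")
```

Running `simulate` for 100 and 1000 genes with the same set fraction and effect size, and repeating over several seeds, would give a rough sense of how much the p-value moves with list size alone. Note that the smallest p-value this sketch can report is bounded by the number of permutations, so `n_perm` would need to be raised to resolve very small p-values.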

