8.8 years ago
agonic
I have three subsets of a set of genes. Each subset is a ranked list. I want to calculate what the statistically most significant overlap between the three subsets looks like, i.e. how many of the top-ranked genes I need to take from each list.
So far I've thought about using the multivariate hypergeometric distribution (similar to how the hypergeometric distribution is used for the same problem with two lists), but I haven't yet figured out how to use it to get a significance score/p-value.
Any ideas how to approach this problem or papers that I should read?
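To make the question concrete, here is a minimal sketch of one way the p-value could be computed. It assumes a null model that is not stated above: the three subsets are independent, uniformly random draws of fixed sizes from a background of `N` genes. Under that assumption, the size of the A∩B overlap is hypergeometric, and given that overlap the three-way overlap with C is hypergeometric again, so the tail probability can be summed directly. The function names (`hypergeom_pmf`, `triple_overlap_pvalue`, `best_common_cutoff`) and the choice of a single common cutoff for all three lists are hypothetical and only for illustration.

```python
from math import comb


def hypergeom_pmf(k, N, K, n):
    """P(X = k) when drawing n items without replacement from N items,
    K of which are marked."""
    if k < max(0, n + K - N) or k > min(K, n):
        return 0.0
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


def triple_overlap_pvalue(k, N, a, b, c):
    """P(|A & B & C| >= k) for three independent random subsets of
    sizes a, b, c drawn from a background of N genes (assumed null model)."""
    p = 0.0
    # Condition on the size j of the pairwise overlap between A and B.
    for j in range(k, min(a, b) + 1):
        p_ab = hypergeom_pmf(j, N, a, b)
        if p_ab == 0.0:
            continue
        # Given |A & B| = j, the overlap with C is hypergeometric again.
        tail = sum(hypergeom_pmf(i, N, j, c) for i in range(k, min(j, c) + 1))
        p += p_ab * tail
    return p


def best_common_cutoff(list1, list2, list3, N, cutoffs):
    """Scan a common cutoff n over the three ranked lists and return the
    (n, overlap, p_value) tuple with the smallest p-value."""
    best = None
    for n in cutoffs:
        top1, top2, top3 = set(list1[:n]), set(list2[:n]), set(list3[:n])
        k = len(top1 & top2 & top3)
        p = triple_overlap_pvalue(k, N, len(top1), len(top2), len(top3))
        if best is None or p < best[2]:
            best = (n, k, p)
    return best
```

Two caveats with this sketch: the minimum p-value over many cutoffs is not itself a valid p-value (it would need something like a permutation-based correction), and using exact binomial coefficients is slow for genome-sized backgrounds, where a log-space or library implementation would be the more practical choice.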