How to calculate probability of overlap between two sets
9 months ago
Apex92 ▴ 200

I have a question about performing a hypergeometric test.

I have two sets of random names and a number that shows the overlap:

the number of names in set A: 488
the number of names in set B: 312
the number of names shared by the two sets (the overlap): 16


I want to test whether the overlap between the two sets is significant or could have occurred by chance.

How can I do this with ONLY these three values as input?

Thank you.

overlap r statistics • 1.1k views
9 months ago
Ventrilocus ▴ 160

Firstly, please avoid cross-posting the same question on several platforms (https://stackoverflow.com/questions/69142468/hypergeometric-distribution-by-phyper/69143616#69143616).

Secondly, there is no need to implement such an enrichment test yourself; check out RVenn::enrichment_test (https://www.rdocumentation.org/packages/RVenn/versions/1.1.0/topics/enrichment_test). It implements a bootstrap-based hypergeometric test. However, be careful when defining the universe (the complete set from which setA and setB were drawn). The three values alone are not sufficient: the test also needs the size of that universe, and the resulting p-value depends strongly on it.
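
To illustrate why the universe matters, here is a minimal sketch of the exact upper-tail hypergeometric probability P(X >= overlap), written in plain Python with only the standard library. It is not what RVenn does internally, just the textbook formula. The function name overlap_pvalue and the universe sizes 20000 and 2000 are assumptions made up for the example; the question does not give the real universe size. In R, the same quantity should correspond to phyper(overlap - 1, size_a, universe - size_a, size_b, lower.tail = FALSE).

```python
from math import comb


def overlap_pvalue(size_a: int, size_b: int, overlap: int, universe: int) -> float:
    """Exact P(X >= overlap) when set B (size_b names) is drawn at random,
    without replacement, from a universe that contains the size_a names of set A."""
    max_overlap = min(size_a, size_b)
    if not 0 <= overlap <= max_overlap:
        raise ValueError("overlap must lie between 0 and min(size_a, size_b)")
    if universe < size_a + size_b - overlap:
        raise ValueError("universe is smaller than the union of the two sets")

    # Number of ways to draw size_b names from the universe.
    total = comb(universe, size_b)
    # Sum the hypergeometric probabilities for k = overlap .. max_overlap.
    # math.comb returns 0 when k > n, so impossible terms drop out on their own.
    tail = sum(
        comb(size_a, k) * comb(universe - size_a, size_b - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Values from the question; both universe sizes are hypothetical.
size_a, size_b, overlap = 488, 312, 16

for universe in (20000, 2000):
    expected = size_a * size_b / universe  # mean overlap under random draws
    p = overlap_pvalue(size_a, size_b, overlap, universe)
    print(f"universe={universe}: expected overlap={expected:.2f}, p={p:.3g}")
```

With the larger assumed universe the expected overlap is about 7.6, so 16 shared names is more than chance predicts; with the smaller one the expected overlap is about 76, and 16 is no enrichment at all. The same three input values therefore give opposite conclusions depending on the universe.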