Question: How To Compute The Significance Of The Size Of The Overlap Between Two Gene Sets
ChIP550 wrote:

Hi!

I have a gene list from a KEGG pathway, say the TGFb signalling pathway, that has 80 genes in it (set A). I also have a second list of 45 genes (set B), and the overlap between the two sets is 15 genes.

How can I get a p-value associated with this overlap?

Has anyone done this before?

I think R's phyper can be applied to this.

Thank you

pathway statistics • 12k views
modified 17 months ago by Vasei30 • written 6.6 years ago by ChIP550

Dear folks,

I wonder what method would fit best to compute the significance of the overlap between three groups of genes. To calculate the p-values I used Fisher's exact test. If I understand correctly, Fisher's exact test is only practical for comparing two groups of genes, right?

Thanks!

PoGibas4.8k wrote:
Vikas Bansal2.4k wrote:

A web-based CGI script that computes the statistical significance of the overlap between two groups of genes via the hypergeometric distribution.
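For anyone who wants to do the hypergeometric calculation locally, here is a minimal sketch in plain Python. The thread does not say how many genes are in the background (the "universe"), so the value of 20000 below is purely a placeholder assumption and should be replaced with the number of genes that could actually have ended up in either list. The function name `overlap_pvalue` is my own. It computes the upper tail P(X >= overlap), which should correspond to `phyper(14, 80, N - 80, 45, lower.tail = FALSE)` in R for the numbers in the question.

```python
from math import comb


def overlap_pvalue(universe, size_a, size_b, overlap):
    """Upper-tail hypergeometric p-value: P(X >= overlap).

    X is the number of genes shared with set A when size_b genes are
    drawn without replacement from a background of `universe` genes,
    of which size_a belong to set A.
    """
    total = comb(universe, size_b)
    upper = min(size_a, size_b)
    # Sum the exact integer counts first, then divide once.
    tail = sum(
        comb(size_a, k) * comb(universe - size_a, size_b - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# ASSUMPTION: 20000 background genes (not stated in the thread).
p = overlap_pvalue(universe=20000, size_a=80, size_b=45, overlap=15)
print(p)
```

The result depends heavily on the background size, so it is worth trying a few plausible values before trusting the p-value.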

Vasei30 wrote:

You can use resampling (sometimes called a permutation test) in this case. Assuming your background gene list is L, you can sample A of size 80 and B of size 45 independently from L and check whether the size of their intersection is at least 15. By repeating this many times, you get an estimate of the probability of an overlap as extreme as 15 under the assumption that A and B are independent. If this probability is very small, the assumption of independence is probably not a good one.
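The resampling idea can be sketched roughly as follows, using only the Python standard library. The background size (20000), the number of trials, and the function name `resampled_overlap_pvalue` are all assumptions of mine for illustration. Genes are represented as the integers 0..universe-1; with real data you would sample from your actual background list L instead. The `+ 1` in the numerator and denominator is a common convention to avoid reporting a p-value of exactly zero, not something stated in the answer above.

```python
import random


def resampled_overlap_pvalue(universe, size_a, size_b, observed,
                             trials=10_000, seed=0):
    """Estimate P(overlap >= observed) when A and B are drawn
    independently and uniformly from the same background."""
    rng = random.Random(seed)
    genes = range(universe)  # stand-in for the background list L
    hits = 0
    for _ in range(trials):
        a = set(rng.sample(genes, size_a))
        b = rng.sample(genes, size_b)
        overlap = sum(1 for g in b if g in a)
        if overlap >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (trials + 1)


# ASSUMPTION: 20000 background genes (not stated in the thread).
p_est = resampled_overlap_pvalue(20000, 80, 45, observed=15)
print(p_est)
```

Note that the smallest value this can return is 1 / (trials + 1), so for a very unlikely overlap the estimate only gives an upper bound, and an exact hypergeometric calculation is more informative.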