I have a dataset of gene-phenotype associations in this format. I am looking at combinations of phenotypes and the genes shared between them. I would like to use a statistical test to show whether the number of genes shared between two phenotypes is statistically significant, using a p-value or a similar measure.
22 genes are associated with Phenotype1
205 genes are associated with Phenotype2
9 genes are common to both phenotypes
I want to assess whether the number of genes common to the two phenotypes is statistically significant or could simply have arisen by chance.
I have phenotype information for 4035 genes; I assume that the human genome contains 42,071 genes.
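To make the numbers concrete, here is a rough sketch of one way I could imagine setting this up: treat the overlap as a draw from a hypergeometric distribution, or equivalently run a one-sided Fisher's exact test on a 2x2 table. This is only a sketch under assumptions I am not sure about. In particular, I have assumed that the 4035 genes with phenotype information are the right background, rather than all 42,071 genes, and that both gene lists are subsets of that background. The variable names are just placeholders I made up.

```r
# Counts from the question. The choice of background (4035 annotated genes
# rather than all 42,071 genes) is an ASSUMPTION and will change the result.
n_universe <- 4035  # genes with any phenotype information (assumed background)
n_p1       <- 22    # genes associated with Phenotype1
n_p2       <- 205   # genes associated with Phenotype2
n_both     <- 9     # genes associated with both phenotypes

# Overlap expected by chance if the two gene lists were independent
expected_overlap <- n_p1 * n_p2 / n_universe

# Hypergeometric upper tail: P(overlap >= n_both).
# phyper(q, ...) with lower.tail = FALSE gives P(X > q), hence n_both - 1.
p_hyper <- phyper(n_both - 1, n_p1, n_universe - n_p1, n_p2,
                  lower.tail = FALSE)

# The same question as a one-sided Fisher's exact test on a 2x2 table
tab <- matrix(
  c(n_both,        n_p2 - n_both,                       # in P2: in P1, not in P1
    n_p1 - n_both, n_universe - n_p1 - n_p2 + n_both),  # not in P2: in P1, not in P1
  nrow = 2,
  dimnames = list(Phenotype1 = c("yes", "no"),
                  Phenotype2 = c("yes", "no"))
)
p_fisher <- fisher.test(tab, alternative = "greater")$p.value

print(expected_overlap)
print(p_hyper)
print(p_fisher)
```

The two p-values should agree, since the one-sided Fisher test is the hypergeometric tail probability. If I compare many phenotype pairs this way, I assume some multiple-testing correction (for example `p.adjust`) would also be needed.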
How would you address this problem (preferably in R)? What statistical test would you recommend, and why?
PS. Edit on Oct 17 2011: I posted this question at stats.stackexchange.com.