Hi, I have a scenario with 200 samples, and each sample is tested for 10000 genes. I have two events, call them A and B. For each sample there can be x genes that are up (gene.up) or down (gene.down). I want to know whether events A and B are similar, i.e. whether their intersection is significant. To visualize this I draw a Venn diagram of the two events. Normally I would use a Fisher exact test or a hypergeometric test. However, it's tricky here because I have to account for sample, direction and gene, so each call is coded like Sample1.gene.up, and only an exact match on all three parts is considered a match.

My question is: what is the total population then? For example, if the total number of genes was 1000, is the total population 1000 * 2 * n samples? The factor of 2 is there because a gene can be up or down. In the end it would look something like this (I'm using R):
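    total.sample = 200                          # number of samples
    q = length(intersect(n1, n2))               # labels shared by A and B
    m = length(n1)                              # labels in event A
    k = length(n2)                              # labels in event B
    n = 1000 * 2 * total.sample - m             # genes * directions * samples, minus m
    phyper(q - 1, m, n, k, lower.tail = FALSE)  # P(overlap >= q)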
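To make the coding concrete, here is a minimal sketch on made-up toy data. The helper `make_key`, the sample and gene names, and the toy sizes are all hypothetical. It assumes the population is every possible sample.gene.direction label, which is exactly the part I'm unsure about. Note that `phyper(..., lower.tail = FALSE)` returns P(X > x), so the sketch passes `q - 1` to get the probability of an overlap of at least `q`.

```r
# Hypothetical helper: build one "sample.gene.direction" label per call.
make_key <- function(sample, gene, direction) {
  paste(sample, gene, direction, sep = ".")
}

# Toy calls for event A and event B (made-up data).
n1 <- make_key(c("Sample1", "Sample1", "Sample2", "Sample3"),
               c("geneA",   "geneB",   "geneA",   "geneC"),
               c("up",      "down",    "up",      "down"))
n2 <- make_key(c("Sample1", "Sample2", "Sample2", "Sample3"),
               c("geneA",   "geneA",   "geneB",   "geneC"),
               c("up",      "down",    "up",      "down"))

total.sample <- 3                    # toy value
total.gene   <- 4                    # toy value
N <- total.gene * 2 * total.sample   # ASSUMED population: samples x genes x 2 directions

q <- length(intersect(n1, n2))       # labels matching on sample, gene and direction
m <- length(n1)                      # labels called in event A
k <- length(n2)                      # labels called in event B

# P(overlap >= q) under the hypergeometric null
p.overlap <- phyper(q - 1, m, N - m, k, lower.tail = FALSE)
print(p.overlap)
```

Here Sample2.geneA.up and Sample2.geneA.down do not match, because the direction differs, so only two labels end up in the intersection.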
For a Fisher test it would look something like this:
    total.sample = 200
    m = matrix(c(1000 * 2 * total.sample, 400, 500, 700), nrow = 2)
    fisher.test(m, alternative = "greater")
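For comparison, here is a sketch of how the 2x2 table could be laid out so that the four cells add up to the population, again assuming the population is genes * 2 * samples. The counts `q`, `m` and `k` below are made-up illustration values, not my real data. With this layout the one-sided Fisher p-value should agree with the hypergeometric tail probability.

```r
total.sample <- 200
N <- 1000 * 2 * total.sample   # ASSUMED population: genes x 2 directions x samples

# Hypothetical counts for illustration only
q <- 3      # labels in both A and B
m <- 400    # labels in A
k <- 500    # labels in B

# Rows: label in B (yes/no); columns: label in A (yes/no).
# matrix() fills column by column: both, A only, B only, neither.
tab <- matrix(c(q, m - q, k - q, N - m - k + q), nrow = 2,
              dimnames = list(inB = c("yes", "no"), inA = c("yes", "no")))

p.fisher <- fisher.test(tab, alternative = "greater")$p.value
p.hyper  <- phyper(q - 1, m, N - m, k, lower.tail = FALSE)

print(tab)
print(c(fisher = p.fisher, hypergeometric = p.hyper))
```

The "neither" cell is the population minus everything called in A or B, so the choice of total population only enters the Fisher test through that one cell.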
I need advice on whether I'm doing this correctly, especially whether the total population is calculated correctly. Thanks!