Hypergeometric Test Of Microarray Gene Lists
8.7 years ago
Assa Yeroslaviz ★ 1.6k

Hi,

We have several microarray experiments, for each of which we have a list of differentially regulated genes. We analyzed the overlap for each pair of lists and would now like to know how significant these overlaps are.

I did it with the phyper function this way (set 1 vs. set 2):

totalNumarrays = 21542 # total number of array probes
DEgene_set1 = 1453 # differentially regulated genes of set 1
DEgene_set2 = 4987 # differentially regulated genes of set 2
overlap = 481 # overlap between the two sets
# phyper(q, m, n, k): n is the number of probes NOT in set 1, hence the subtraction
Prob = phyper(overlap - 1, DEgene_set1, totalNumarrays - DEgene_set1, DEgene_set2, lower.tail = FALSE, log.p = FALSE)


The same was done for set 1 vs. 3 and set 1 vs. 4, with the same total number of genes.

Now, my problem is that sets 3 and 4 come from different technologies, so they have a different total number of array probes. My question is basically: does it matter?

Do I need to modify the formula to get the correct results?

I would appreciate any help

Assa

microarray overlap statistics • 3.0k views
8.7 years ago
Sudeep ★ 1.7k

I assume you are only interested in the significance of the overlap between the DEGs from your arrays, and that you have some kind of mapping from your array probe IDs to a gene database. In that case, shouldn't totalNumarrays be the universal gene list, i.e. the union of all probes that could be mapped to genes on the arrays you are comparing, rather than the total number of array probes?

IMHO it doesn't matter that the arrays come from different technologies, because in this case you are just calculating the significance of the overlap between two lists A and B that are both subsets of a superset C, aren't you?

One more thing: in the phyper call, why are you using overlap - 1 instead of overlap?
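
For what it's worth, here is a minimal R sketch of the overlap test discussed above. The helper name overlap_pvalue is made up for illustration, and the universe size is simply assumed to be the 21542 probes from the question. R's phyper(q, m, n, k) takes m as the size of list A, n as the number of universe members not in list A, and k as the size of list B. With lower.tail = FALSE it returns P(X > q), which is why overlap - 1 is passed to get P(X >= overlap). The sketch cross-checks the value against a direct sum of dhyper terms and against a one-sided Fisher's exact test on the matching 2x2 table.

```r
# Hypothetical helper (not from the thread): one-sided hypergeometric p-value
# for the overlap of two gene lists drawn from a shared universe.
overlap_pvalue <- function(overlap, n_a, n_b, universe) {
  # phyper(q, m, n, k):
  #   m = size of list A, n = universe members NOT in list A, k = size of list B.
  # lower.tail = FALSE gives P(X > q), so q = overlap - 1 yields P(X >= overlap).
  phyper(overlap - 1, n_a, universe - n_a, n_b, lower.tail = FALSE)
}

# Numbers from the question; the universe size is an assumption.
universe <- 21542
n_a <- 1453
n_b <- 4987
overlap <- 481

p <- overlap_pvalue(overlap, n_a, n_b, universe)

# Cross-check 1: sum the point probabilities from the observed overlap
# up to the largest possible overlap (the size of the smaller list).
p_check <- sum(dhyper(overlap:min(n_a, n_b), n_a, universe - n_a, n_b))

# Cross-check 2: one-sided Fisher's exact test on the 2x2 table
# (columns: in A / not in A; rows: in B / not in B).
tab <- matrix(c(overlap,       n_a - overlap,
                n_b - overlap, universe - n_a - n_b + overlap),
              nrow = 2)
p_fisher <- fisher.test(tab, alternative = "greater")$p.value

print(c(phyper = p, dhyper_sum = p_check, fisher = p_fisher))
```

When the two lists come from different platforms, the same call should still apply once both DE lists are restricted to whichever common gene universe you settle on; universe, n_a, n_b and overlap would then be recounted within that universe.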