I am doing Over-Representation pathway analysis as described here.

With this method, the p-value for each pathway is calculated as follows:

where N is the total number of genes across all pathways, K is the number of genes in a given pathway, M is the number of genes of interest, and x is the number of genes of interest that hit this pathway.
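
In this notation, and assuming the test is the usual one-sided hypergeometric test for over-representation (an assumption, since the formula itself is not reproduced in this post), the p-value would read:

$$
p = P(X \ge x) = \sum_{i=x}^{\min(K, M)} \frac{\binom{K}{i}\binom{N-K}{M-i}}{\binom{N}{M}} = 1 - \sum_{i=0}^{x-1} \frac{\binom{K}{i}\binom{N-K}{M-i}}{\binom{N}{M}}
$$

Here X is the number of genes of interest that land on the pathway when M genes are drawn at random from the N genes. Under this form, the second sum is empty when x = 0, so p = 1.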

Assume we have 56340 genes across all pathways (N=56340), the number of genes of interest is 1 (M=1), and the number of genes in the given pathway is 30 (K=30). Moreover, none of the genes of interest hit the pathway, so x=0.

If we calculate the p-value, we get

p = 1 - dhyper(0, 30, 56340 - 30, 1) = 0.0005324814

(Please note that it is less than 0.01.)

After calculating the p-values for all of the pathways, I obtain many p-values less than 0.01.

Now my question is: how can I choose enriched pathways based on the obtained p-values? As you can see, even when none of the genes of interest hit a given pathway, we still obtain a p-value less than 0.01. Is it reasonable to consider a pathway that shares no genes with the genes of interest as an enriched pathway?

Can anyone help me?

Thanks

For enriched pathways, you'd just look at a one-tailed probability...
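
As a rough illustration of that one-tailed calculation, here is a minimal R sketch using the numbers from the question. It assumes the intended tail is the upper one, P(X >= x), computed with base R's `phyper`. The variable names are only for illustration.

```r
# Values taken from the question
N <- 56340  # genes across all pathways
K <- 30     # genes in this pathway
M <- 1      # genes of interest
x <- 0      # genes of interest that hit this pathway

# What the question computed: 1 - P(X = 0), i.e. P(X >= 1)
p_question <- 1 - dhyper(0, K, N - K, M)

# Upper-tail p-value for the observed x: P(X >= x) = P(X > x - 1)
p_enrich <- phyper(x - 1, K, N - K, M, lower.tail = FALSE)

# The same upper tail if exactly one gene of interest had hit the pathway
p_one_hit <- phyper(1 - 1, K, N - K, M, lower.tail = FALSE)

print(p_question)  # equals 30 / 56340
print(p_enrich)    # 1 when x = 0
print(p_one_hit)   # same value as p_question
```

Under this assumption, a pathway with no hits gets p = 1 and is never called enriched. The small value in the question is the p-value that would apply if one gene of interest had hit the pathway, not zero.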