I have 2 lists of genes:
- Gene list A consists of ~500 differentially expressed genes (DEGs) from a microarray analysis
- Gene list B is a list of 200 susceptibility genes drawn from various GWAS
I have performed pathway over-representation analysis (ORA) on each list using InnateDB and found ~150 over-represented pathways for gene list A and ~115 for gene list B. Now I wish to compare the 2 lists of over-represented pathways for possible overlap.
Using Venny, I have found that ~50 of them overlap but I understand that I also need to perform a statistical test to ensure that this overlap is not just due to chance. However, I am unsure of what statistical test I should perform and how I can go about doing so.
As part of the analysis of these gene lists, I have previously done a hypergeometric distribution test to check for overlap between the genes themselves, but I do not know if this same test can be applied to pathways and if so how should I go about doing it (i.e. I am unsure of how to define the parameters of the hypergeometric distribution test in this situation). I have also read some studies that use Fisher's method to combine the p-values and was wondering if this is the method I should be using.
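To make the question concrete, here is a rough sketch of how I imagine the hypergeometric parameters might map onto pathways, written in plain Python. The function name and the universe size (3000) are placeholders I made up for illustration; I do not know the real number of pathways that could have been tested, and the result depends heavily on it. The sketch also treats pathways as independent draws, which is probably not true since pathways share genes.

```python
from math import comb

def hypergeom_overlap_pvalue(universe, size_a, size_b, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    universe: total number of pathways that could have been tested
    size_a:   pathways over-represented for list A
    size_b:   pathways over-represented for list B
    overlap:  pathways found in both result lists
    """
    upper = min(size_a, size_b)
    total = comb(universe, size_b)
    # Sum the probability of every overlap at least as large as the observed one.
    tail = sum(
        comb(size_a, k) * comb(universe - size_a, size_b - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total

# Approximate counts from my analysis; UNIVERSE is a hypothetical placeholder.
UNIVERSE = 3000
p_value = hypergeom_overlap_pvalue(UNIVERSE, 150, 115, 50)
print(f"P(overlap >= 50) = {p_value:.3g}")
```

Is this the right way to set up the parameters, and what should the universe be: all pathways in the database, or only those containing at least one gene from either list?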
Unfortunately I have no experience with R. Regardless, thank you for all your help!
TL;DR: I compared 2 lists of over-represented pathways and found that they overlap, but I do not know which statistical test to use to show that this overlap is not due to chance.
Edit: I have also read this thread, which relates to my question and states that a suitable test may be a Fisher's F test. However, I don't understand the procedure I need to follow to get my desired p-value.