I have two gene lists, A and B, each with 13 members. List A contains the main genes whose functions I am sure of, whereas list B contains genes that I want to study.
I would like to do pathway analysis on list B to find out how similar it is to list A. I developed two approaches, but I am not sure whether they are correct and meaningful, so I am writing them here to get your opinions and your help.
First Approach:
1) Do pathway enrichment analysis for each gene list separately, as follows:
For list X and pathway Y, we determine:
a = the number of common genes between list X and pathway Y
b = the number of genes in pathway Y
c = the total number of genes across all pathways, minus b
d = the number of genes in list X
p-value = dhyper(a, b, c, d)
2) For each of lists A and B, find the pathways with p-value < 0.05, called pathway_A and pathway_B.
3) Find the common pathways between pathway_A and pathway_B.
Finally, list B is functionally related to list A if there is at least one common pathway in step (3).
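To make the first approach concrete, here is a minimal Python sketch using only the standard library. The `dhyper` function mirrors the argument order of R's `dhyper(x, m, n, k)` and returns the probability of exactly `a` shared genes. Enrichment tests conventionally sum the upper tail, P(X >= a), so the sketch also provides `upper_tail` and uses it by default. The pathway dictionary, the gene names, and the helper names `upper_tail` and `enriched_pathways` are hypothetical, and the sketch assumes every gene in a list belongs to at least one pathway.

```python
from math import comb

def dhyper(a, b, c, d):
    """P(X = a): exactly a pathway genes among d drawn (same argument order as R's dhyper)."""
    if a < 0 or a > b or d - a < 0 or d - a > c:
        return 0.0
    return comb(b, a) * comb(c, d - a) / comb(b + c, d)

def upper_tail(a, b, c, d):
    """P(X >= a): probability of at least a pathway genes among d drawn."""
    return sum(dhyper(x, b, c, d) for x in range(a, min(b, d) + 1))

def enriched_pathways(gene_list, pathways, alpha=0.05, use_tail=True):
    """Steps 1 and 2: names of pathways with p-value < alpha for one gene list."""
    universe = set().union(*pathways.values())  # all genes in all pathways
    genes = set(gene_list)
    d = len(genes)                              # d = number of genes in list X
    hits = set()
    for name, members in pathways.items():
        a = len(genes & members)                # a = genes shared by list X and pathway Y
        b = len(members)                        # b = genes in pathway Y
        c = len(universe) - b                   # c = genes in all pathways, minus b
        p = upper_tail(a, b, c, d) if use_tail else dhyper(a, b, c, d)
        if p < alpha:
            hits.add(name)
    return hits

# Toy data, invented for illustration only.
pathways = {
    "P1": {"g1", "g2", "g3", "g4"},
    "P2": {"g3", "g4", "g5", "g6", "g7"},
    "P3": {"g8", "g9", "g10", "g11", "g12"},
}
list_a = ["g1", "g2", "g3"]
list_b = ["g1", "g2", "g4"]

pathway_a = enriched_pathways(list_a, pathways)
pathway_b = enriched_pathways(list_b, pathways)
common = pathway_a & pathway_b                  # step 3
print(common)
```

Setting `use_tail=False` reproduces the point probability written in step (1). Note that the point probability can also be small when a is unusually low, which is one reason the upper tail is the usual choice for enrichment.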
Second approach:
1) Find the pathways that include at least one gene from list A, called pathway_A.
2) Find the pathways that include at least one gene from list B, called pathway_B.
3) Find the common pathways between pathway_A and pathway_B, called pathway_AB.
4) For each pathway in pathway_AB:
4-1) Repeat the instructions below 100 times:
4-1-1) Select 13 genes randomly from all genes in all pathways, called randGenes.
4-1-2) Find the pathways that include at least one gene from randGenes, called randPathway.
4-2) Count the number of times the pathway of interest, selected in step (4), is included in randPathway, called n_rand.
4-3) p-value = n_rand/100.
We do steps (4-1) to (4-3) for every pathway in pathway_AB, so we get a p-value for each of them. Finally, list B is functionally related to list A if there is a pathway with p-value < 0.05.
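The second approach can be sketched the same way, again with only the Python standard library. The function names, the toy pathways, and the fixed random seed are hypothetical. "Include some genes" is read here as "include at least one gene", and the random sets have the same size as list B (13 in the question, 2 in the toy data).

```python
import random

def pathways_hit(genes, pathways):
    """Names of pathways that contain at least one of the given genes."""
    genes = set(genes)
    return {name for name, members in pathways.items() if members & genes}

def empirical_p_values(list_a, list_b, pathways, n_iter=100, seed=0):
    """Steps 1 to 4: for each shared pathway, the fraction of random gene sets that hit it."""
    rng = random.Random(seed)                   # fixed seed so the run is repeatable
    universe = sorted(set().union(*pathways.values()))
    size = len(set(list_b))                     # random sets match the size of list B
    pathway_ab = pathways_hit(list_a, pathways) & pathways_hit(list_b, pathways)
    p_values = {}
    for name in pathway_ab:                     # step 4
        n_rand = 0
        for _ in range(n_iter):                 # step 4-1
            rand_genes = rng.sample(universe, size)
            if name in pathways_hit(rand_genes, pathways):
                n_rand += 1                     # step 4-2
        p_values[name] = n_rand / n_iter        # step 4-3
    return p_values

# Toy data, invented for illustration only.
all_genes = [f"g{i}" for i in range(30)]
pathways = {
    "P_all": set(all_genes),                    # contains every gene
    "P_small": {"g0", "g1"},
    "P_other": set(all_genes[20:]),
}
list_a = ["g0", "g3"]
list_b = ["g1", "g4"]

p_values = empirical_p_values(list_a, list_b, pathways)
print(p_values)
```

With 100 iterations, the smallest nonzero p-value this procedure can report is 0.01. A pathway that covers most of the gene universe is hit by nearly every random set, so its value stays close to 1 regardless of lists A and B.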
Thank you!
Why do you only have 13 genes to start with for each of A and B? Secondly, you don't seem to account for genes that take part in more than one pathway, or for overlap between pathways.
Just to follow up, here is a good paper that addresses crosstalk, i.e. the overlap I mentioned above: http://m.genome.cshlp.org/content/23/11/1885.short