Question: Pathway analysis for two lists of genes
5.4 years ago by
Na Sed310
United States

I have two gene lists, A and B, each with 13 members. List A contains the main genes whose functions I am sure about, whereas list B contains the genes I want to study.

I would like to do pathway analysis on list B to find out how similar it is to list A. I have come up with two approaches, but I am not sure whether they are correct and meaningful, so I am writing them here to get your opinion and help.

First Approach:

1) Do pathway enrichment analysis for each list of genes separately, as follows:

For list X and pathway Y, we determine:

```
a = the number of genes common to list X and pathway Y
b = the number of genes in pathway Y
c = the number of genes in all pathways combined - b
d = the number of genes in list X

p-value = dhyper(a, b, c, d)
```

2) For each of lists A and B, find the pathways with p-value < 0.05, called pathway_A and pathway_B respectively.

3) Find the common pathways between pathway_A and pathway_B.

Finally, list B is functionally related to list A if step (3) yields at least one common pathway.
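
The first approach can be sketched in Python roughly as follows. This is only a minimal illustration under stated assumptions: the pathway names, gene IDs, and helper functions (`hypergeom_pmf`, `enrichment_pvalue`, `enriched_pathways`) are hypothetical, the gene universe is taken to be the union of all pathway members, and d counts only the list genes found in that universe. `hypergeom_pmf(a, b, c, d)` takes its arguments in the same order as R's `dhyper(a, b, c, d)`, which returns the probability of exactly a overlapping genes. The sketch sums the upper tail, P(overlap >= a), since that is the quantity usually reported as an enrichment p-value.

```python
from math import comb


def hypergeom_pmf(a, b, c, d):
    """P(exactly a of d drawn genes are in the pathway); same argument order as R's dhyper."""
    if a < 0 or a > b or d - a < 0 or d - a > c:
        return 0.0
    return comb(b, a) * comb(c, d - a) / comb(b + c, d)


def enrichment_pvalue(a, b, c, d):
    """Upper-tail probability P(overlap >= a) under the hypergeometric null."""
    return sum(hypergeom_pmf(k, b, c, d) for k in range(a, min(b, d) + 1))


def enriched_pathways(gene_list, pathways, alpha=0.05):
    """Names of pathways whose overlap with gene_list has p-value < alpha."""
    # Assumption: the universe is every gene that appears in any pathway.
    universe = set().union(*pathways.values())
    genes = set(gene_list) & universe
    hits = set()
    for name, members in pathways.items():
        a = len(genes & members)   # genes common to the list and the pathway
        b = len(members)           # genes in the pathway
        c = len(universe) - b      # genes in all pathways combined, minus b
        d = len(genes)             # genes in the list
        if a > 0 and enrichment_pvalue(a, b, c, d) < alpha:
            hits.add(name)
    return hits


# Hypothetical toy data: 100 genes spread over three non-overlapping pathways.
pathways = {
    "P1": {f"g{i}" for i in range(0, 10)},
    "P2": {f"g{i}" for i in range(10, 20)},
    "P3": {f"g{i}" for i in range(20, 100)},
}
list_A = [f"g{i}" for i in range(0, 5)] + [f"g{i}" for i in range(20, 28)]
list_B = [f"g{i}" for i in range(3, 9)] + [f"g{i}" for i in range(30, 37)]

pathway_A = enriched_pathways(list_A, pathways)   # step 2 for list A
pathway_B = enriched_pathways(list_B, pathways)   # step 2 for list B
common = pathway_A & pathway_B                    # step 3
print(sorted(common))
```

No multiple-testing correction is applied here, which matters once many pathways are tested at the 0.05 level.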

Second approach:

1) Find pathways which include some genes from list A, called pathway_A.

2) Find pathways which include some genes from list B, called pathway_B.

3) Find common pathways between pathway_A and pathway_B, called pathway_AB.

4) For each pathway in pathway_AB:

4-1) Repeat the following steps 100 times:

4-1-1) Select 13 genes randomly from all genes in all pathways, called randGenes.

4-1-2) Find pathways which include some genes from randGenes, called randPathway.

4-2) Count the number of times the pathway of interest, selected in step (4), is included in randPathway; call this n_rand.

4-3) p-value = n_rand/100.

We do steps (4-1) to (4-3) for every pathway in pathway_AB, so we get a p-value for each of them. Finally, list B is functionally related to list A if there is a pathway with p-value < 0.05.
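
The second approach can be sketched in Python in the same spirit. Again this is only an illustration: the pathway names, gene IDs, and the functions `pathways_hit` and `permutation_pvalues` are hypothetical, "include some genes" is read as "include at least one gene", and, as a simplification, the same 100 random draws are reused for every pathway in pathway_AB instead of being redrawn per pathway.

```python
import random


def pathways_hit(genes, pathways):
    """Names of pathways that contain at least one of the given genes."""
    genes = set(genes)
    return {name for name, members in pathways.items() if genes & members}


def permutation_pvalues(list_a, list_b, pathways, n_genes=13, n_iter=100, seed=0):
    """For each pathway shared by both lists, the fraction of random draws that also hit it."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    # Assumption: random genes are drawn from all genes in all pathways.
    universe = sorted(set().union(*pathways.values()))
    # Steps 1-3: pathways touched by list A, by list B, and by both.
    pathway_ab = pathways_hit(list_a, pathways) & pathways_hit(list_b, pathways)
    n_rand = dict.fromkeys(pathway_ab, 0)
    for _ in range(n_iter):                                   # step 4-1
        rand_genes = rng.sample(universe, n_genes)            # step 4-1-1
        rand_pathway = pathways_hit(rand_genes, pathways)     # step 4-1-2
        for name in pathway_ab:                               # step 4-2
            if name in rand_pathway:
                n_rand[name] += 1
    return {name: n_rand[name] / n_iter for name in pathway_ab}  # step 4-3


# Hypothetical toy data: one tiny pathway and one that covers almost every gene.
pathways = {
    "P_small": {"g0", "g1"},
    "P_big": {f"g{i}" for i in range(2, 100)},
}
list_A = ["g0"] + [f"g{i}" for i in range(10, 22)]
list_B = ["g1"] + [f"g{i}" for i in range(30, 42)]

pvalues = permutation_pvalues(list_A, list_B, pathways)
for name in sorted(pvalues):
    print(name, pvalues[name])
```

With 100 repeats the smallest non-zero p-value is 0.01. A pathway that contains most genes, like `P_big` here, is hit by nearly every random draw and so can never come out significant.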

Thank you!

Why do you only have 13 genes to start with in each of A and B? Secondly, you don't seem to account for genes that occur in more than one pathway, or for overlap between pathways.