Question: gene enrichment analysis


namjoub50 wrote:

If we have a gene or TF that binds to 75% of human genes, let's say 15,000 of 20,000 genes, what statistical preparation or method should I use to do a correct enrichment analysis? In this test, let's say a pathway has 50 genes and 28 of them are bound by this TF. I don't think the Fisher's exact test p-value as described in DAVID or GSEA is the correct approach. Thanks.

It is perfectly acceptable to use Fisher's exact test for enrichment analysis. What you're asking is this: given that I sampled 15k genes out of 20k, what is the chance that I would pick at least 28 of the 50 genes that are members of this pathway if the 15k genes were randomly selected?

Wouldn't it be "the chance that 28 genes would come from the 15k if the 50 were selected at random from the 20k"?

I am not sure I follow; maybe you're saying the same thing. What the experiment does is select 15k balls from an urn that contains 20k balls. In the urn, you have 50 red balls and the rest are other colors, and you're asking: what is the chance of getting this many red balls or more among those I picked (the 15k) if I had picked them at random?
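The urn calculation described above can be sketched with the Python standard library alone. This is a minimal illustration using the numbers from the question, not the implementation used by DAVID or GSEA; the function and variable names are made up for the example. The hypergeometric distribution is symmetric in its two margins, so "draw the 15k bound genes and count pathway genes" and "draw the 50 pathway genes and count bound genes" give the same probabilities. The code uses the second form because the binomial coefficients are smaller.

```python
from math import comb

def hypergeom_pmf(k, total, bound, pathway):
    """P(exactly k of the pathway genes are among the bound genes)."""
    # Choose k pathway genes from the bound set and the remaining
    # (pathway - k) from the unbound set, out of all ways to choose
    # `pathway` genes from `total`.
    return comb(bound, k) * comb(total - bound, pathway - k) / comb(total, pathway)

# Numbers from the question (illustrative variable names).
total, bound, pathway, overlap = 20000, 15000, 50, 28

# Overlap expected by chance: 50 * 0.75 = 37.5 genes.
expected = pathway * bound / total

# Upper tail: chance of seeing `overlap` or more bound pathway genes.
p_enriched = sum(hypergeom_pmf(k, total, bound, pathway)
                 for k in range(overlap, pathway + 1))

# Lower tail: chance of seeing `overlap` or fewer bound pathway genes.
p_depleted = sum(hypergeom_pmf(k, total, bound, pathway)
                 for k in range(0, overlap + 1))

print(f"expected overlap: {expected}")
print(f"P(X >= {overlap}) = {p_enriched:.4g}")
print(f"P(X <= {overlap}) = {p_depleted:.4g}")
```

With these numbers the expected overlap is 37.5, so the observed 28 is below what chance alone would give. The upper-tail (enrichment) p-value is therefore close to 1, and the lower tail is the informative one.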

Ah, you're right. What I was thinking of would be the chance of nature making those 50 balls red in the first place. What you say makes more sense.

In this situation, when the background is so frequent (15k binding out of a possible 20k), I think there should be some additional correction before the Fisher test, but I am not sure how.

Fisher's exact test is valid regardless of sample size. It is used primarily for small samples because the chi-squared approximation is considered too inaccurate when counts are small.
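To make the comparison concrete, here is a rough sketch that computes both a Pearson chi-squared p-value and a two-sided exact p-value for the 2x2 table implied by the question. The table layout and the names are assumptions made for this example. The two-sided exact p-value follows one common convention: it sums the probabilities of all tables that are no more likely than the observed one.

```python
from math import comb, erfc, sqrt

# 2x2 table from the question's numbers (layout is an assumption):
#                   bound   not bound
# in pathway          a=28      b=22
# not in pathway   c=14972    d=4978
a, b, c, d = 28, 22, 14972, 4978
n = a + b + c + d

# Pearson chi-squared statistic without continuity correction (1 df).
chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
# Upper tail of a chi-squared variable with 1 degree of freedom.
p_chi2 = erfc(sqrt(chi2 / 2))

def table_prob(k):
    """Hypergeometric probability of k bound genes in the pathway, margins fixed."""
    return comb(a + c, k) * comb(b + d, a + b - k) / comb(n, a + b)

# Two-sided exact p-value: sum over tables no more likely than the observed
# one (the small relative tolerance guards against floating-point ties).
p_obs = table_prob(a)
p_exact = sum(table_prob(k) for k in range(a + b + 1)
              if table_prob(k) <= p_obs * (1 + 1e-7))

print(f"chi-squared = {chi2:.3f}, p = {p_chi2:.4g}")
print(f"exact two-sided p = {p_exact:.4g}")
```

Here every expected cell count is well above 5 (the smallest is 12.5), so the chi-squared approximation is in the range where it is usually considered acceptable. The exact test needs no such condition.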
