Question: Fisher's exact test gives p-value 0
Adrian Pelin (2.3k), Canada, wrote 3.1 years ago:

Hello,

I have a situation similar to the one described in the post "Hypergeometric Test On Gene Set".

I have two microarrays from two different conditions, which give me two different sets of differentially expressed transcripts.

Diff in Condition 1: 738

Diff in Condition 2: 1090

Overlap Condition 1 & 2: 453

Total Genes in array: 30941

I want to test the significance of the overlap between the 2 conditions. I use:

phyper(452, 738, 30203, 1090, lower.tail=FALSE)

[1] 0

Any idea why the p-value is 0? I based the call on this post: http://stats.stackexchange.com/questions/16247/calculating-the-probability-of-gene-list-overlap-between-an-rna-seq-and-a-chip-c

phyper(overlap - 1, list1, PopSize - list1, list2, lower.tail = FALSE)

Thanks

written 3.1 years ago by Adrian Pelin (2.3k)

You should try using log.p = TRUE.

written 3.1 years ago by russhh (4.6k)

I get:

phyper(452, 738, 30203, 1090, lower.tail=FALSE, log.p = TRUE)

[1] -1140.21

Any idea what that means? Is the p-value 1E-1140?

written 3.1 years ago by Adrian Pelin (2.3k)

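It is e^-1140.21, since the log here is the natural log. In base 10 that is roughly 10^-495, far below the smallest positive double (about 5e-324), which is why R prints 0 without log.p.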
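To see the magnitude without underflow, one option is to stay on the log scale and convert to base 10. This is a minimal sketch in base R using the arguments from the question; the variable names are illustrative only.

```r
# Upper-tail hypergeometric probability on the natural-log scale,
# using the same arguments as in the question.
log_p <- phyper(452, 738, 30203, 1090, lower.tail = FALSE, log.p = TRUE)

# Convert from natural log to log base 10.
log10_p <- log_p / log(10)

# Split into exponent and mantissa so the value can be written in
# scientific notation even though exp(log_p) underflows to 0.
exponent <- floor(log10_p)
mantissa <- 10^(log10_p - exponent)

cat(sprintf("p is roughly %.1fe%d\n", mantissa, as.integer(exponent)))
```

Keeping the value as a log avoids ever forming the p-value as a double, which is the step that produces the 0.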

written 3.1 years ago by Devon Ryan (91k)

That number is still 0 when using any calculator. My question is, why is the p-value so low? The overlap is not that great, it is ~50-70% of genes. Is the 2x2 table constructed correctly?

written 3.1 years ago by apelin20470

You're calculating the probability of the following scenario:

  • You have a jar of 30203 black balls and 738 white balls
  • You draw 1090 of them randomly without replacement
  • You count the white balls you have drawn: you observed 453, and the call asks about more than 452
  • The probability of drawing more than 452 white balls (that is, 453 or more) under these conditions is virtually zero
  • Conversely, the probability of drawing 452 or fewer white balls under these conditions is virtually one

In a jar where only ~2.4% of the balls are white, a random draw of 1090 is expected to contain about 26 white balls. Drawing 453, which is ~42% of the draw and ~61% of all the white balls in the jar, would be extraordinarily rare by chance alone, which is why your p-value is so low.
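The jar analogy can be checked numerically, which also answers whether the 2x2 table is set up correctly. The sketch below, in base R with illustrative variable names, rebuilds the table from the four numbers in the question, computes the overlap expected by chance, and runs a one-sided Fisher's exact test on the same table. The one-sided Fisher p-value is the same hypergeometric upper tail that phyper computes, so it underflows in the same way.

```r
# Numbers taken from the question.
total   <- 30941  # genes on the array
list1   <- 738    # differentially expressed in condition 1 (white balls)
list2   <- 1090   # differentially expressed in condition 2 (balls drawn)
overlap <- 453    # genes in both lists (white balls drawn)

# 2x2 contingency table: rows = in list 1 (yes/no), columns = in list 2 (yes/no).
tab <- matrix(
  c(overlap,         list1 - overlap,
    list2 - overlap, total - list1 - list2 + overlap),
  nrow = 2, byrow = TRUE,
  dimnames = list(list1 = c("yes", "no"), list2 = c("yes", "no"))
)

# The four cells must add up to the array size.
stopifnot(sum(tab) == total)

# Overlap expected by chance: draw size times the fraction of white balls.
expected <- list2 * list1 / total

# One-sided Fisher's exact test on the same table.
ft <- fisher.test(tab, alternative = "greater")

print(tab)
cat("expected overlap by chance:", round(expected, 1), "\n")
cat("observed overlap:", overlap, "\n")
cat("Fisher one-sided p-value:", ft$p.value, "\n")
```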

written 3.1 years ago by Steven Lakin (1.4k)

"The overlap is not that great, it is ~50-70% of genes"

That's why I think p-values in genomics are often meaningless. You get very small p-values even when the effect size is small, and this is a consequence of the large size of the data sets available (thousands of genes, millions of SNPs, etc.). By the way, I wouldn't say ~50-70% is a small overlap...
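One way to act on this is to report an effect size next to the p-value. The following is a minimal sketch in base R using the counts from the question; the choice of fold enrichment and the sample odds ratio as summaries is an illustration, not something proposed in this thread, and the variable names are made up for the example.

```r
# Counts from the question.
total   <- 30941
list1   <- 738
list2   <- 1090
overlap <- 453

# Fold enrichment: observed overlap relative to the overlap expected by chance.
expected        <- list1 * list2 / total
fold_enrichment <- overlap / expected

# Sample odds ratio from the 2x2 table (cross-product ratio).
only1      <- list1 - overlap
only2      <- list2 - overlap
neither    <- total - list1 - list2 + overlap
odds_ratio <- (overlap * neither) / (only1 * only2)

# Share of each list that falls in the overlap.
frac_list1 <- overlap / list1
frac_list2 <- overlap / list2

cat(sprintf("fold enrichment: %.1f\n", fold_enrichment))
cat(sprintf("odds ratio:      %.1f\n", odds_ratio))
cat(sprintf("overlap / list1: %.0f%%\n", 100 * frac_list1))
cat(sprintf("overlap / list2: %.0f%%\n", 100 * frac_list2))
```

Unlike the p-value, these quantities do not shrink or grow just because the array contains many genes, so they are easier to compare across experiments.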

written 3.1 years ago by dariober (10k)