Question: Statistical test for overlap
pixie@bioinfo (1.4k), Université Paris, Saclay, wrote 3 months ago:

Hello, I have a Venn diagram with a list of up- and down-regulated genes from experiment 1. I compared it with another gene list (from experiment 2) and got the overlap: 108 genes are Up, 309 genes are Down and 41 genes are not differentially expressed.

My NULL hypothesis is that the overlap is split 50-50 between up- and down-regulated genes (that is, random). I want to show that the proportion of down-regulated genes (as shown in the pie chart) differs significantly from that. What kind of statistical test should I do here? Thanks.


e.rempel (890), Germany, Heidelberg, COS, wrote 3 months ago:

Hi,

if I understood your question correctly, I would use the binomial test. Let me explain.

There are 417 genes in the overlap between Exp2 and Exp1 (Up and Down combined: 108 + 309). Your NULL suggests that these genes are distributed between Up and Down with probability 0.5 for each subset (as you said, 50-50). In the lingo of the binomial test, that means you have 417 trials, 309 successes and a success probability of 0.5. Thus, the way to compute the binomial test in R would be

binom.test(x = 309, n = 417, p = 0.5)

I obtained a p-value of less than 2.2e-16.

HTH


It should be p = 5224/(5646+5224) instead of 0.5.

Reply written 3 months ago by Asaf (8.4k)

In this case shouldn't it be p = (5224 + 309)/(5224 + 309 + 5646 + 108) ? :)

Reply written 3 months ago by e.rempel (890)

Oh yeah, right. Another reason to not use Venn diagrams :)

Reply written 3 months ago by Asaf (8.4k)
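The corrected null probability from the replies above can be sketched in R as follows. This is only an illustration: it assumes that 5224 and 5646 are the Down and Up genes of Exp1 that fall outside the overlap (read off the Venn diagram, which is not reproduced here), and the variable names are made up for the example.

```r
# Counts quoted in the thread; what they stand for is an assumption:
# Exp1 genes outside the overlap (down_only, up_only) and inside it.
down_only    <- 5224
up_only      <- 5646
down_overlap <- 309
up_overlap   <- 108

# Background probability that a regulated Exp1 gene is Down
p_background <- (down_only + down_overlap) /
  (down_only + down_overlap + up_only + up_overlap)

# Binomial test of the overlap against that background instead of 0.5
res <- binom.test(x = down_overlap,
                  n = down_overlap + up_overlap,
                  p = p_background)

print(p_background)
print(res$p.value)
```

Under this reading, the null no longer says "50-50" but "the overlap has the same Down fraction as all regulated genes in Exp1".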

You can get around the 2.2e-16 limit in R's printed output with binom.test(...)$p.value. With your numbers the p-value is 1.6e-23.

Reply written 3 months ago (modified 3 months ago) by jomo018 (610)
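A short sketch of that tip, using the counts from the question. The "< 2.2e-16" only appears when the test result is printed; the object returned by `binom.test` stores the p-value as an ordinary number, along with the estimated proportion and its confidence interval.

```r
# Same test as in the answer above, but keep the result object
res <- binom.test(x = 309, n = 417, p = 0.5)

# Exact two-sided p-value as a plain number (not truncated for printing)
print(res$p.value)

# Observed proportion of Down genes in the overlap (309/417)
print(res$estimate)

# 95% confidence interval for that proportion
print(res$conf.int)

# Compact scientific notation for reporting
print(format(res$p.value, digits = 3, scientific = TRUE))
```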
H.Hasani (970), Freiburg, Germany, wrote 3 months ago:

I would use a proportion test. As the name says, the null hypothesis is that the proportion in each set is the same. It helps you answer questions such as: is the proportion of males in group A higher than the proportion of females in group B (test for two proportions), or is the proportion of males in a group similar to, higher than or lower than that in the entire population (test for one proportion)?
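As a rough sketch, both variants can be run with R's `prop.test`. The one-proportion call uses only the counts from the question. The two-proportion call additionally assumes that the 5224 Down and 5646 Up genes quoted in the comments on the binomial-test answer are the Exp1 genes outside the overlap, which is a guess at what the Venn diagram shows.

```r
# One-proportion test: is the Down fraction in the overlap different from 0.5?
one_prop <- prop.test(x = 309, n = 417, p = 0.5)
print(one_prop$p.value)

# Two-proportion test: Down fraction inside vs. outside the overlap.
# ASSUMPTION: 5224 Down and 5646 Up genes lie outside the overlap.
two_prop <- prop.test(x = c(309, 5224),
                      n = c(309 + 108, 5224 + 5646))
print(two_prop$estimate)  # the two observed Down fractions
print(two_prop$p.value)
```

For the one-proportion case, `prop.test` uses a chi-squared approximation while `binom.test` is exact, so the p-values differ slightly but point the same way with counts this large.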

Asaf (8.4k), Israel, wrote 3 months ago:

I think you can ask better questions, such as how the distribution of LFC for genes in the group compares with that for genes outside the group. Try plotting the LFC distributions of the two groups (in Exp2 and not in Exp2) as a violin plot, for instance, or make an MA plot colored according to whether each gene is in Exp2 or not. It will show much more than what you chose to present and test. (By the way, neither an Euler diagram nor a pie chart is a good choice for presenting data; there are better alternatives.)
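A minimal sketch of the violin-plot idea, in R. The data here are simulated placeholders, not the poster's results, and the column names `lfc` and `group` are invented for the example; in practice they would come from the Exp1 differential expression table with a flag for Exp2 membership. The plot is drawn only if the ggplot2 package happens to be installed.

```r
set.seed(1)

# SIMULATED placeholder data: log fold changes for genes in / not in Exp2
genes <- data.frame(
  lfc   = c(rnorm(400, mean = -0.5), rnorm(2000, mean = 0)),
  group = rep(c("in Exp2", "not in Exp2"), times = c(400, 2000))
)

# Numeric summary of the two LFC distributions
print(tapply(genes$lfc, genes$group, summary))

# Violin plot of LFC per group (skipped when ggplot2 is not available)
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(genes, ggplot2::aes(x = group, y = lfc)) +
    ggplot2::geom_violin() +
    ggplot2::labs(x = NULL, y = "log fold change (LFC)")
  print(p)
}
```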
