Question

Why does wilcox.test keep giving p-value = 0.1 even when there is an obvious difference between the groups?

4.2 years ago

Raheleh ▴ 260

Hello all, I ran ssGSEA on my gene expression profile and then, because the ssGSEA scores are so large, I used ssGSEA.normalize() from the multiGSEA package in R to normalize them. Then I drew a boxplot with ggplot2 to see whether there is a significant difference between my 2 groups. This is part of my data:

HALLMARK_PANCREAS   group
1         0.6357577 control
2         0.6139007 control
3         0.6221403 control
4         0.8393437   hROBO
5         0.8703753   hROBO
6         0.8530723   hROBO

This is the script to draw the boxplot in R:

library(ggplot2)
library(ggbeeswarm)  # geom_quasirandom()
library(ggpval)      # add_pval()

g <- ggplot(df, aes(x = group, y = HALLMARK_PANCREAS, color = group)) +
  geom_boxplot() +
  geom_quasirandom(width = .1) +
  labs(y = "ssGSEA score") +
  theme_bw() +
  ggtitle("HALLMARK_PANCREAS") +
  theme(legend.position = "none",
        plot.title = element_text(hjust = 0.5),
        axis.title.x = element_blank())
# add p-value
add_pval(g, pairs = list(c(1, 2)), test = 'wilcox.test')

This is the output:

[Figure: boxplot of HALLMARK_PANCREAS ssGSEA score by group, annotated with the wilcox.test p-value]

I also checked another way to see whether the difference is significant, for example this one:

g+geom_signif(comparisons = list(c("control", "hROBO")), 
              map_signif_level=TRUE, color = "black")

This is the output:

[Figure: the same boxplot with the geom_signif annotation]

I am wondering why this difference is not significant according to the Wilcoxon test. Even when I change the data and increase the difference between the two groups' values, it still keeps giving me p = 0.1! Could anyone please help me? I am really confused. Am I doing something wrong?

I really appreciate any help!

wilcox.test ssGSEA significant level ggplot • 2.2k views

ADD COMMENT • link updated 4.2 years ago by Constantine ▴ 290 • written 4.2 years ago by Raheleh ▴ 260

You can try running the test manually to see how the different inputs impact the results. For example:

wilcox.test(c(.635, .613, .622), c(.839, .870, .853))

If you add more replicates, you'll see the p-value quickly decrease.
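
A minimal sketch of that check, using the rounded values from the comment above. The extra fourth replicates are made-up numbers for illustration only, not real measurements. Because the rank-sum test only looks at the ordering of the values, widening the gap between the groups leaves the p-value unchanged, while adding replicates lowers it:

```r
# Rounded scores from the question
control <- c(.635, .613, .622)
hrobo   <- c(.839, .870, .853)

# 3 vs 3, fully separated groups
p3 <- wilcox.test(control, hrobo)$p.value

# Widening the gap does not change the ranks, so the p-value stays the same
p3_shifted <- wilcox.test(control, hrobo + 100)$p.value

# Hypothetical fourth replicate per group (made-up values, for illustration)
control4 <- c(control, .640)
hrobo4   <- c(hrobo, .845)
p4 <- wilcox.test(control4, hrobo4)$p.value

print(c(n3 = p3, n3_shifted = p3_shifted, n4 = p4))
```

With 3 vs 3 both calls return 0.1; with the hypothetical 4 vs 4 the p-value falls to 2/70, about 0.029.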

ADD REPLY • link 4.2 years ago by igor 13k

score 7 · Answer 1 · 2020-02-09

4.2 years ago

IP ▴ 760

With a sample size of 3 you don't have enough statistical power to detect differences.

You can learn about power here
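
To make the power argument concrete: for an exact two-sided rank-sum test, the most extreme outcome (every value in one group below every value in the other) has p = 2 / choose(n1 + n2, n1). The helper below, `min_p`, is a hypothetical name introduced here for illustration; it is not part of base R. It assumes no ties, so that the exact test applies.

```r
# Smallest two-sided p-value the exact rank-sum test can give
# for group sizes n1 and n2 (no ties assumed).
# min_p is an illustrative helper, not a base R function.
min_p <- function(n1, n2) 2 / choose(n1 + n2, n1)

sizes <- 3:6
floor_p <- sapply(sizes, function(n) min_p(n, n))
names(floor_p) <- paste0("n=", sizes)
print(floor_p)

# Smallest equal group size (among those tried) whose floor is below 0.05
min_n <- sizes[which(floor_p < 0.05)[1]]
print(min_n)
```

With 3 samples per group the floor is 0.1, so p < 0.05 is out of reach however far apart the groups are; at 4 per group the floor drops to about 0.029.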

ADD COMMENT • link 4.2 years ago by IP ▴ 760

score 3 · Answer 2 · 2020-02-09

4.2 years ago

Constantine ▴ 290

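Because wilcox.test needs more samples than you have in order to reach significance. With only 3 samples per condition, the exact test cannot return a two-sided p-value below 0.1. Therefore, a t.test is more suitable here, provided the data are roughly normal.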
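
As a rough sketch of that suggestion, the rounded values from the thread can be passed to t.test. By default R runs Welch's version, which does not assume equal variances but does still assume approximately normal data, something that cannot really be checked with 3 values per group:

```r
# Rounded scores from the question
control <- c(.635, .613, .622)
hrobo   <- c(.839, .870, .853)

# Welch two-sample t-test (the default in R: var.equal = FALSE)
tt <- t.test(control, hrobo)
print(tt$p.value)   # well below 0.05 for these values

# Same data, rank-based test, for comparison
pw <- wilcox.test(control, hrobo)$p.value
print(pw)
```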

ADD COMMENT • link 4.2 years ago by Constantine ▴ 290

I don't know if we can safely assume that the data is normally distributed.

ADD REPLY • link 4.2 years ago by igor 13k

Thanks! I thought it was quite the other way around. Another concern of mine: shouldn't the data be normally distributed when we use a parametric statistical test such as the t-test?

ADD REPLY • link 4.2 years ago by Raheleh ▴ 260

This is true. But as igor pointed out above, you need more replicates for a non-parametric test.

ADD REPLY • link 4.2 years ago by Constantine ▴ 290