Why does wilcox.test keep giving p-value = 0.1 even when there is a clear difference between the groups?
4.2 years ago
Raheleh ▴ 260

Hello all, I ran ssGSEA on my gene expression profile and, because the ssGSEA scores are so large, I normalized them with ssGSEA.normalize() from the multiGSEA package in R. Then I drew a boxplot with ggplot2 to see whether there is a significant difference between my 2 groups. This is part of my data:

HALLMARK_PANCREAS   group
1         0.6357577 control
2         0.6139007 control
3         0.6221403 control
4         0.8393437   hROBO
5         0.8703753   hROBO
6         0.8530723   hROBO

This is the script I used to draw the boxplot in R:

library(ggplot2)
library(ggbeeswarm)  # provides geom_quasirandom()
library(ggpval)      # provides add_pval()

g <- ggplot(df, aes(x = group, y = HALLMARK_PANCREAS, color = group)) +
  geom_boxplot() +
  geom_quasirandom(width = .1) +
  labs(y = "ssGSEA score") +
  theme_bw() +
  ggtitle("HALLMARK_PANCREAS") +
  theme(legend.position = "none",
        plot.title = element_text(hjust = 0.5),
        axis.title.x = element_blank())
# add p-value
add_pval(g, pairs = list(c(1, 2)), test = "wilcox.test")

this is the output:

[image: plot not preserved]

I also tried another way to check whether the difference is significant. For example, this one:

library(ggsignif)  # provides geom_signif()

g + geom_signif(comparisons = list(c("control", "hROBO")),
                map_signif_level = TRUE, color = "black")

This is the output:

[image: plot not preserved]

I am wondering why this difference is not significant according to the Wilcoxon test. Even when I change the data and increase the difference between the two groups' values, it still gives me p = 0.1! Could anyone please help me? I am really confused. Am I doing something wrong?

I really appreciate any help!

wilcox.test ssGSEA significant level ggplot • 2.2k views

You can try running the test manually to see how the different inputs impact the results. For example:

wilcox.test(c(.635, .613, .622), c(.839, .870, .853))

If you add more replicates, you'll see the p-value quickly decrease.
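To make the effect of replicates concrete, here is a minimal sketch. It assumes the two groups are completely separated (every value in one group is below every value in the other) and that there are no ties, so wilcox.test() uses its exact method. Under those assumptions the smallest possible two-sided p-value is 2 / choose(2n, n) for n replicates per group. The variable names are illustrative only.

```r
# Smallest two-sided p-value the exact Wilcoxon rank-sum test can return
# when two groups of n replicates each are completely separated.
n_per_group <- 3:6
min_p_formula <- 2 / choose(2 * n_per_group, n_per_group)

# The same quantity obtained from wilcox.test() on perfectly separated toy data.
min_p_test <- sapply(n_per_group, function(n) {
  wilcox.test(1:n, (n + 1):(2 * n))$p.value
})

print(data.frame(n_per_group, min_p_formula, min_p_test))
```

With 3 replicates per group there are only choose(6, 3) = 20 ways to assign the ranks, so the two-sided p-value cannot go below 2/20 = 0.1. With 4 per group the floor drops to 2/70, which is already below 0.05.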

4.2 years ago
IP ▴ 760

With a sample size of 3 you don't have enough statistical power to detect differences.

You can learn about power here
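A small sketch of what the lack of power looks like in practice. The Wilcoxon test only uses the ranks of the values, so once the two groups no longer overlap, making the gap bigger cannot change the p-value. The control values are the rounded ones from the comment above, and the shifts of 0.2 and 1000 are arbitrary choices for illustration.

```r
# Rounded control values from the thread; the shifted copies stand in for
# a second group that is moved further and further away.
control <- c(0.635, 0.613, 0.622)

p_small_shift <- wilcox.test(control, control + 0.2)$p.value
p_huge_shift  <- wilcox.test(control, control + 1000)$p.value

# Both comparisons produce the same ranks, hence the same p-value.
print(c(small = p_small_shift, huge = p_huge_shift))
```

Both p-values come out the same because the rank pattern is identical in the two comparisons. This is why changing the data to increase the difference does not help with 3 samples per group.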

4.2 years ago
Constantine ▴ 290

Because wilcox.test needs more samples than you have: with only 3 samples per condition, its two-sided p-value cannot go below 0.1, no matter how large the difference is. Therefore, in your case a t.test is more suitable.
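As a rough illustration, both tests can be run on the six values posted in the question. This is only a sketch: t.test() defaults to the Welch variant and assumes approximately normal data, which cannot really be checked with 3 values per group, so the result should be read with that caveat in mind.

```r
# The six ssGSEA scores posted in the question.
df <- data.frame(
  HALLMARK_PANCREAS = c(0.6357577, 0.6139007, 0.6221403,
                        0.8393437, 0.8703753, 0.8530723),
  group = rep(c("control", "hROBO"), each = 3)
)

# Rank-based test: limited by the number of possible rank arrangements.
p_wilcox <- wilcox.test(HALLMARK_PANCREAS ~ group, data = df)$p.value

# Welch t-test: uses the actual magnitudes, but assumes approximate normality.
p_welch <- t.test(HALLMARK_PANCREAS ~ group, data = df)$p.value

print(c(wilcox = p_wilcox, welch_t = p_welch))
```

The t-test can report a much smaller p-value here because it uses the size of the difference relative to the within-group spread, not just the ordering of the values.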


I don't know if we can safely assume that the data is normally distributed.


Thanks! I thought it was quite the other way around. Another concern of mine: shouldn't the data be normally distributed when we use a parametric statistical test such as the t-test?


This is true. But, as igor pointed out above, you need more replicates for a non-parametric test.
