Question: Why does wilcox.test keep giving p-value = 0.1 even when there is a clear difference between the groups?
Rahil170 wrote:

Hello all, I ran ssGSEA on my gene expression profile, and because the ssGSEA scores are so large I used ssGSEA.normalize() from the multiGSEA package in R to normalize them. Then I drew a boxplot using ggplot2 to see whether there is a significant difference between my 2 groups. This is part of my data:

HALLMARK_PANCREAS   group
1         0.6357577 control
2         0.6139007 control
3         0.6221403 control
4         0.8393437   hROBO
5         0.8703753   hROBO
6         0.8530723   hROBO

This is the script to draw the boxplot in R:

library(ggplot2)
library(ggbeeswarm)  # provides geom_quasirandom()
library(ggpval)      # provides add_pval()

g <- ggplot(df, aes(x = group, y = HALLMARK_PANCREAS, color = group)) +
  geom_boxplot() +
  geom_quasirandom(width = .1) +
  labs(y = "ssGSEA score") +
  theme_bw() +
  ggtitle("HALLMARK_PANCREAS") +
  theme(legend.position = "none", plot.title = element_text(hjust = 0.5), axis.title.x = element_blank())
# add p-value
add_pval(g, pairs = list(c(1, 2)), test = 'wilcox.test')

This is the output:

enter image description here

I also checked another way to see whether the difference is significant. For example, this one:

g+geom_signif(comparisons = list(c("control", "hROBO")), 
              map_signif_level=TRUE, color = "black")

This is the output:

enter image description here

I am wondering why this difference is not significant according to the Wilcoxon test. Even when I change the data and increase the difference between the two groups' values, it still keeps giving me p = 0.1! Could anyone please help me? I am really confused. Am I doing something wrong?

I really appreciate any help!

modified 7 months ago by Constantine280 • written 7 months ago by Rahil170

You can try running the test manually to see how the different inputs impact the results. For example:

wilcox.test(c(.635, .613, .622), c(.839, .870, .853))

If you add more replicates, you'll see the p-value quickly decrease.
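
To make this concrete, here is a minimal sketch in base R using the six scores from the question. The shifted values in the second call and the extra fourth and fifth replicates are made up purely for illustration. Because the test only looks at ranks, the p-value depends on the ordering and the group sizes, not on how far apart the groups are:

```r
# Scores from the question: every hROBO value exceeds every control value.
control <- c(0.6357577, 0.6139007, 0.6221403)
hrobo   <- c(0.8393437, 0.8703753, 0.8530723)
p_orig <- wilcox.test(control, hrobo)$p.value

# Hypothetical, much larger gap: the ranks are unchanged, so the p-value is too.
p_big_gap <- wilcox.test(control, hrobo + 100)$p.value

# Hypothetical extra replicates (invented values), groups still fully separated.
p_n4 <- wilcox.test(c(control, 0.628), c(hrobo, 0.861))$p.value
p_n5 <- wilcox.test(c(control, 0.628, 0.641), c(hrobo, 0.861, 0.847))$p.value

print(c(n3 = p_orig, n3_big_gap = p_big_gap, n4 = p_n4, n5 = p_n5))
```

With 3 vs 3 both calls return 0.1; with 4 vs 4 and 5 vs 5 fully separated groups the p-value drops to 2/70 and 2/252.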

modified 7 months ago • written 7 months ago by igor11k
IP700 wrote:

With a sample size of 3 you don't have enough statistical power to detect differences.

You can learn about power here
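
One way to see the lack of power for a rank test specifically: with no ties, an exact two-sided Wilcoxon rank-sum test cannot return a p-value smaller than 2 / choose(n1 + n2, n1), however large the difference is. The sketch below tabulates that floor for a few group sizes; the helper name `min_wilcox_p` is hypothetical, not part of any package.

```r
# Smallest two-sided p-value of an exact Wilcoxon rank-sum test with group
# sizes n1 and n2 (assuming no ties): the most extreme ordering is 1 of
# choose(n1 + n2, n1) equally likely orderings under the null hypothesis,
# and the two-sided test doubles that probability.
min_wilcox_p <- function(n1, n2) 2 / choose(n1 + n2, n1)

n <- 3:6
floor_p <- setNames(min_wilcox_p(n, n), paste0("n=", n))
print(round(floor_p, 4))
```

For 3 samples per group the floor is 2/20 = 0.1, which is exactly the value reported in the question; at least 4 per group are needed before p < 0.05 becomes possible.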

written 7 months ago by IP700
Constantine280 wrote:

Because wilcox.test needs a larger sample size than this to reach significance: with only 3 samples per condition, the smallest two-sided p-value an exact Wilcoxon rank-sum test can return is 0.1. Therefore, a t.test is more suitable.

modified 7 months ago • written 7 months ago by Constantine280

I don't know if we can safely assume that the data is normally distributed.

written 7 months ago by igor11k

Thanks! I thought it was the other way around. Another concern of mine: shouldn't the data be normally distributed when we use a parametric statistical test such as the t-test?

written 7 months ago by Rahil170

This is true. But, as igor pointed out above, you need more replicates for a non-parametric test.
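
For comparison, here is a rough sketch of both tests on the six scores from the question, using only base R. It does not settle the normality concern, which cannot realistically be checked with 3 values per group; it only shows how differently the two tests behave on the same data.

```r
control <- c(0.6357577, 0.6139007, 0.6221403)
hrobo   <- c(0.8393437, 0.8703753, 0.8530723)

# Welch two-sample t-test (R's default, var.equal = FALSE). It uses the size
# of the difference relative to the spread, so it is not capped at 0.1.
tt <- t.test(control, hrobo)
print(tt$p.value)

# Exact Wilcoxon rank-sum test on the same data: uses ranks only.
wt <- wilcox.test(control, hrobo)
print(wt$p.value)
```

The t-test p-value is only trustworthy to the extent that the normality assumption holds, so with this few replicates it is best read alongside the effect size rather than on its own.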

modified 7 months ago • written 7 months ago by Constantine280