Hypothesis Test For Three Groups
11.8 years ago
lyz10302012 ▴ 470

Hi everyone,

If I have three groups A, B and C, where A and B are from the same distribution and C is not, which test can I use to test for significance? Should I test each pair of groups, collect all the p-values, and then combine them into a single one? If so, by which method? Or do you have any other ideas?

Thanks


Could you be more specific? What kind of data do you have?


Assume every group follows a Gaussian distribution, but with different parameters. A t-test can be used to compare each pair of groups.

11.8 years ago
David W 4.9k

Lyz,

It's not entirely clear what you're asking. Between your question and the comment, it sounds like you have three groups with (potentially) different means, each normally distributed. So, in R, something like:

set.seed(123)
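#grp is made a factor explicitly: data.frame() no longer converts strings to factors in R >= 4.0
ex_data <- data.frame(grp=factor(rep(letters[1:3], each=20)),
                      vals=c(rnorm(n=20, mean=0, sd=1), rnorm(20, 2, 1), rnorm(20, 0, 1)))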
#mean values for each group
with(ex_data, tapply(vals, grp, mean))
#        a         b         c
#0.1416238 1.9487428 0.1064852


Now you want to know if the between-group differences are statistically significant? If that's the case, then you want an ANOVA and a correction to deal with the fact that you are making multiple comparisons:

ex_data.aov <- aov(vals ~ grp, data=ex_data)
TukeyHSD(ex_data.aov)
#  Tukey multiple comparisons of means
#   95% family-wise confidence level
#
#Fit: aov(formula = vals ~ grp, data = ex_data)
#
#$grp
#           diff        lwr        upr     p adj
#b-a  1.80711904  1.1053442  2.5088939 0.0000002
#c-a -0.03513857 -0.7369134  0.6666362 0.9920290
#c-b -1.84225761 -2.5440324 -1.1404828 0.0000001


If you mean that the variances of your groups are different, then you need to do something a little different - maybe Welch's ANOVA (oneway.test() in R), or a bootstrap approach.
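As a rough sketch of the Welch option, assuming simulated data in which the group standard deviations differ (the name welch_data and the sd values are made up for illustration, not taken from the question):

```r
set.seed(123)
# Hypothetical data: same layout as the example above, but with unequal spreads
welch_data <- data.frame(grp = factor(rep(letters[1:3], each = 20)),
                         vals = c(rnorm(20, 0, 1), rnorm(20, 2, 3), rnorm(20, 0, 0.5)))

# Welch's one-way ANOVA: var.equal = FALSE is the default, written out for clarity
welch_res <- oneway.test(vals ~ grp, data = welch_data, var.equal = FALSE)
welch_res$p.value
```

Like the plain ANOVA, this gives one overall p-value for "are all group means equal"; it does not say which groups differ.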

In any case, this will all be covered in even a very introductory stats text. Make sure you understand what the numbers coming back from these tests actually mean before you report them.


Thanks David. Your answer is exactly what I wanted. Can I use a single p-value instead of three p-values to illustrate this condition? For example, max(b-a, b-c) is one possible method. Are there any other methods?


If the question you are asking is "is it likely that all 3 groups come from a single underlying distribution, i.e. there are no differences between groups", then you want the p-value of the ANOVA as a whole. If, on the other hand, you are interested in between-group differences, you need some sort of post-hoc test.

That being said, I'd focus more on the effect size (difference) and confidence intervals than the p-values.
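A minimal sketch of both, reusing the simulated data from the answer above (the names overall_p and pair_effects are just illustrative):

```r
set.seed(123)
# Same simulated data as in the answer; grp is an explicit factor
ex_data <- data.frame(grp = factor(rep(letters[1:3], each = 20)),
                      vals = c(rnorm(20, 0, 1), rnorm(20, 2, 1), rnorm(20, 0, 1)))
ex_data.aov <- aov(vals ~ grp, data = ex_data)

# Single overall p-value: first row (the grp term) of the ANOVA table
overall_p <- summary(ex_data.aov)[[1]][["Pr(>F)"]][1]
overall_p

# Effect sizes: pairwise differences in means with 95% family-wise intervals
pair_effects <- TukeyHSD(ex_data.aov)$grp[, c("diff", "lwr", "upr")]
pair_effects
```

The overall p-value only says that at least one group mean differs; the diff/lwr/upr columns show which pairs differ and by how much.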

11.8 years ago
matted 7.8k

Your question as written isn't very clear, but as I understand it you should simply pool groups A and B and then test the combined population against C.

If you know or assume A and B come from the same distribution and you're not allowing any differences between them, there's no reason to separate them in any way.
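A rough sketch of that pooling approach, assuming made-up Gaussian data (the sample sizes, means and variable names here are hypothetical):

```r
set.seed(123)
# Hypothetical data: A and B share one distribution, C has a shifted mean
a <- rnorm(20, mean = 0, sd = 1)
b <- rnorm(20, mean = 0, sd = 1)
c_grp <- rnorm(20, mean = 2, sd = 1)

# Pool A and B, then compare the pooled sample against C
pooled_ab <- c(a, b)
pooled_test <- t.test(pooled_ab, c_grp)  # Welch two-sample t-test by default
pooled_test$p.value
pooled_test$conf.int
```

Note that this takes "A and B are the same" as given; it does not test that assumption.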

Feel free to add more detail to the question (and maybe a specific example?) if this interpretation is incorrect.


Thanks matted. I want to test the datasets to show that A and B are from the same distribution, while C is not.