Question: Test Whether The Variance In A Group Is Lower Than In Another
Giovanni M Dall'Olio (London, UK) wrote, 10.3 years ago:

I have two groups of data (not normally distributed), and I would like to test the hypothesis that the first group has a lower standard deviation (a narrower spread) than the other.

Put differently, I would like to tell whether the first group is less 'variable' or 'heterogeneous' than the second.

A Kruskal-Wallis test won't do, because it compares the medians of two or more groups, and I am not interested in that.

A Levene or Brown-Forsythe test compares the variances of the groups and tells whether they are equal. This is better, but I would also like to tell whether the variance in the first group is lower than in the other group(s).
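
As a rough illustration of what the Brown-Forsythe variant computes, here is a minimal base-R sketch: take the absolute deviation of each observation from its group median and run a one-way ANOVA on those deviations. The simulated data and the names `g1`, `g2`, `dat`, and `bf_p` are made up for the example. The resulting p-value is two-sided, so on its own it does not say which group is the more variable one.

```r
# Hypothetical skewed data: two groups with different spread
set.seed(1)
g1 <- rexp(50, rate = 2)   # smaller spread
g2 <- rexp(50, rate = 1)   # larger spread

dat <- data.frame(
  value = c(g1, g2),
  group = factor(rep(c("g1", "g2"), times = c(length(g1), length(g2))))
)

# Absolute deviation of each observation from its own group median
dat$absdev <- abs(dat$value - ave(dat$value, dat$group, FUN = median))

# Brown-Forsythe statistic: one-way ANOVA F on the absolute deviations
bf <- anova(lm(absdev ~ group, data = dat))
bf_p <- bf[["Pr(>F)"]][1]
print(bf)
```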

A simple Chi-Square test would tell me whether the standard deviation of a group is equal to a certain value, and the one-tailed version can tell me whether it is higher/lower.
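
For reference, the one-sample chi-square test of a variance uses the statistic (n - 1) * s^2 / sigma0^2, which follows a chi-square distribution with n - 1 degrees of freedom under the null hypothesis if the data are normal. The sketch below assumes a helper named `chisq_var_test`, which is hypothetical and not part of base R, and note that the normality assumption is exactly what does not hold for the data in question.

```r
# Hypothetical helper (not part of base R): chi-square test for one variance
chisq_var_test <- function(x, sigma0, alternative = c("less", "greater")) {
  alternative <- match.arg(alternative)
  n <- length(x)
  # Test statistic; chi-square(n - 1) under H0, assuming x is normal
  stat <- (n - 1) * var(x) / sigma0^2
  p <- if (alternative == "less") {
    pchisq(stat, df = n - 1)                      # H1: sd < sigma0
  } else {
    pchisq(stat, df = n - 1, lower.tail = FALSE)  # H1: sd > sigma0
  }
  list(statistic = stat, df = n - 1, p.value = p)
}

# Made-up sample, tested against a reference standard deviation of 1
set.seed(2)
x <- rnorm(40, mean = 0, sd = 0.8)
res <- chisq_var_test(x, sigma0 = 1, alternative = "less")
print(res)
```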

An additional difficulty is that I would have to do this as a two-way test, because I have two grouping variables. Can you point me in any direction or give me a hint? I don't have many ideas about where to search :-)

Tags: R, statistics

What is your non-normality assumption based on? Have you thought about transforming the data (with log transformation, for example) to be more normal?

Reply written 10.3 years ago by Yuri

You might also want to ask that question on stats.stackexchange.com. It's populated by lots of true-blooded statisticians who eat this stuff for breakfast.

Reply written 9.8 years ago by David Quigley

Hi Giovanni, how did you end up solving this? I ran into a very similar problem.

Reply written 16 months ago by A. Domingues

Matt Parker (Denver, CO) wrote, 10.3 years ago:

Is bootstrapping a possibility? Resample from your data with replacement, calculate the variance, and repeat. This should leave you with a vector of bootstrapped variance estimates for each of your groups. Then perform the appropriate test on those estimates (e.g., a t-test if you're comparing two groups and the estimates turn out to be normally distributed).

I think the boot package is the norm for resampling in R, but here's some untested code to clarify the idea:

set.seed(42)  # make the simulation reproducible

# Simulate two groups whose standard deviations differ slightly
n <- 1000
x <- rnorm(n, mean = 0, sd = 1)
y <- rnorm(n, mean = 0, sd = 1.1)

# Bootstrap: resample each group with replacement and store the variance
nboots <- 10000
bootvar.x <- numeric(nboots)
bootvar.y <- numeric(nboots)

for (i in seq_len(nboots)) {
  bootvar.x[i] <- var(sample(x, size = n, replace = TRUE))
  bootvar.y[i] <- var(sample(y, size = n, replace = TRUE))
}

# Plot the two bootstrap distributions of the variance
library(ggplot2)
bootvars <- data.frame(
  var   = c(bootvar.x, bootvar.y),
  group = rep(c("x", "y"), each = nboots)
)
ggplot(bootvars, aes(x = var, colour = group)) + geom_density()

# Compare the bootstrapped variance estimates
t.test(bootvar.x, bootvar.y)

Disclaimer: I've read a bit about bootstrapping. Please don't assume I actually know anything. This is just a suggestion for something to check out.
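
One caveat about the final t-test: its p-value depends on `nboots`, because adding bootstrap replicates shrinks the standard error of the comparison even though the data contain no more information. A commonly used alternative is a percentile bootstrap interval for the variance ratio. The following is only a sketch under that assumption, with made-up exponential data and hypothetical names `ratio` and `upper`.

```r
set.seed(3)
x <- rexp(200, rate = 2)  # hypothetical non-normal data, smaller spread
y <- rexp(200, rate = 1)  # hypothetical non-normal data, larger spread

nboots <- 5000
# Each replicate: resample both groups with replacement, record var(x) / var(y)
ratio <- replicate(nboots, {
  var(sample(x, replace = TRUE)) / var(sample(y, replace = TRUE))
})

# One-sided 95% percentile bound on the ratio: an upper bound below 1
# suggests that the variance of x is lower than the variance of y
upper <- quantile(ratio, probs = 0.95, names = FALSE)
print(upper)
```

The same resampling loop works with any spread statistic in place of `var`, which may be useful if the variance itself is not the quantity of interest.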


hurfdurf (United States) wrote, 9.8 years ago:

If this data is really non-normal, should you be using variance or standard deviation at all?

You might want to use more robust metrics like the median absolute deviation.
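
As a small illustration, base R provides `mad()` for this. The sample below is made up, with a single gross outlier added to show how differently the two measures react. By default `mad()` multiplies the raw median absolute deviation by 1.4826 so that it estimates the standard deviation for normal data; `constant = 1` gives the raw value.

```r
set.seed(4)
a <- c(rnorm(99), 50)        # hypothetical sample with one gross outlier

sd_a      <- sd(a)                 # strongly inflated by the outlier
mad_a     <- mad(a)                # scaled MAD, barely affected
mad_a_raw <- mad(a, constant = 1)  # raw median absolute deviation

print(c(sd = sd_a, mad = mad_a, mad_raw = mad_a_raw))
```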


Michael Dondrup (Bergen, Norway) wrote, 10.3 years ago:

Look into the F-test or Bartlett's test. As your data are non-normal, you need something more robust against deviations from normality; Levene's test, for example, is mentioned as an alternative.
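
For orientation, both tests are available in base R as `var.test()` and `bartlett.test()`. The sketch below uses made-up normal data, since both tests assume normality. `var.test()` accepts a one-sided alternative, while `bartlett.test()` is two-sided only but handles more than two groups.

```r
set.seed(5)
x <- rnorm(60, sd = 1)     # hypothetical group with smaller spread
y <- rnorm(60, sd = 1.5)   # hypothetical group with larger spread

# F-test, one-sided: the alternative is var(x) / var(y) < 1
f_res <- var.test(x, y, alternative = "less")

# Bartlett's test of equal variances across the groups in the list
b_res <- bartlett.test(list(x, y))

print(f_res$p.value)
print(b_res$p.value)
```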


Thanks. I forgot to say that I also looked at Bartlett's test, but discarded it because it is sensitive to departures from normality and my data are not normal. Thanks anyway.

Reply written 10.3 years ago by Giovanni M Dall'Olio

Then the Brown-Forsythe test, maybe? Look at the section "Comparison with Levene's test".

Reply written 10.3 years ago by Michael Dondrup (modified 10 months ago by RamRS)

Alastair Kerr (Manchester/UK/Cancer Biomarker Centre at CRUK-MI) wrote, 9.8 years ago:

Another alternative:

Transform the data by subtracting the mean (or median) from each data point and take the absolute values.

Now check the normality of each sample again and use a t-test or KS test as appropriate.
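
A minimal sketch of this recipe, assuming made-up skewed data and hypothetical names `d1` and `d2` for the transformed values. Using a one-sided alternative in the t-test addresses the directional question (is group 1 less spread out than group 2?); the KS test is shown in its default two-sided form.

```r
set.seed(6)
g1 <- rexp(80, rate = 2)   # hypothetical group with smaller spread
g2 <- rexp(80, rate = 1)   # hypothetical group with larger spread

# Absolute deviations from each group's own median
d1 <- abs(g1 - median(g1))
d2 <- abs(g2 - median(g2))

# One-sided Welch t-test: the alternative is mean(d1) < mean(d2)
tt <- t.test(d1, d2, alternative = "less")

# Two-sample Kolmogorov-Smirnov test on the same deviations (two-sided)
ks <- ks.test(d1, d2)

print(tt$p.value)
print(ks$p.value)
```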


Jarretinha (São Paulo, Brazil) wrote, 10.3 years ago:

You can first try a Friedman test for each factor (assuming they're independent) and, if there really is some difference, proceed with an adequate multiple hypothesis testing procedure, using the Bonferroni method, for example. This is not sequential hypothesis testing like we usually do with microarray data. You'll need to specify all concurrent hypotheses (variance =, <, >) and significance/power levels.
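
For the Bonferroni step, base R's `p.adjust()` can be applied to the p-values of the individual tests. The values in `p_raw` below are invented purely for illustration.

```r
# Hypothetical raw p-values, one per hypothesis tested
p_raw <- c(0.01, 0.04, 0.20)

# Bonferroni: multiply each p-value by the number of tests, capped at 1
p_bonf <- p.adjust(p_raw, method = "bonferroni")
print(p_bonf)
```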

I don't know much about your experimental/test design. You could provide additional details.
