Question: Test Whether The Variance In A Group Is Lower Than In Another
4
9.1 years ago by
London, UK
Giovanni M Dall'Olio26k wrote:

I have two groups of data (not normally distributed), and I would like to test the hypothesis that the first group has a lower (or narrower) standard deviation than the other.

Put another way, I would like to tell whether the first group is less 'variable' or 'heterogeneous' than the second.

A Kruskal-Wallis test won't do, because it compares the medians of two or more groups, and I am not interested in that.

A Levene or Brown-Forsythe test compares the variances of the groups and tells me whether they are equal. This is better, but I would also like to tell whether the variance in the first group is lower than in the other group(s).

A simple Chi-Square test would tell me whether the standard deviation of a group is equal to a certain value, and the one-tailed version can tell me whether it is higher/lower.
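
For reference, a minimal sketch of the one-tailed chi-square test for a single variance. The function name `chisq_var_test` and the simulated data are made up for illustration, and the test assumes normally distributed data, so it would not suit the non-normal data described here without further checks.

```r
# One-sided chi-square test of H0: sigma^2 >= sigma0^2 vs H1: sigma^2 < sigma0^2.
# Assumes x is a sample from a normal distribution (hypothetical helper).
chisq_var_test <- function(x, sigma0) {
  n <- length(x)
  stat <- (n - 1) * var(x) / sigma0^2    # chi-square with n - 1 df under H0
  p_lower <- pchisq(stat, df = n - 1)    # small when the sample variance is low
  list(statistic = stat, df = n - 1, p.value = p_lower)
}

set.seed(1)
x <- rnorm(50, mean = 0, sd = 0.5)       # simulated data, true sd = 0.5
res <- chisq_var_test(x, sigma0 = 1)     # is the sd lower than 1?
print(res)
```

For the upper tail (variance higher than the reference value), the p-value would be `pchisq(stat, df = n - 1, lower.tail = FALSE)` instead.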

An additional difficulty is that I would have to do this as a two-way test, because I have two grouping variables. Could you point me in the right direction or give me some hints? I don't have many ideas on where to search :-)

R statistics • 9.0k views
modified 9.1 years ago by Alastair Kerr5.2k • written 9.1 years ago by Giovanni M Dall'Olio26k

What is your non-normality assumption based on? Have you thought about transforming the data (with log transformation, for example) to be more normal?

written 9.1 years ago by Yuri1.5k

You might also want to ask that question on stats.stackexchange.com. It's populated by lots of true-blooded statisticians who eat this stuff for breakfast.

written 8.6 years ago by David Quigley11k

Hi Giovanni, how did you end up solving this? I ran into a very similar problem.

written 8 weeks ago by A. Domingues2.0k
8
9.1 years ago by
Matt Parker80
Denver, CO
Matt Parker80 wrote:

Is bootstrapping a possibility? Resample from your data with replacement, calculate the variance, and repeat. This leaves you with a vector of bootstrapped variance estimates for each of your groups. Perform the appropriate test on those estimates (e.g., a t-test if you're comparing two groups and the estimates turn out to be normally distributed).

I think the boot package is the norm for resampling in R, but here's some untested code to clarify the idea:

set.seed(1)  # make the simulated data and resamples reproducible
n <- 1000
x <- rnorm(n, mean = 0, sd = 1)
y <- rnorm(n, mean = 0, sd = 1.1)

# Bootstrap the variance of each group
nboots <- 10000
bootvar.x <- numeric(nboots)
bootvar.y <- numeric(nboots)

for (i in seq_len(nboots)) {
  bootvar.x[i] <- var(sample(x, size = n, replace = TRUE))
  bootvar.y[i] <- var(sample(y, size = n, replace = TRUE))
}

library(ggplot2)
# Stack the estimates into one long data frame for plotting
bootvars <- data.frame(
  var = c(bootvar.x, bootvar.y),
  group = rep(c("x", "y"), each = nboots)
)

ggplot(bootvars, aes(x = var, colour = group)) + geom_density()

t.test(bootvar.x, bootvar.y)

Disclaimer: I've read a bit about bootstrapping. Please don't assume I actually know anything. This is just a suggestion for something to check out.
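
One caveat worth flagging: a t-test on bootstrap replicates treats them as independent observations, so its p-value shrinks as the number of replicates grows. A possible alternative, sketched here under assumptions and not the method proposed above, is a one-sided percentile interval for the ratio of variances. The simulated data use sd = 1.5 for the second group (an arbitrary choice for illustration) and only base R.

```r
set.seed(42)
n <- 1000
x <- rnorm(n, mean = 0, sd = 1)     # simulated group with the smaller spread
y <- rnorm(n, mean = 0, sd = 1.5)   # simulated group with the larger spread

nboots <- 2000
# Each replicate resamples both groups and records var(x*) / var(y*)
ratio <- replicate(nboots, {
  var(sample(x, replace = TRUE)) / var(sample(y, replace = TRUE))
})

# One-sided 95% percentile bound: if the upper bound is below 1,
# the data suggest var(x) < var(y)
upper <- quantile(ratio, 0.95)
print(upper)
```

The same loop works with any other spread statistic in place of `var()`, which may matter for non-normal data.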

modified 7 months ago by RamRS21k • written 9.1 years ago by Matt Parker80
3
8.6 years ago by
hurfdurf460
United States
hurfdurf460 wrote:

If this data is really non-normal, should you be using variance or standard deviation at all?

You might want to use a more robust metric such as the median absolute deviation.
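
A small illustration of why the median absolute deviation is more robust, using base R's `mad()` on simulated data (the data and the single outlier are made up for the example):

```r
set.seed(7)
x <- c(rnorm(99), 50)   # 99 standard-normal values plus one extreme outlier

sd_x <- sd(x)           # strongly inflated by the single outlier
mad_x <- mad(x)         # median absolute deviation, barely affected

# By default mad() multiplies by 1.4826 so that it estimates the
# standard deviation when the data are normal
print(c(sd = sd_x, mad = mad_x))
```

To compare the MAD between two groups, one option would be to bootstrap the difference or ratio of the two MADs.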

written 8.6 years ago by hurfdurf460
2
9.1 years ago by
Bergen, Norway
Michael Dondrup46k wrote:

Look for the F-test or Bartlett's test. As your data is non-normal, you need something more robust against deviations from normality. Levene's test, for example, is mentioned as an alternative.
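
For orientation, a short sketch of how the two classical tests are called in base R, on simulated normal data. Both assume normality, so this only shows the mechanics; note that `var.test()` accepts a one-sided alternative, while `bartlett.test()` only tests equality.

```r
set.seed(3)
x <- rnorm(100, sd = 1)   # simulated data, smaller spread
y <- rnorm(100, sd = 2)   # simulated data, larger spread

# F-test, one-sided: the alternative is var(x) / var(y) < 1
f_res <- var.test(x, y, alternative = "less")

# Bartlett's test: equality of variances across groups (two-sided only),
# called with the pooled values and a grouping factor
values <- c(x, y)
group <- factor(rep(c("x", "y"), each = 100))
b_res <- bartlett.test(values, group)

print(f_res$p.value)
print(b_res$p.value)
```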

written 9.1 years ago by Michael Dondrup46k

Thanks. I forgot to say that I also looked at Bartlett's test, but discarded it because it is sensitive to departures from normality and my data is not normal. Thanks anyway.

written 9.1 years ago by Giovanni M Dall'Olio26k

Then the Brown-Forsythe test, maybe? http://en.wikipedia.org/wiki/Brown%E2%80%93Forsythe_test Look at the section "Comparison with Levene's test".

written 9.1 years ago by Michael Dondrup46k
2
8.6 years ago by
Alastair Kerr5.2k
The University of Edinburgh, UK
Alastair Kerr5.2k wrote:

Another alternative:

Transform the data by subtracting the mean (or median) from each data point and taking the absolute values.

Now check the normality of each sample again and use a t-test or KS test as appropriate.
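
The transformation above can be sketched as follows, on simulated skewed data. The one-sided alternative addresses the "lower than" part of the question. The Wilcoxon rank-sum test is swapped in here as a nonparametric option in place of the KS test; that substitution is an assumption of this sketch. Centering on the mean is the idea behind Levene's test, and centering on the median is the Brown-Forsythe variant.

```r
set.seed(11)
x <- rexp(200, rate = 2)     # simulated skewed data, smaller spread
y <- rexp(200, rate = 0.5)   # simulated skewed data, larger spread

# Absolute deviations from each group's own median
dx <- abs(x - median(x))
dy <- abs(y - median(y))

# One-sided tests: are the deviations in x smaller than those in y?
t_res <- t.test(dx, dy, alternative = "less")        # Welch t-test
w_res <- wilcox.test(dx, dy, alternative = "less")   # rank-based alternative

print(t_res$p.value)
print(w_res$p.value)
```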

written 8.6 years ago by Alastair Kerr5.2k
1
9.1 years ago by
Jarretinha3.3k
São Paulo, Brazil
Jarretinha3.3k wrote:

You can try a Friedman test first for each factor (assuming they're independent) and, if there really is some difference, proceed with an appropriate multiple hypothesis testing procedure, using the Bonferroni method, for example. This is not the sequential hypothesis testing we usually do with microarray data. You'll need to specify all concurrent hypotheses (variance =, <, >) and significance/power levels.

I don't know much about your experimental/test design. Could you provide additional details?

modified 9.1 years ago • written 9.1 years ago by Jarretinha3.3k