Question

Best Statistic To Compare Two Fst Distributions?

0

10.6 years ago

confusedious ▴ 470

I am having difficulty selecting a statistic to compare two distributions of Weir & Cockerham's Fst. I know that both Fst distributions are non-normal, and I cannot assume they are independent: one is a comparison between populations A and B, and the other is between populations A and C, so both involve data from population A.

I was wondering whether a Wilcoxon signed-rank test or Kolmogorov-Smirnov test would be appropriate for testing whether the two distributions significantly differ?

If not, could anyone suggest a better statistic? The samples are too large (millions of markers) for me to do this via bootstrapping.

Thank you for any help you may be able to offer.

statistics fst • 5.0k views

ADD COMMENT • link 10.6 years ago by confusedious ▴ 470

2

Most of the tests out there will reject the null (same distribution), in my experience. Can you tell us more about the biological question you are trying to get at? Maybe there is another way to think about it ... perhaps you don't need to test if the distributions are different, but test for a correlation?
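
As a rough illustration of the correlation idea, here is a minimal sketch of a Spearman rank correlation between per-SNP Fst values, which does not require normality. It assumes the two lists hold Fst for the same SNPs in the same order; the names `fst_ab` and `fst_ac` and the toy values are made up for the example, and the helper functions are hypothetical, not from any particular package.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the tie-averaged ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy per-SNP Fst values (invented), same SNP order in both lists.
fst_ab = [0.01, 0.05, 0.02, 0.30, 0.00, 0.12]
fst_ac = [0.02, 0.07, 0.02, 0.25, 0.01, 0.20]
print(f"Spearman rho = {spearman(fst_ab, fst_ac):.3f}")
```

With around a million SNPs, the size of rho is likely to be more informative than its p-value, and neighbouring SNPs in linkage disequilibrium are not independent observations.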

ADD REPLY • link 10.6 years ago by Zev.Kronenberg 12k

0

In brief, I have produced three pairwise Fst distributions for ~1 million SNPs between three human population samples. I would like to know whether the distributions are different from each other in a statistical sense, particularly between A vs. B and A vs. C. These two distributions look slightly different when eyeballing plots, but I would like to be able to quantify the difference. Would a test for correlation be more appropriate? Would there be one you would suggest?

ADD REPLY • link 10.6 years ago by confusedious ▴ 470

1

Honestly, if you have so many data points, it would be weird if you did not see a difference. The interesting question is, I believe: how big is the difference? It might be worth looking into Bayesian statistics for this question. An example of comparing two distributions can be found at [http://www.indiana.edu/~kruschke/BEST/]. Yes, this is computationally more demanding than a frequentist test, but it has much better power and the interpretation is more straightforward (in my opinion).
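
To make "how big is the difference?" concrete without going fully Bayesian, here is a minimal sketch of a few descriptive effect sizes. It assumes per-SNP Fst values for A vs. B and A vs. C are available in the same SNP order; the function names and toy numbers are invented for illustration. The paired summaries use the fact that each SNP appears in both comparisons, while the Kolmogorov-Smirnov D is reported only as a distance between the two empirical distributions, since the usual KS p-value assumes independent samples.

```python
import bisect
import statistics


def paired_summary(fst_ab, fst_ac):
    """Summarise per-SNP differences (A-vs-B minus A-vs-C)."""
    diffs = [b - c for b, c in zip(fst_ab, fst_ac)]
    return {
        "mean_diff": statistics.fmean(diffs),
        "median_diff": statistics.median(diffs),
        # Fraction of SNPs where Fst is strictly higher in A vs. B.
        "frac_ab_higher": sum(d > 0 for d in diffs) / len(diffs),
    }


def ks_distance(x, y):
    """Largest vertical gap between the two empirical CDFs (the KS D statistic)."""
    xs, ys = sorted(x), sorted(y)
    d = 0.0
    # The maximum gap is always attained at one of the observed values.
    for v in set(xs) | set(ys):
        fx = bisect.bisect_right(xs, v) / len(xs)
        fy = bisect.bisect_right(ys, v) / len(ys)
        d = max(d, abs(fx - fy))
    return d


# Toy per-SNP Fst values (invented), same SNP order in both lists.
fst_ab = [0.01, 0.05, 0.02, 0.30, 0.00, 0.12]
fst_ac = [0.02, 0.07, 0.02, 0.25, 0.01, 0.20]
print(paired_summary(fst_ab, fst_ac))
print(f"KS distance D = {ks_distance(fst_ab, fst_ac):.3f}")
```

D ranges from 0 (identical empirical distributions) to 1 (no overlap), so it can be compared across the three pairwise comparisons regardless of how many markers there are.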

ADD REPLY • link 10.6 years ago by David Westergaard ★ 1.5k