6.4 years ago by
New Zealand
The paper Zev links to provides a very good intro to this field.
I thought I'd just add that the specific statistics you mention, Tajima's (pi) and Watterson's estimators of theta, form the basis of Tajima's D.
Briefly, the idea is that if a gene has been subject to directional selection (i.e. positive or negative selection), the variants that are present will be at low frequency, so nucleotide diversity (pi) will be low relative to Watterson's theta (which is based only on the number of segregating sites), and D will be negative. A positive value for D would suggest balancing selection (maintaining an excess of medium-frequency alleles). BUT Tajima's D is also affected by demography, since population expansion also leads to an excess of rare alleles.
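To make the comparison concrete, here is a minimal Python sketch of how the two estimators combine into D. It assumes an alignment of equal-length sequences with no gaps or missing data, and uses the standard constants from Tajima (1989). The function name `tajimas_d` and the toy alignment are just illustrative, not taken from any particular package.

```python
import math
from collections import Counter


def tajimas_d(seqs):
    """Return (pi, theta_w, D) for a list of aligned, equal-length sequences.

    pi and theta_w are per-locus values (not divided by sequence length).
    Assumes no gaps or missing data; every distinct character is an allele.
    """
    n = len(seqs)
    if n < 4:
        # The variance term is zero for n = 2 or 3, so D is undefined there.
        raise ValueError("need at least 4 sequences")
    if any(len(s) != len(seqs[0]) for s in seqs):
        raise ValueError("sequences must be aligned to the same length")

    n_pairs = n * (n - 1) / 2
    seg_sites = 0  # S, the number of segregating sites
    pi = 0.0       # mean number of pairwise differences
    for column in zip(*seqs):
        counts = Counter(column)
        if len(counts) > 1:
            seg_sites += 1
            identical_pairs = sum(c * (c - 1) / 2 for c in counts.values())
            pi += (n_pairs - identical_pairs) / n_pairs

    # Watterson's estimator uses only S and the sample size.
    a1 = sum(1 / i for i in range(1, n))
    theta_w = seg_sites / a1
    if seg_sites == 0:
        return pi, theta_w, float("nan")

    # Constants for the variance of (pi - theta_w), following Tajima (1989).
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n ** 2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)

    return pi, theta_w, (pi - theta_w) / math.sqrt(variance)


# Toy alignment where every variant is a singleton (all alleles rare),
# so pi falls below theta_w and D comes out negative.
rare = ["TAAA", "ATAA", "AATA", "AAAT"]
print(tajimas_d(rare))
```

With every variant at intermediate frequency instead (e.g. two copies of `TTTT` and two of `AAAA`), pi exceeds theta and D is positive. Note that the sign alone says nothing about whether selection or demography is responsible.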
As Zev's paper describes, there is a whole suite of measures that are more or less sensitive to different demographic and population genetic processes.
I'm not aware of a test that compares Pi_non-syn with Pi_syn, though some tests like McDonald-Kreitman include those values along with divergence stats.
written 6.4 years ago by David W