6 hours ago
18d10fee
Hi everyone,
I’m running McDonald–Kreitman (MK) tests across several thousand genes to estimate alpha (the proportion of adaptive substitutions).
After filtering out genes with zero values for Dn, Ds, Pn, or Ps, I still observe the following pattern:
- ~80% of genes are not significant (p > 0.05)
- Among the significant ones, ~60% have positive alpha and ~40% negative alpha
- Some alpha values are highly negative (e.g. –24)
- Alignments are codon-based and appear fine upon inspection
- Polymorphism frequency threshold = 0.1
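For reference, here is a minimal sketch of the per-gene calculation, assuming the standard estimator alpha = 1 - (Ds × Pn) / (Dn × Ps) and a two-sided Fisher's exact test on the 2×2 MK table. The function name `mk_test` and the example counts are hypothetical, chosen only to show how a value like -24 can arise from small Dn combined with an excess of nonsynonymous polymorphism.

```python
from math import comb

def mk_test(dn, ds, pn, ps):
    """Return (alpha, two-sided Fisher exact p-value) for one gene.

    Assumes all four counts are non-zero (as after the filtering step).
    """
    # alpha = 1 - (Ds * Pn) / (Dn * Ps)
    alpha = 1.0 - (ds * pn) / (dn * ps)

    # 2x2 table: rows = divergence / polymorphism, columns = nonsyn / syn
    row1 = dn + ds          # total fixed differences
    col1 = dn + pn          # total nonsynonymous changes
    n = dn + ds + pn + ps   # grand total

    def prob(a):
        # Hypergeometric probability of `a` nonsynonymous fixed differences
        return comb(col1, a) * comb(n - col1, row1 - a) / comb(n, row1)

    p_obs = prob(dn)
    lo = max(0, row1 - (n - col1))
    hi = min(row1, col1)
    # Two-sided p-value: sum over all tables no more likely than the observed one
    p = sum(prob(a) for a in range(lo, hi + 1) if prob(a) <= p_obs * (1 + 1e-7))
    return alpha, min(p, 1.0)

# Hypothetical counts, not real data: Dn=2, Ds=20, Pn=25, Ps=10
alpha, p = mk_test(2, 20, 25, 10)
print(alpha, p)  # alpha is -24.0
```

Because alpha is a ratio with Dn × Ps in the denominator, it has no lower bound, so a gene with few nonsynonymous substitutions and many nonsynonymous polymorphisms can give a very large negative value.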
I expected a stronger signal of positive selection overall (especially in sex-biased genes), but instead I see a predominance of non-significant and negative results.
My questions:
- Is this distribution of alpha (many non-significant, some strongly negative) normal for large-scale MK datasets?
- Could alignment quality or population grouping errors produce such negative alpha values?
- Are there known biases (e.g., low polymorphism, slightly deleterious variants, demography) that could explain this pattern?
Any insights or experiences with large MK test datasets or codon-based alignments would be greatly appreciated.
Thanks!