I have a collection of SNP frequencies from several years ago to present, and I'm trying to look at what's drifting and what's actively being selected on. To do this I'm trying to get the P value of just how much the SNP changes over the course of the few years I'm looking at. I'm using
to get the variance of genetic drift over t generations. Using this as variance and the starting frequency of t0 as mu, I'm getting the alpha and beta values for a beta distribution through moment matching with these values and then applying that and tn to pbeta() to get the significance of the gene's change. I figure that most of the p values should be around .5 since there shouldn't be too much change (my t/2Ne is very small), but I'm getting about 50% of my p values > .95 and < .05.
Here's what I'm doing in R
#moment matching function
estBetaParams <- function(mu, var) {
alpha <- ((1 - mu) / var - 1 / mu) * mu ^ 2
beta <- alpha * (1 / mu - 1)
return(params = list(alpha = alpha, beta = beta))
}
#get the variance of expected drift
snp$Drift.variance <- snp$yr.0*(1-snp$yr.0)*(1-exp(-(10/(2*500))))
#get the p values of how much the SNP has changed
snp$Drift.p <- {
b <- estBetaParams(mu = snp$yr.0, var = snp$Drift.variance) #yr.10 value should be beta distributed around yr.0
pbeta(snp$yr.10, shape1 = b[1], shape2 = b[2])
}