Greetings everyone,
I am tearing out my hair trying to incorporate genotype likelihoods into several population genetics measures. I asked a question on the GATK forums, but didn't get a useful response (in terms I could understand). To be honest I was put off by the help I received.
http://gatkforums.broadinstitute.org/discussion/2923/re-scaling-genotype-likelihoods
I thought I would try my question here.
The genotype likelihoods (VCF field: PL) are rescaled so that the most likely genotype has a Phred score of zero, i.e. a probability of one. This makes it hard for me to integrate over the other possible genotypes. Alternatively, I thought I could use the genotype quality (GQ) in the calculations, but that throws everything off.
How are people dealing with this? Why does Unified Genotyper set the most likely genotype to a Phred score of zero? How can I account for the uncertainty in the best genotype if it always has a probability of 1?
Any advice would be great! Thanks everyone.
example:
GT:AD:DP:GQ:PL
1/1:0,3:3:6:57,6,0
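To make the example concrete, here is a minimal sketch of how I understand the PL values to map back to relative likelihoods (10^(-PL/10)) and how they could be renormalized so they sum to one. The function names are my own, and the flat prior over the three genotypes is an assumption for illustration only, not something taken from GATK.

```python
def pl_to_relative_likelihoods(pl):
    """Convert Phred-scaled PL values to likelihoods relative to the best genotype."""
    # PL = -10 * log10(L / L_best), so the best genotype (PL = 0) maps to 1.0
    return [10 ** (-value / 10.0) for value in pl]


def normalize(likelihoods, priors=None):
    """Weight likelihoods by priors and rescale so they sum to one.

    With priors=None a flat prior is assumed (hypothetical choice for this sketch).
    """
    if priors is None:
        priors = [1.0] * len(likelihoods)
    weighted = [l * p for l, p in zip(likelihoods, priors)]
    total = sum(weighted)
    return [w / total for w in weighted]


# PL field from the example record, ordered 0/0, 0/1, 1/1
pl = [57, 6, 0]
relative = pl_to_relative_likelihoods(pl)
posteriors = normalize(relative)

for genotype, prob in zip(["0/0", "0/1", "1/1"], posteriors):
    print(genotype, round(prob, 4))
```

Under these assumptions the 1/1 genotype comes out at roughly 0.80 and 0/1 at roughly 0.20, so the best genotype only has a value of one before renormalizing.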
So this is the equation I am playing with. I tried using GQ in the bottom derivation instead of summing across the other genotypes, but it didn't work:
http://genome.sph.umich.edu/wiki/Genotype_Likelihood_based_Inbreeding_Coefficient