I have imputed a large dataset with MaCH / minimac, which resulted in 9 TB of data. The next step of my analysis is to compute the r-squared-hat metric in order to assess the quality of the imputation. (I know that minimac reports its own quality metric, but I am particularly interested in r-squared-hat.)
Compute r-squared-hat for each SNP, which is the (estimated) fraction of variance in unobserved 0/1/2 genotype explained by the individual mean genotypes. (We assume this is the same definition used by Abecasis et al.)
The problem is that in order to run QuickTest I would have to convert 9 TB of data to the QuickTest format. To save this hassle I would prefer to write a script that calculates the metric directly. So, does anyone know how to compute r-squared-hat from dosage data (or from the imputation posterior probabilities)? All I am asking for is a formula that takes the imputation dosage data for a single SNP as input and estimates the r-squared-hat metric.
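To make the request concrete, here is a minimal Python sketch of two candidate formulas. Neither is confirmed to match what QuickTest or MaCH actually computes; both are assumptions. The first is the ratio of the observed dosage variance to the variance expected under Hardy-Weinberg equilibrium, 2p(1-p), which is often described as the MaCH-style estimate. The second is a literal reading of the definition quoted above, using the law of total variance on the posterior probabilities. The function names and the input layout (one dosage, or one (P(0), P(1), P(2)) triple, per individual) are hypothetical.

```python
import math


def rsq_hat_from_dosages(dosages):
    """Observed dosage variance divided by the HWE-expected variance 2p(1-p).

    Assumption: this is the commonly cited MaCH-style ratio, not a verified
    reimplementation of any tool.
    """
    n = len(dosages)
    if n == 0:
        raise ValueError("need at least one dosage")
    mean = sum(dosages) / n
    p = mean / 2.0                      # estimated allele frequency
    expected_var = 2.0 * p * (1.0 - p)  # genotype variance under HWE
    if expected_var <= 0.0:
        return math.nan                 # monomorphic SNP: ratio undefined
    observed_var = sum((d - mean) ** 2 for d in dosages) / n
    return observed_var / expected_var


def rsq_hat_from_probs(probs):
    """Var(mean genotype) / [Var(mean genotype) + mean posterior variance].

    Assumption: a literal reading of "fraction of variance in the unobserved
    genotype explained by the individual mean genotypes".
    probs: list of (P(g=0), P(g=1), P(g=2)) per individual.
    """
    n = len(probs)
    if n == 0:
        raise ValueError("need at least one individual")
    dosages = [p1 + 2.0 * p2 for _, p1, p2 in probs]
    mean = sum(dosages) / n
    explained = sum((d - mean) ** 2 for d in dosages) / n
    # Per-individual posterior variance: E[g^2] - E[g]^2
    residual = sum((p1 + 4.0 * p2) - d * d
                   for (_, p1, p2), d in zip(probs, dosages)) / n
    total = explained + residual
    return explained / total if total > 0.0 else math.nan


if __name__ == "__main__":
    print(rsq_hat_from_dosages([0.0, 1.0, 2.0, 1.0]))
    print(rsq_hat_from_probs([(0.25, 0.5, 0.25)] * 4))
```

Both functions need only a single pass over one SNP's values, so they could be applied while streaming the dosage files rather than converting them. A correction for which of these (if either) matches the r-squared-hat definition would be very welcome.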
Thanks a lot!