I'd like to understand how best to estimate ancestry (ethnicity) proportions from a single-sample VCF. That is, take a VCF file and estimate that the person it came from is, say, 80% European and 20% East Asian. Ideally this would resolve at least the 5 super-populations of the 1000 Genomes Project (AFR, AMR, EAS, EUR, SAS), and even better the 26 sub-populations.
I've read about approaches using BEAGLE and other tools that work well when analyzing a whole set of VCFs together. I'm not sure those help here, because I want something that can analyze a new sample on its own, without rerunning the analysis on the entire set.
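To make the goal concrete, here is a minimal sketch of the kind of per-sample calculation I have in mind. I don't know that this is the standard method. It assumes alt-allele frequencies for each reference population have already been computed once (for example from the 1000 Genomes panel) at a fixed set of biallelic SNPs, and it estimates mixing proportions for one new sample with a simple EM update. The names `gt_to_alt_count` and `estimate_admixture` and the toy numbers are made up for illustration.

```python
def gt_to_alt_count(gt):
    """Convert a biallelic VCF GT string ('0/1', '1|1', ...) to an alt-allele count.

    Returns None for missing, multi-allelic, or non-diploid calls.
    """
    alleles = gt.replace("|", "/").split("/")
    if len(alleles) != 2 or any(a not in ("0", "1") for a in alleles):
        return None
    return sum(int(a) for a in alleles)


def estimate_admixture(genotypes, freqs, n_iter=200, eps=1e-6):
    """Estimate ancestry proportions for one sample with fixed reference frequencies.

    genotypes: alt-allele counts (0, 1, 2, or None if missing), one per SNP.
    freqs:     K lists of alt-allele frequencies, one list per reference
               population, in the same SNP order as `genotypes`.
    Returns a list of K proportions that sum to 1.
    """
    k = len(freqs)
    sites = [j for j, g in enumerate(genotypes) if g is not None]
    if not sites:
        raise ValueError("no usable genotypes")

    # Clamp frequencies away from 0 and 1 so the divisions below are safe.
    f = [[min(max(x, eps), 1.0 - eps) for x in row] for row in freqs]

    q = [1.0 / k] * k  # start from equal proportions
    for _ in range(n_iter):
        new_q = [0.0] * k
        for j in sites:
            g = genotypes[j]
            p_alt = sum(q[i] * f[i][j] for i in range(k))
            p_ref = 1.0 - p_alt
            for i in range(k):
                # Expected number of this sample's allele copies at SNP j
                # that came from population i, given the current q.
                new_q[i] += g * q[i] * f[i][j] / p_alt
                new_q[i] += (2 - g) * q[i] * (1.0 - f[i][j]) / p_ref
        q = [x / (2 * len(sites)) for x in new_q]
    return q


# Toy example with invented frequencies for two hypothetical populations.
toy_gts = [gt_to_alt_count(s) for s in ["1/1", "0|0", "1/1", "0/0", "./."]]
toy_freqs = [
    [0.9, 0.1, 0.9, 0.1, 0.5],  # population A
    [0.1, 0.9, 0.1, 0.9, 0.5],  # population B
]
print(estimate_admixture(toy_gts, toy_freqs))
```

The appeal of something like this is that the reference frequencies are computed once and a new sample only needs its own genotypes at those sites. A real analysis would need many thousands of informative SNPs.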
Does anyone have any pointers?