What tool should I use to identify the ancestry (e.g. EUR, TSI or CEU) of a single human using a genotyped VCF file?
5.4 years ago
I have genotyped a single person from Illumina's Platinum Genomes dataset. I want to identify the ancestry of this person (NAXXXXXX) using his/her genotyped VCF file and population panel data (e.g. EUR, TSI or CEU) from HapMap. What tool should I use to do that? Any literature explaining the method would also be very helpful. Please suggest...

5.4 years ago

I'd download the HapMap data from the PLINK website, merge it with your VCF data using PLINK2 --merge, then run any population stratification program on the merged set and see where your single data point clusters.

I can recommend fastSTRUCTURE, since it takes PLINK's bed format as input, or EIGENSTRAT, which gives you a nice PCA plot but is a bit harder to run.
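
If you go the PCA route, "see where your data point clusters" can be made a bit less eyeball-driven. Here is a minimal sketch, assuming you have already exported the top principal components for the reference samples and for your own sample into plain Python lists. The function name, the toy coordinates and the YRI label are made up for illustration, and assigning to the nearest centroid is my simplification, not something EIGENSTRAT does for you:

```python
from collections import defaultdict
from math import dist  # Euclidean distance, Python 3.8+


def nearest_population(ref_pcs, ref_labels, query_pc):
    """Return (label, distance) for the population centroid closest to query_pc.

    ref_pcs    -- list of PC coordinate tuples, one per reference sample
    ref_labels -- population label for each reference sample (same order)
    query_pc   -- PC coordinates of the sample to classify
    """
    sums = {}
    counts = defaultdict(int)
    for pc, label in zip(ref_pcs, ref_labels):
        if label not in sums:
            sums[label] = [0.0] * len(pc)
        for i, value in enumerate(pc):
            sums[label][i] += value
        counts[label] += 1

    # Mean position of each population in PC space.
    centroids = {
        label: [s / counts[label] for s in vec] for label, vec in sums.items()
    }
    best = min(centroids, key=lambda label: dist(centroids[label], query_pc))
    return best, dist(centroids[best], query_pc)


# Toy values only, NOT real HapMap coordinates.
ref_pcs = [(0.10, 0.02), (0.12, 0.01), (-0.20, 0.05), (-0.22, 0.07)]
ref_labels = ["CEU", "CEU", "YRI", "YRI"]
label, distance = nearest_population(ref_pcs, ref_labels, (0.11, 0.03))
print(label, distance)
```

Still look at the plot: a sample that sits far from every centroid (admixed, or from a population missing in the panel) will get a label anyway, so the returned distance is worth checking.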

Alternatively, you could just get a list of ancestry-related SNPs ("Ancestry Informative Markers", like here: http://www.nature.com/ejhg/journal/v22/n10/full/ejhg20141a.html ) and manually compare your SNPs against it. That's easier but a bit messier.
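
For the marker comparison, one way to make it less messy is to score the genotypes against per-population allele frequencies instead of comparing by hand. This is only a rough sketch: it assumes Hardy-Weinberg proportions and independent markers, and the SNP IDs, population names and frequencies below are invented placeholders, not values from the linked paper:

```python
from math import comb, log


def ancestry_log_likelihoods(genotypes, pop_freqs, eps=1e-3):
    """Log-likelihood of the genotypes under each population's allele frequencies.

    genotypes -- dict: SNP ID -> alternate allele count (0, 1 or 2)
    pop_freqs -- dict: population -> dict of SNP ID -> alternate allele frequency
    eps       -- frequencies are clamped to [eps, 1 - eps] to avoid log(0)
    """
    # Use only markers present in the sample AND in every population,
    # so that the scores are comparable between populations.
    shared = set(genotypes)
    for freqs in pop_freqs.values():
        shared &= set(freqs)

    scores = {}
    for pop, freqs in pop_freqs.items():
        total = 0.0
        for snp in shared:
            g = genotypes[snp]
            p = min(max(freqs[snp], eps), 1.0 - eps)
            # Binomial genotype probability under Hardy-Weinberg (assumption).
            total += log(comb(2, g) * p**g * (1.0 - p) ** (2 - g))
        scores[pop] = total
    return scores


# Hypothetical markers and frequencies, for illustration only.
genotypes = {"snp1": 2, "snp2": 0, "snp3": 1}
pop_freqs = {
    "POP_A": {"snp1": 0.9, "snp2": 0.1, "snp3": 0.5},
    "POP_B": {"snp1": 0.2, "snp2": 0.8, "snp3": 0.5},
}
scores = ancestry_log_likelihoods(genotypes, pop_freqs)
print(max(scores, key=scores.get), scores)
```

The population with the highest score is the best fit among those you supplied. With only a handful of markers the differences can be small, so treat the result as a hint rather than a call.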


