Question: How to identify the ancestral population to which an individual belongs knowing its allele frequencies?
cyril-cros (France) wrote, 2.8 years ago:

Hi, as part of a homework assignment I have been given a set of genotyped human SNPs for many individuals, and I am supposed to identify their ancestral populations. (Disclaimer: this is a bonus question that I won't submit, and it is likely an example from the readings I have to do, so I am not asking for a turnkey solution.) Using STRUCTURE, I can estimate the number of ancestral populations in my sample and their allele frequencies.

My SNPs are coded as 0 or 1 (ancestral or derived state), but I know their labels (rsXXXXX). Some SNPs are found in one state at a much higher frequency in some populations, which makes them informative: I can manually cross-check my frequency estimates against the population frequencies found in HapMap.

For example, for SNP rs924201, populations A and C carry the derived state in around 50% of cases, while B carries the ancestral state in more than 80%. HapMap tells me that ~80% of Africans share the same variant of this SNP, but only roughly 50% of Europeans and Asians do. I can guess that B is likely from Africa, given that my sample is rather large.

Is there a way to fit my estimated allele frequencies to real-world frequencies? For instance, by finding all markers whose frequencies differ noticeably between my predicted groups and matching them to real-world populations. I would need a way to retrieve the frequencies of the most common variants for each of the HapMap populations. I could then maybe try some kind of lasso on the frequencies of the most common allele, to find out which real-world population is closest to each of my predicted ancestral populations.
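To make the matching idea concrete, here is a minimal sketch of what I have in mind. It is a plain nearest-neighbour comparison (mean squared difference of allele frequencies), not a lasso, and every number, SNP name and population label in it is made up for illustration rather than taken from HapMap. It also assumes that both tables give the frequency of the same (derived) allele for each SNP.

```python
# Toy illustration: all frequencies, SNP names and labels are hypothetical.

def nearest_reference(cluster_freqs, reference_freqs):
    """For each inferred cluster, return the reference population whose
    allele frequencies have the smallest mean squared difference,
    computed over the SNPs present in both tables."""
    matches = {}
    for cluster, freqs in cluster_freqs.items():
        best_pop, best_dist = None, float("inf")
        for pop, ref in reference_freqs.items():
            shared = [snp for snp in freqs if snp in ref]
            if not shared:
                continue  # no SNP in common, cannot compare
            dist = sum((freqs[s] - ref[s]) ** 2 for s in shared) / len(shared)
            if dist < best_dist:
                best_pop, best_dist = pop, dist
        matches[cluster] = (best_pop, best_dist)
    return matches

# Derived-allele frequencies per inferred cluster (made-up values).
cluster_freqs = {
    "A": {"snp_1": 0.50, "snp_2": 0.10, "snp_3": 0.70},
    "B": {"snp_1": 0.15, "snp_2": 0.60, "snp_3": 0.20},
    "C": {"snp_1": 0.55, "snp_2": 0.30, "snp_3": 0.90},
}

# Derived-allele frequencies per reference population (made-up values).
reference_freqs = {
    "AFR": {"snp_1": 0.20, "snp_2": 0.65, "snp_3": 0.25},
    "EUR": {"snp_1": 0.50, "snp_2": 0.12, "snp_3": 0.65},
    "ASN": {"snp_1": 0.52, "snp_2": 0.35, "snp_3": 0.88},
}

matches = nearest_reference(cluster_freqs, reference_freqs)
for cluster, (pop, dist) in matches.items():
    print(f"cluster {cluster} -> {pop} (mean squared difference {dist:.4f})")
```

With real data, the allele that counts as "1" would have to be aligned between my coding and the reference before the frequencies can be compared at all.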

Thanks for your advice!

snp structure infer ancestry • 1.4k views
stolarek.ir (Poland) wrote, 2.8 years ago:

http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3665925/

Read on. STRUCTURE is all you need for the homework. You could also take a somewhat more visual approach with PCA (EIGENSOFT).
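To give a rough idea of what the PCA route does, here is a small pure-Python sketch that projects individuals onto the first principal component of a 0/1 genotype matrix using power iteration. It is only an illustration of the idea on made-up genotypes; it is not how EIGENSOFT is implemented, and EIGENSOFT applies its own normalisation, which is left out here. The function name `first_pc_scores` is made up for this example.

```python
# Toy illustration: genotypes are made up; rows are individuals, columns are SNPs.

def first_pc_scores(genotypes, iterations=200):
    """Return each individual's coordinate on the first principal component."""
    n, m = len(genotypes), len(genotypes[0])
    # Centre each SNP column on its mean.
    means = [sum(row[j] for row in genotypes) / n for j in range(m)]
    centred = [[row[j] - means[j] for j in range(m)] for row in genotypes]
    # Power iteration on X^T X to find the leading SNP-loading vector.
    v = [1.0 / (j + 1) for j in range(m)]
    for _ in range(iterations):
        scores = [sum(x * w for x, w in zip(row, v)) for row in centred]
        v = [sum(centred[i][j] * scores[i] for i in range(n)) for j in range(m)]
        norm = sum(w * w for w in v) ** 0.5
        if norm == 0:
            return [0.0] * n  # no variation left to explain
        v = [w / norm for w in v]
    # Project the centred genotypes onto the loading vector.
    return [sum(x * w for x, w in zip(row, v)) for row in centred]

genotypes = [
    [1, 1, 0, 0],  # individuals 1-3: one hypothetical group
    [1, 1, 0, 1],
    [1, 0, 0, 0],
    [0, 0, 1, 1],  # individuals 4-6: another hypothetical group
    [0, 0, 1, 0],
    [0, 1, 1, 1],
]

scores = first_pc_scores(genotypes)
for i, s in enumerate(scores, start=1):
    print(f"individual {i}: PC1 = {s:+.3f}")
```

In this toy example the two groups end up on opposite sides of zero on PC1, which is the kind of separation you would then look for in a scatter plot of the first two components.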
