Question: Ancestry Informative Markers
13 months ago, Inquisitive8995 (40) wrote:

Hi, how do I find Ancestry Informative Markers (AIMs) in the 1000 Genomes Project data? Is there a tool to identify the AIMs in all the given populations? Any help would be appreciated. Thanks in advance.

Modified 5 months ago by Shicheng Guo (7.3k) • written 13 months ago by Inquisitive8995 (40)
5 months ago, Shicheng Guo (7.3k) wrote:

Ancestry Informative Markers

African American vs European ancestry: We selected a grid of 3,388 markers (distributed approximately one per megabase across the autosomes and the X chromosome) that showed strong differentiation between African- and European-ancestry samples sequenced by the 1000 Genomes Project. Markers previously genotyped on the Illumina Omni 2.5M array were favored, and markers with A/T or G/C alleles (which are strand-ambiguous) were avoided.

Native American vs European ancestry: A grid of 1,000 markers was selected to be informative for Native American vs. European ancestry. These AIMs were selected to be in low linkage disequilibrium with one another (defined as R2 <= 0.1 in Native American populations, to be conservative) and widely separated (at least 250 kb from any other European vs Native American ancestry AIM). SNPs with significant within-continent heterogeneity were excluded.

These markers were previously genotyped in three samples of European ancestry (CEU and TSI samples and a population of Spaniards) and six samples of Native Americans (Mayan, Nahuan, Zapoteca, Tepehuano, Quechuan and Aymaran).
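To make the selection criteria above concrete, here is a minimal Python sketch, not the procedure actually used to build those panels. It assumes you already have per-population ALT allele frequencies for each SNP (for example, computed from the 1000 Genomes VCFs). The `Marker` class, the `select_aims` function, the `min_delta` threshold of 0.5 and the toy data are all hypothetical. Only the strand-ambiguity filter and the 250 kb spacing come from the description above, and the linkage disequilibrium filter is not implemented here.

```python
from dataclasses import dataclass

# A/T and C/G SNPs read the same on both strands, so they are skipped.
AMBIGUOUS = {frozenset("AT"), frozenset("CG")}


@dataclass
class Marker:
    chrom: str
    pos: int        # base-pair position
    ref: str
    alt: str
    freq_a: float   # ALT allele frequency in population A (assumed precomputed)
    freq_b: float   # ALT allele frequency in population B (assumed precomputed)

    @property
    def delta(self) -> float:
        """Absolute allele frequency difference between the two populations."""
        return abs(self.freq_a - self.freq_b)


def is_strand_ambiguous(m: Marker) -> bool:
    return frozenset((m.ref, m.alt)) in AMBIGUOUS


def select_aims(markers, min_delta=0.5, min_spacing=250_000):
    """Greedily keep the most differentiated markers that are well separated.

    min_delta is an illustrative cut-off, not a published value.
    """
    candidates = [
        m for m in markers
        if m.delta >= min_delta and not is_strand_ambiguous(m)
    ]
    # Visit the most differentiated markers first.
    candidates.sort(key=lambda m: m.delta, reverse=True)
    chosen = []
    for m in candidates:
        far_enough = all(
            c.chrom != m.chrom or abs(c.pos - m.pos) >= min_spacing
            for c in chosen
        )
        if far_enough:
            chosen.append(m)
    return sorted(chosen, key=lambda m: (m.chrom, m.pos))


# Toy input (made-up positions and frequencies, for illustration only).
toy_markers = [
    Marker("1", 100_000, "A", "G", 0.90, 0.10),    # strong, but too close to the next one
    Marker("1", 200_000, "C", "T", 0.95, 0.05),    # strongest on chromosome 1
    Marker("1", 900_000, "A", "T", 0.99, 0.01),    # strand-ambiguous, dropped
    Marker("1", 1_500_000, "G", "A", 0.55, 0.45),  # weak differentiation, dropped
    Marker("2", 150_000, "T", "C", 0.10, 0.80),    # kept
]
toy_aims = select_aims(toy_markers)
```

On real data you would probably swap the simple frequency difference for Fst or a similar statistic, and add LD pruning before the spacing step.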

13 months ago, Kevin Blighe (33k, Republic of Ireland) wrote:


Yes, I have built my own predictive model based on the 1000 Genomes data; it has 99% sensitivity/specificity.

Take a look at my tutorial here: Produce PCA for 1000 Genomes Phase III in VCF format

If you get through that, it would be a great start toward building your own model.



Hi Kevin, sorry about the delay in responding, and thanks for your reply. I understand from your link that you are using PCA to create the model. I don't quite see at which point in the process the AIMs are obtained. It would be really helpful if you could explain that a little. Thanks in advance.

Reply written 13 months ago by Inquisitive8995 (40)

The PCA bi-plot is based on markers that segregate the different 1000 Genomes populations. You then take these markers and test them through regression modelling to see which ones are best at segregating each group. At the end of the day, weather forecasting, predicting the stock markets, predicting ethnicity, et cetera are all based on modelling and then predicting.
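As a rough illustration of "test them through regression modelling", here is a minimal Python sketch that fits a one-marker logistic regression for each candidate SNP and ranks the markers by how well they separate one population from the rest. This is an assumption about one way to do it, not the model described above. The function names, the marker names and the toy genotypes (coded as 0/1/2 ALT allele counts) are all hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_single_marker(genotypes, labels, lr=0.1, steps=2000):
    """Fit label ~ sigmoid(b0 + b1 * genotype) by plain gradient ascent.

    Returns the mean log-likelihood of the fitted model (higher is better).
    """
    n = len(genotypes)
    b0 = b1 = 0.0
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(genotypes, labels):
            err = y - sigmoid(b0 + b1 * x)
            g0 += err
            g1 += err * x
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    ll = 0.0
    for x, y in zip(genotypes, labels):
        # Clamp to avoid log(0) when a marker separates the groups very well.
        p = min(max(sigmoid(b0 + b1 * x), 1e-12), 1.0 - 1e-12)
        ll += math.log(p) if y == 1 else math.log(1.0 - p)
    return ll / n


def rank_markers(genotype_table, labels):
    """Return (marker, mean log-likelihood) pairs, best separator first."""
    scores = {
        name: fit_single_marker(g, labels)
        for name, g in genotype_table.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy data: 1 = sample belongs to the population of interest, 0 = any other.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_genotypes = {
    "marker_A": [2, 2, 2, 1, 0, 0, 1, 0],  # separates the groups well
    "marker_B": [1, 2, 0, 1, 1, 0, 2, 1],  # same distribution in both groups
    "marker_C": [2, 1, 2, 1, 1, 0, 1, 0],  # partial separation
}
toy_ranking = rank_markers(toy_genotypes, toy_labels)
```

The markers at the top of the ranking are the candidate AIMs for that population. In practice you would use an established statistics package, evaluate on held-out samples, and repeat the one-vs-rest comparison for each population.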

Reply written 13 months ago by Kevin Blighe (33k)

Okay. Got it! Thanks a lot.

Reply written 13 months ago by Inquisitive8995 (40)