Question

Dealing With Population Stratification In GWAS

1

12.0 years ago

Nasir ▴ 270

My question is a bit general, and I would be very grateful for any advice. I have data from a GWAS on unrelated Western European patients with a sporadic disease. Is there a 'gold standard' way of dealing with population stratification/substructure/ancestry in GWAS QC and analysis? If not, is there a leading or widely used and accepted approach?

In the past I have used EIGENSOFT/smartpca to build a principal component model from HapMap genotype data for Europe (CEU), Asia (CHB + JPT) and Africa (YRI), then clustered my samples alongside the HapMap samples and excluded outliers 'by eye'. I'm sure there are better approaches available. Could people please suggest some? If someone could direct me to an online step-by-step tutorial, if one is available, that would be much appreciated!
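For readers who want something more reproducible than excluding outliers 'by eye', the workflow described above can be sketched roughly as follows. This is a minimal illustration in plain NumPy, not smartpca itself: it assumes genotypes are already coded as 0/1/2 allele counts in a samples-by-SNPs matrix with no missing calls, and the function names (`pca_scores`, `flag_outliers`) and the 6-SD cutoff are hypothetical choices made for the example.

```python
import numpy as np


def pca_scores(genotypes, n_components=2):
    """Project samples onto the top principal components.

    genotypes: (n_samples, n_snps) array of 0/1/2 allele counts
    (assumed complete, i.e. no missing genotypes).
    """
    g = np.asarray(genotypes, dtype=float)
    g = g - g.mean(axis=0)              # centre each SNP
    sd = g.std(axis=0)
    keep = sd > 0                       # drop monomorphic SNPs
    g = g[:, keep] / sd[keep]           # scale each SNP to unit variance
    u, s, _ = np.linalg.svd(g, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


def flag_outliers(scores, is_reference, n_sd=6.0):
    """Flag samples lying more than n_sd reference standard deviations
    from the reference-cluster centroid on any principal component.

    is_reference: boolean mask marking the reference samples
    (for example the HapMap CEU individuals).
    """
    ref = scores[is_reference]
    center = ref.mean(axis=0)
    spread = ref.std(axis=0)
    z = np.abs(scores - center) / spread
    return (z > n_sd).any(axis=1)
```

In use, one would stack the study samples with the reference samples, call `pca_scores` on the combined matrix, and pass a mask for the European reference samples to `flag_outliers`. The cutoff is a judgment call and is worth checking against a plot of the first two components.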

Nasir

pca • 5.2k views

updated 12.0 years ago by Larry_Parnell 16k • written 12.0 years ago by Nasir ▴ 270

1

You should try STRUCTURE (http://pritch.bsd.uchicago.edu/structure.html). I prefer its graphical output, and you also get an output file with the percentage of each ancestral population for each individual.

12.0 years ago by Maxime Lamontagne ★ 2.3k

1

Yes, we used STRUCTURE for our analysis of a Puerto Rican population.

12.0 years ago by Larry_Parnell 16k

score 2 · Answer 1 · 2012-04-17

2

12.0 years ago

1234Jc4321 ▴ 450

Check out this post: http://www.biostars.org/post/show/16715/software-for-inferring-population-structure/#39037

score 2 · Answer 2 · 2012-04-17

I'd look at the panel of ancestry informative markers used by Seldin, particularly the last in the list here:

An ancestry informative marker set for determining continental origin: validation and extension using human genome diversity panels. Nassir R, Kosoy R, Tian C, White PA, Butler LM, Silva G, Kittles R, Alarcon-Riquelme ME, Gregersen PK, Belmont JW, De La Vega FM, Seldin MF. BMC Genet. 2009 Jul 24;10:39. PMID: 19630973

Ancestry informative marker sets for determining continental origin and admixture proportions in common populations in America. Kosoy R, Nassir R, Tian C, White PA, Butler LM, Silva G, Kittles R, Alarcon-Riquelme ME, Gregersen PK, Belmont JW, De La Vega FM, Seldin MF. Hum Mutat. 2009 Jan;30(1):69-78. PMID: 18683858

A genomewide single-nucleotide-polymorphism panel for Mexican American admixture mapping. Tian C, Hinds DA, Shigeta R, Adler SG, Lee A, Pahl MV, Silva G, Belmont JW, Hanson RL, Knowler WC, Gregersen PK, Ballinger DG, Seldin MF. Am J Hum Genet. 2007 Jun;80(6):1014-23. PMID: 17557415

European population substructure: clustering of northern and southern populations. Seldin MF, Shigeta R, Villoslada P, Selmi C, Tuomilehto J, Silva G, Belmont JW, Klareskog L, Gregersen PK. PLoS Genet. 2006 Sep 15;2(9):e143. Epub 2006 Jul 25. PMID: 17044734