Question: PLINK logistic regression controlling for population structure in a tri-hybrid admixed population
3 months ago by
Guilherme20 wrote:


I have run an analysis in STRUCTURE to obtain individual ancestry estimates for my population sample. My population is tri-hybrid, with European, African and Amerindian ancestry. My question is:

Since it's a tri-hybrid admixed population, do I have to include all three columns (EUR, AFR, AMR) as covariates in the logistic regression model in order to control for possible confounding effects from population stratification?

I'm asking this because I have seen people using only two (for example, EUR and AMR, or AFR and AMR)...
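A small sketch of why two columns are often used instead of three: ancestry proportions of the kind STRUCTURE reports sum to 1 for each individual, so in a model that also has an intercept the third column is fully determined by the other two. The sample IDs and proportions below are made up for illustration, and `is_redundant` is a hypothetical helper, not part of any package.

```python
# Hypothetical ancestry proportions (EUR, AFR, AMR) for four individuals.
# Each row sums to 1, as admixture proportions do.
ancestry = {
    "ind1": (0.60, 0.25, 0.15),
    "ind2": (0.10, 0.70, 0.20),
    "ind3": (0.33, 0.33, 0.34),
    "ind4": (0.05, 0.15, 0.80),
}


def is_redundant(rows, tol=1e-9):
    """Return True if EUR equals intercept - AFR - AMR for every individual.

    When this holds, EUR adds no information beyond the intercept (a column
    of ones) plus AFR and AMR, i.e. the three columns are perfectly collinear
    with the intercept.
    """
    intercept = 1.0
    return all(
        abs((intercept - afr - amr) - eur) < tol
        for eur, afr, amr in rows
    )


print(is_redundant(ancestry.values()))
```

With perfectly collinear covariates the regression coefficients are not uniquely identifiable, which is the usual reason for dropping one of the three columns.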


modified 3 months ago by Kevin Blighe • written 3 months ago by Guilherme

Are you interested in doing single-marker analyses (GWAS)? If so, feed the data to GEMMA, which can compute a kinship (relatedness) matrix. Then run the association with the kinship matrix, your phenotype and your covariates. The kinship matrix accounts for population stratification in the mixed model, and GEMMA works well for admixed populations.
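To make the kinship idea concrete, here is a minimal pure-Python sketch of a centered relatedness matrix, which is (to my understanding) the kind of matrix GEMMA's centered option computes. It is an illustration only, not GEMMA's code; the function name and the toy genotypes are assumptions, and missing genotypes are not handled.

```python
def centered_relatedness(genotypes):
    """Compute K = X_c X_c^T / p for an n x p genotype matrix.

    genotypes: list of n per-individual lists, each holding p allele
    counts (0, 1 or 2). X_c is the matrix after subtracting each SNP's mean.
    """
    n = len(genotypes)
    p = len(genotypes[0])
    # Mean allele count per SNP (column).
    means = [sum(ind[j] for ind in genotypes) / n for j in range(p)]
    centered = [[ind[j] - means[j] for j in range(p)] for ind in genotypes]
    return [
        [sum(a * b for a, b in zip(centered[i], centered[k])) / p
         for k in range(n)]
        for i in range(n)
    ]


# Toy data: individuals 0 and 1 have similar genotypes, individual 2 differs.
toy = [
    [0, 1, 2, 0],
    [0, 1, 2, 1],
    [2, 1, 0, 1],
]
K = centered_relatedness(toy)
for row in K:
    print(" ".join(f"{v: .3f}" for v in row))
```

Individuals with similar genotypes get a larger entry in K than dissimilar ones, and that is the structure the mixed model uses to absorb stratification and relatedness.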

modified 3 months ago • written 3 months ago by Bioinformatics_NewComer
3 months ago by
Kevin Blighe19k
University College London Cancer Institute
Kevin Blighe19k wrote:

Ciao Guilherme,

I am not so sure about the choice of STRUCTURE. Its reliability has been called into question; see "The computer program STRUCTURE does not reliably identify the main genetic clusters within species: simulations and implications for human population structure".

If your population has mixed ethnicity, then population structure (i.e., ethnicity) will most likely be a confounder for which you should adjust in your logistic regression model. You can check the extent of the confounding visually (and statistically, via various metrics) through PCA; see the Biostars tutorial "Produce PCA bi-plot for 1000 Genomes Phase III in VCF format".

The standard way to adjust for population structure is to include each sample's eigenvector values (i.e., its scores on PC1 and PC2) as covariates, or more PCs if you feel it necessary. PCA can easily be performed within PLINK or with other programs, such as EIGENSOFT or GCTA (the PCA implementation in PLINK is the same as GCTA's).
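As a rough sketch of the PC-covariate step, the Python below turns PLINK 1.9-style `.eigenvec` output (assumed layout: FID, IID, then one column per PC, no header) into a covariate file with named columns. The file prefixes `mydata` and `mypca` in the comments, the helper names, and the example values are all hypothetical; check the PLINK documentation for your version before relying on the commands shown.

```python
import io

# Assumed PLINK 1.9 workflow around this helper (prefixes are placeholders):
#   plink --bfile mydata --pca 10 --out mypca
#   plink --bfile mydata --logistic hide-covar \
#         --covar covar.txt --covar-name PC1,PC2 --out assoc


def read_eigenvec(handle, n_pcs=2):
    """Parse lines of 'FID IID PC1 PC2 ...' and keep the first n_pcs PCs."""
    rows = []
    for line in handle:
        fields = line.split()
        if not fields or fields[0] in ("FID", "#FID"):
            continue  # skip blank lines and an optional header line
        pcs = [float(v) for v in fields[2:2 + n_pcs]]
        rows.append((fields[0], fields[1], pcs))
    return rows


def write_covar(rows, handle, n_pcs=2):
    """Write a whitespace-delimited covariate table with a header line."""
    header = ["FID", "IID"] + [f"PC{i + 1}" for i in range(n_pcs)]
    handle.write(" ".join(header) + "\n")
    for fid, iid, pcs in rows:
        handle.write(" ".join([fid, iid] + [f"{v:.6g}" for v in pcs]) + "\n")


# Made-up eigenvec content standing in for mypca.eigenvec.
example = io.StringIO(
    "F1 S1 0.0123 -0.0456 0.0789\n"
    "F2 S2 -0.0234 0.0567 -0.0891\n"
)
rows = read_eigenvec(example)
out = io.StringIO()
write_covar(rows, out)
covar_text = out.getvalue()
print(covar_text, end="")
```

Whether two PCs are enough is a judgment call; a scree plot or pairwise PC plots coloured by known ancestry can help decide how many to include.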

There is a previous Biostars thread relating to a similar topic: GWAS: when is it appropriate to add covariates?




Thank you so much for the enlightenment Kevin! I'm going to take a look at PCA.

written 3 months ago by Guilherme

You're welcome, mate

written 3 months ago by Kevin Blighe