Question

Population Stratification In A Cohort Study And Analysis Of A Continuous Outcome Variable

2

12.1 years ago

Cmp ▴ 20

Hi, I recently started working in the field of genetic epidemiology, so many concepts are still quite new to me. My current research is part of a cohort study where my independent variable is a certain polymorphism and my outcome is a continuous variable. My study population comprises almost exclusively subjects of European origin. However, there are a few participants of other origins (e.g. African, Asian).

I've been reading about the potential risk of bias due to population stratification, but I've only found examples from case-control studies with binary outcomes. I need to decide whether excluding these few participants of non-European origin really matters in my study. Does anyone know (or have a reference on) how this risk of bias applies to a cohort study with a continuous outcome?

Thank you

population • 3.5k views

updated 12.1 years ago by Larry_Parnell 16k • written 12.1 years ago by Cmp ▴ 20

score 1 · Answer 1 · 2012-03-15

Hi CMP,

My understanding is that the bias can exist irrespective of whether your outcome variable is continuous or binary.
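To make this concrete, here is a small simulation sketch (not taken from any real cohort). It assumes two hypothetical subpopulations that differ both in allele frequency and in mean phenotype, with no true genotype effect at all. All numbers are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

n_per_group = 2000
group = np.repeat([0, 1], n_per_group)  # hypothetical subpopulation label

# Assumed allele frequencies: 0.1 in group 0, 0.6 in group 1.
genotype = rng.binomial(2, np.where(group == 0, 0.1, 0.6))

# Continuous phenotype depends on the group only, NOT on the genotype.
phenotype = rng.normal(loc=1.0 * group, scale=1.0)


def slope(x, y):
    """Simple-regression slope of y on x."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)


# Pooled analysis that ignores ancestry: the slope is pulled away from zero.
naive_slope = slope(genotype, phenotype)

# Analysis within each subpopulation: the slopes stay close to zero.
within_slopes = [
    slope(genotype[group == k], phenotype[group == k]) for k in (0, 1)
]

print("pooled slope:", round(naive_slope, 3))
print("within-group slopes:", [round(b, 3) for b in within_slopes])
```

The pooled slope is spurious: it reflects the group difference in both allele frequency and phenotype mean, which is exactly the stratification problem, here with a continuous outcome.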

Stratification can be assessed easily enough using EIGENSTRAT, which computes principal components from your genotype data and so lets you detect stratification or groupings in your data. You can then correct for stratification by including the relevant number of components as covariates in your analysis.

By covariate, I mean that you can include them as independent (predictor) variables in your model, i.e.,

Phenotype ~ Genotype + Component_1 + Component_2 + ... + Component_i
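A minimal sketch of that model in Python, using NumPy only. This is not EIGENSTRAT itself: the principal components are obtained from a plain SVD of a simulated marker matrix. The 500 markers, the two groups, and all frequencies are assumptions chosen for illustration, and the candidate SNP has no true effect.

```python
import numpy as np

rng = np.random.default_rng(1)

n_per_group, n_snps = 1000, 500
group = np.repeat([0, 1], n_per_group)  # hidden subpopulation label

# Hypothetical genome-wide markers whose allele frequencies differ by group.
freq_a = rng.uniform(0.1, 0.9, size=n_snps)
freq_b = np.clip(freq_a + rng.normal(0, 0.15, size=n_snps), 0.05, 0.95)
freqs = np.where(group[:, None] == 0, freq_a, freq_b)  # shape (n, n_snps)
markers = rng.binomial(2, freqs).astype(float)

# Candidate SNP and a continuous phenotype with NO true genotype effect.
candidate = rng.binomial(2, np.where(group == 0, 0.1, 0.6)).astype(float)
phenotype = rng.normal(loc=1.0 * group, scale=1.0)

# Principal components of the standardized marker matrix via SVD.
centered = markers - markers.mean(axis=0)
scaled = centered / centered.std(axis=0)
u, s, _ = np.linalg.svd(scaled, full_matrices=False)
n_components = 2
pcs = u[:, :n_components] * s[:n_components]  # per-individual PC scores


def genotype_coefficient(covariates):
    """Least-squares fit of Phenotype ~ Genotype + covariates; returns the genotype term."""
    columns = [np.ones_like(phenotype), candidate] + covariates
    design = np.column_stack(columns)
    beta, *_ = np.linalg.lstsq(design, phenotype, rcond=None)
    return beta[1]


naive = genotype_coefficient([])
adjusted = genotype_coefficient([pcs[:, k] for k in range(n_components)])

print("Phenotype ~ Genotype:", round(naive, 3))
print("Phenotype ~ Genotype + PC1 + PC2:", round(adjusted, 3))
```

In this toy setting the first component tracks the hidden group label, so adding the components to the model removes most of the spurious genotype coefficient. How many components to include in real data is a judgment call that this sketch does not address.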

Darren.

score 0 · Answer 2 · 2012-03-15

0

12.1 years ago

Larry_Parnell 16k

We used STRUCTURE to determine an ancestry index in Puerto Ricans (3 ancestral populations). We then performed the association test using the ancestry index as a covariate. PCA is a second good method for estimating ancestry components, after which you would use the resulting principal components (the eigenvectors) as covariates in the association test. The variables we tested were both continuous and categorical.

We found that in using STRUCTURE it is necessary to have good data for the reference populations in order to get the best predictions of the ancestry index.

0

Thank you Darren and Larry. I read your references. I understand the theory behind PCA to assess ancestry, and it's nice to see I can use it with both continuous and categorical variables.

However, I don't think this will be an option for my candidate gene study, as I only have data on a few SNPs in 1 gene (as I understand it, many more SNPs are needed to assess stratification correctly). Would an alternative be to use information on the country of birth of parents, grandparents and great-grandparents, and exclude participants with clear non-European ancestry?

12.1 years ago by Cmp ▴ 20

0

If sufficient data are available on country of birth for grandparents, I think you can use that, because I have heard of this strategy being employed. For the ancestry index in the Puerto Ricans, we used a minimal set of markers: a specific set of 100 SNPs.
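As a rough sketch of the birthplace-based exclusion, assuming each participant record lists the four grandparents' countries of birth: the field names, the country list, and the rule for handling missing values are all hypothetical and would need to be defined by the study.

```python
# Hypothetical set of countries counted as European for this study.
EUROPEAN_COUNTRIES = {"Spain", "Italy", "Germany", "France", "Netherlands"}

# Made-up records; None marks a grandparent with unknown country of birth.
participants = [
    {"id": "P1", "grandparent_birth_countries": ["Spain", "Spain", "Italy", "Spain"]},
    {"id": "P2", "grandparent_birth_countries": ["Spain", "Nigeria", "Spain", "Spain"]},
    {"id": "P3", "grandparent_birth_countries": ["Germany", "Germany", None, "Germany"]},
]


def classify(participant):
    """Label a participant from the grandparents' countries of birth."""
    countries = participant["grandparent_birth_countries"]
    known = [c for c in countries if c is not None]
    if any(c not in EUROPEAN_COUNTRIES for c in known):
        return "non_european"  # at least one grandparent born outside the list
    if len(known) < len(countries):
        return "unknown"  # incomplete data: decide separately how to handle
    return "european"


status = {p["id"]: classify(p) for p in participants}

# Keep only participants with all four grandparents born in listed countries.
analysis_ids = [pid for pid, label in status.items() if label == "european"]

print(status)
print(analysis_ids)
```

Whether participants with incomplete records are kept or dropped is a design choice; here they are flagged as "unknown" and left out of the analysis set.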

12.1 years ago by Larry_Parnell 16k