Population Stratification In A Cohort Study And Analysis Of A Continuous Outcome Variable
12.1 years ago
Cmp ▴ 20

Hi, I recently started working in the field of genetic epidemiology, so many concepts are still quite new to me. My current research is part of a cohort study in which my independent variable is a certain polymorphism and my outcome is a continuous variable. My study population comprises almost exclusively subjects of European origin. However, there are a few participants of other origins (e.g. African, Asian).

I've been reading about the potential risk of bias due to population stratification, but I've only found examples from case-control studies with binary outcomes. I need to decide whether excluding these few participants of non-European origin really matters in my study. So, does anyone know (or have a reference on) how this risk of bias applies to a cohort study with a continuous outcome?

Thank you

population • 3.5k views
12.1 years ago

Hi CMP,

My understanding is that the bias can exist irrespective of whether your outcome variable is continuous, binary, etc.

Stratification can be assessed easily enough using EIGENSTRAT, which computes principal components from your genotype data and so lets you detect stratification or groupings in your sample. You can then correct for stratification by including the relevant number of components as covariates in your analysis.

By covariate, I mean that you can include them as additional independent (predictor) variables in your model, alongside the genotype, i.e.,

Phenotype ~ Genotype + Component_1 + Component_2 + ... + Component_i
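
To make the model above concrete, here is a minimal sketch in Python using only NumPy. It is not EIGENSTRAT itself: the principal components are computed with a plain SVD of a standardized genotype matrix, and the data are simulated. The two populations, their allele frequencies, the phenotype shift and all function names are assumptions made up for illustration.

```python
import numpy as np

def top_principal_components(genotypes, n_components):
    """Per-sample scores on the top principal components.

    genotypes: (n_samples, n_snps) array of allele counts (0, 1 or 2).
    """
    G = np.asarray(genotypes, dtype=float)
    G = G - G.mean(axis=0)               # centre each SNP
    sd = G.std(axis=0)
    G = G[:, sd > 0] / sd[sd > 0]        # drop monomorphic SNPs, scale the rest
    U, S, _ = np.linalg.svd(G, full_matrices=False)
    return U[:, :n_components] * S[:n_components]

def fit_linear(phenotype, snp, covariates=None):
    """Least-squares fit of phenotype ~ intercept + snp (+ covariates)."""
    columns = [np.ones(len(snp)), snp]
    if covariates is not None:
        columns.append(covariates)
    X = np.column_stack(columns)
    beta, *_ = np.linalg.lstsq(X, phenotype, rcond=None)
    return beta                          # beta[1] is the genotype coefficient

# Simulated example (all numbers are illustrative assumptions).
rng = np.random.default_rng(0)
n_per_pop, n_snps = 500, 200
pop = np.repeat([0, 1], n_per_pop)       # hidden population label

# Genome-wide SNPs whose allele frequencies differ between the populations.
freq_a = rng.uniform(0.1, 0.9, n_snps)
freq_b = np.clip(freq_a + rng.normal(0.0, 0.2, n_snps), 0.05, 0.95)
freqs = np.where(pop[:, None] == 0, freq_a, freq_b)
genotypes = rng.binomial(2, freqs)

# Candidate SNP: frequency differs by population, but no true effect.
snp = rng.binomial(2, np.where(pop == 0, 0.2, 0.6))
# Continuous phenotype: mean differs by population only.
phenotype = 2.0 * pop + rng.normal(0.0, 1.0, pop.size)

pcs = top_principal_components(genotypes, n_components=2)
naive = fit_linear(phenotype, snp)             # Phenotype ~ Genotype
adjusted = fit_linear(phenotype, snp, pcs)     # Phenotype ~ Genotype + PCs

print("naive genotype effect:   ", naive[1])
print("adjusted genotype effect:", adjusted[1])
```

In this simulation the unadjusted genotype coefficient is pulled away from zero purely by the population difference, and adding the components as covariates brings it back towards zero. Note that the components are computed from many SNPs across the genome, not from the candidate SNP alone.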

Darren.

12.1 years ago

We used STRUCTURE to determine an ancestry index in Puerto Ricans (3 ancestral populations). We then performed the association test using the ancestry index as a covariate. PCA is a second good method for estimating ancestry components; you would then use each subject's scores on the top components (the eigenvectors, not the eigenvalues) as covariates in the association test. The variables we tested were both continuous and categorical.

We found that in using STRUCTURE it is necessary to have good data for the reference populations in order to get the best predictions of the ancestry index.


Thank you Darren and Larry. I read your references. I understand the theory behind PCA to assess ancestry, and it's nice to see I can use it with both continuous and categorical variables.

However, I don't think this will be an option for my candidate gene study, as I only have data on a few SNPs in 1 gene (as I understand it, I would need many more SNPs to assess stratification correctly). Would it be an alternative to use information on the country of birth of parents, grandparents and great-grandparents and exclude those participants with clear non-European ancestry?
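
As a rough sketch of what such an exclusion rule could look like in Python: the list of countries, the record layout and the function name below are all hypothetical, and the rule shown (all four grandparents with a known European country of birth) is just one possible choice.

```python
# Hypothetical set of countries treated as European for this illustration.
EUROPEAN_COUNTRIES = {"Spain", "France", "Germany", "Italy", "Netherlands"}

def has_european_grandparents(grandparent_countries, min_known=4):
    """True only if at least `min_known` grandparents have a known country
    of birth and every known country is in EUROPEAN_COUNTRIES."""
    known = [c for c in grandparent_countries if c]
    return len(known) >= min_known and all(c in EUROPEAN_COUNTRIES for c in known)

# Made-up participant records; None marks a missing country of birth.
participants = [
    {"id": "P1", "grandparents": ["Spain", "Spain", "France", "Spain"]},
    {"id": "P2", "grandparents": ["Spain", "Nigeria", "Spain", "Spain"]},
    {"id": "P3", "grandparents": ["Italy", None, "Italy", "Italy"]},
]

kept = [p["id"] for p in participants
        if has_european_grandparents(p["grandparents"])]
print(kept)
```

With this rule a participant with any missing grandparent record is excluded as well; lowering `min_known` would relax that.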


If sufficient data are available on the grandparents' country of birth, I think you can use that, because I have heard of this strategy being employed. We used a minimum set of markers to determine the ancestry index in the Puerto Ricans: a specific set of 100 SNPs.
