Hi all,

I have a quick question. In a seminar talk, Cathryn Lewis (Professor of Genetic Epidemiology and Statistics at King's College London) said that when you run a regression with a polygenic risk score (PRS) as the independent variable (IV) predicting your phenotype of interest (the dependent variable, DV), you can add covariates to control for. These are normally principal components, genotyping batch, and anything else that is correlated with the PRS, as opposed to covariates such as age and IQ, which are correlated with the phenotype (i.e., the DV) instead.

I was looking for a paper to check that I had understood this properly, and to cite instead of the seminar talk, but I couldn't find one. Could you confirm whether it is common practice to adjust only for covariates associated with the genetic IV, and/or recommend a paper?

Thanks so much, Silvia

From what I understood, these (i.e., the PC covariates, age, BMI, etc.) are more typically included when deriving the polygenic risk scores (PRS) themselves, so the per-SNP model would be something like: phenotype ~ SNP + PCs + age + BMI + other covariates

The PRS is then typically constructed from the beta coefficients for the SNPs from this model. I see no further need to adjust for covariates when using these scores elsewhere, because the covariates are already 'absorbed' into the scores.
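
To make the derivation step concrete, here is a minimal sketch in Python (NumPy only). Everything in it is an assumption for illustration: the data are simulated, the covariates are stand-ins for a principal component and age, and the names `snp_beta`, `G`, `covars`, and `G_target` are made up here rather than taken from any particular PRS tool. It only shows the covariate-adjusted per-SNP regression and the weighted sum that forms the score.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 500, 5  # toy sizes: individuals and SNPs

# Simulated discovery sample (all values are made up for illustration)
G = rng.binomial(2, 0.3, size=(n, m)).astype(float)   # allele counts 0/1/2
covars = np.column_stack([
    rng.normal(size=n),            # stand-in for a principal component
    rng.normal(50, 10, size=n),    # stand-in for age
])
true_beta = np.array([0.5, -0.3, 0.0, 0.2, 0.4])
y = G @ true_beta + 0.02 * covars[:, 1] + rng.normal(size=n)


def snp_beta(y, snp, covars):
    """Fit y ~ intercept + SNP + covariates by OLS and return the SNP coefficient."""
    X = np.column_stack([np.ones(len(y)), snp, covars])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1]  # column 1 of X is the SNP


# One covariate-adjusted regression per SNP
betas = np.array([snp_beta(y, G[:, j], covars) for j in range(m)])

# PRS for a separate (simulated) target sample: beta-weighted sum of allele counts
G_target = rng.binomial(2, 0.3, size=(10, m)).astype(float)
prs = G_target @ betas
```

Real pipelines usually add steps that this sketch leaves out, such as clumping or p-value thresholding of SNPs before summing.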

Examples:

It all depends on how the PRS was calculated in the first place. Remember that 'risk score' is a general term with no clear definition. If you ask two Professors of Statistics 'What are risk scores?', they will give different answers.