Question

Polygenic Risk Score analysis

0

5.7 years ago

pedro.raposo3 ▴ 20

Hi,

I'm very new to GWAS and PRS analysis, so my question is simple, but I cannot find a straightforward answer anywhere: is it possible, with a comprehensive database of risk scores associated with traits, to calculate polygenic risk scores for a specific genome? By this, I mean: can I "diagnose" a genome for all diseases already studied by previous genome-wide studies?

I'd assume it's not that simple - if that was the case, every paper would reference it.

Thank you for your time

PRS GWAS • 2.0k views

ADD COMMENT • link updated 2.1 years ago by Ram 45k • written 5.7 years ago by pedro.raposo3 ▴ 20

score 3 · Accepted Answer · 2019-10-18

3

5.7 years ago

Kevin Blighe 89k

Technically, it is possible, by doing the following:

constructing predictive models using all statistically significant GWAS hits for each condition / phenotype
cross-validating and refining the models on training and testing data
making model predictions on new data

Some extra points to consider:

statistically significant GWAS hits do not necessarily cause disease or confer a particular phenotype; they may only increase or decrease risk (that is to say, many of these variants have incomplete penetrance)
getting samples to do this work will be difficult
'polygenic risk score' is a generic term, and there are many ways to construct one. Most are built from the beta coefficients of the regression model fit
you should consider how you are going to build and fit the model. Something along the lines of elastic-net or ridge regression would be a start. Others have used lasso-penalised regression in the past to do something similar for breast cancer somatic variants.
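
As a rough illustration of the penalised-regression idea, here is a minimal pure-Python sketch of ridge regression fitted by gradient descent on a toy genotype matrix. The data, the function name `fit_ridge`, and the learning-rate settings are all made up for illustration. A real analysis would more likely use an established package, a model suited to a binary trait, and cross-validation to choose the penalty.

```python
def fit_ridge(X, y, lam, lr=0.01, n_iter=5000):
    """Ridge regression by gradient descent (no intercept, for brevity).

    Minimises (1 / 2n) * ||X beta - y||^2 + (lam / 2) * ||beta||^2.
    X: list of rows of effect-allele dosages (0, 1 or 2), one row per person.
    y: list of quantitative trait values, one per person.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        # Residuals under the current coefficients.
        resid = [sum(b * x for b, x in zip(beta, row)) - yi
                 for row, yi in zip(X, y)]
        # Gradient of the penalised loss for every coefficient.
        grad = [sum(r * row[j] for r, row in zip(resid, X)) / n + lam * beta[j]
                for j in range(p)]
        beta = [b - lr * g for b, g in zip(beta, grad)]
    return beta


# Toy data (hypothetical): 6 individuals, 3 SNPs.
X = [[0, 1, 2], [1, 0, 1], [2, 1, 0], [0, 2, 1], [1, 1, 1], [2, 0, 2]]
y = [1.2, 0.7, 1.1, 1.5, 1.0, 1.6]

beta_unpenalised = fit_ridge(X, y, lam=0.0)
beta_ridge = fit_ridge(X, y, lam=1.0)


def l2_norm(v):
    return sum(x * x for x in v) ** 0.5
```

The penalty pulls the coefficients towards zero, so `l2_norm(beta_ridge)` comes out smaller than `l2_norm(beta_unpenalised)`. That shrinkage is what the regularised approaches trade against fit to the training data.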

Note that replacing 'predictive models' with 'AI' or 'machine learning algorithm' will likely increase your chance of funding for the work, if that is ultimately what you want.

Kevin

ADD COMMENT • link 5.7 years ago by Kevin Blighe 89k

0

So, it's not as straightforward as simply calculating the PRS based on our genome's mapped SNPs, I see.

Thank you for the comprehensive answer!

ADD REPLY • link 5.7 years ago by pedro.raposo3 ▴ 20

1

Ah, if you want a more automated way to do it, then I would recommend taking a look at PRSice, by Sam.

ADD REPLY • link 5.7 years ago by Kevin Blighe 89k

1

Thank you for both of your answers

ADD REPLY • link 5.7 years ago by pedro.raposo3 ▴ 20

2

You can also look into our tutorial. However, I guess what you are asking is slightly different, in that you already have a PRS associated with a disease and a new genome that you want to calculate the score on. For that, you'll need to know which SNPs were used in the construction and what the weights are (these are usually beta coefficients from the GWAS, either used as is (e.g. PRSice) or regularized / shrunk (e.g. LDpred, lassosum, PRS-CS)). Once you have both pieces of information, you'll be able to re-calculate the score.
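
To make the re-calculation concrete, a minimal sketch: the score is the sum, over SNPs, of the weight times the number of copies of the effect allele the individual carries. The function name `polygenic_score`, the rsIDs, and the weights below are placeholders, not real associations. The sketch also assumes alleles are already on the same strand as the published weights, which real data would need checking for.

```python
def polygenic_score(weights, genotypes):
    """Weighted sum of effect-allele dosages for one individual.

    weights:   {snp_id: (effect_allele, beta)} from the published score
    genotypes: {snp_id: (allele1, allele2)} for the genome being scored
    Returns (score, number of SNPs that contributed).
    """
    score, used = 0.0, 0
    for snp, (effect_allele, beta) in weights.items():
        alleles = genotypes.get(snp)
        if alleles is None:
            continue  # SNP not available in this genome; skipped
        dosage = alleles.count(effect_allele)  # 0, 1 or 2 copies
        score += beta * dosage
        used += 1
    return score, used


# Placeholder inputs, for illustration only.
weights = {
    "rs0000001": ("A", 0.12),
    "rs0000002": ("T", -0.05),
    "rs0000003": ("G", 0.30),
}
genome = {
    "rs0000001": ("A", "G"),
    "rs0000002": ("T", "T"),
    "rs0000004": ("C", "C"),
}

score, n_used = polygenic_score(weights, genome)
```

Here only two of the three weighted SNPs are present in the genome, so `n_used` is 2. Reporting that count alongside the score helps show when a score is based on an incomplete SNP set.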

ADD REPLY • link 5.7 years ago by Sam ★ 4.8k

0

Thank you, Sam. So, it seems that I can achieve that with the GWAS Catalog, since it collects many different GWAS and most of the SNPs have a beta coefficient associated with them.

ADD REPLY • link 5.7 years ago by pedro.raposo3 ▴ 20

1

Yes, you can, but beware that using only the significant SNPs tends to generate an underpowered PRS. Also, if the study of interest uses SNPs that fall outside the genome-wide significance threshold, then it is likely that you won't have the information required to regenerate the score.
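
A small sketch of that caveat, using made-up summary statistics: filtering at the conventional genome-wide threshold (p ≤ 5e-8) keeps far fewer SNPs than a relaxed threshold, and a published score built with the relaxed set cannot be fully regenerated from the significant hits alone. The helper names `select_snps` and `coverage` and all values are hypothetical.

```python
GENOME_WIDE = 5e-8  # conventional genome-wide significance threshold


def select_snps(summary_stats, p_threshold):
    """Keep the beta of every SNP whose p-value is at or below the threshold."""
    return {snp: beta
            for snp, (beta, p) in summary_stats.items()
            if p <= p_threshold}


def coverage(required_snps, available_weights):
    """Fraction of a score's SNPs for which a weight is available."""
    required = set(required_snps)
    return len(required & set(available_weights)) / len(required)


# Made-up summary statistics: {snp_id: (beta, p_value)}.
summary_stats = {
    "snpA": (0.20, 1e-12),
    "snpB": (0.08, 3e-6),
    "snpC": (-0.05, 0.01),
    "snpD": (0.15, 4e-9),
}

significant_only = select_snps(summary_stats, GENOME_WIDE)
relaxed = select_snps(summary_stats, 0.05)

# Suppose the published score used all four SNPs.
score_snps = ["snpA", "snpB", "snpC", "snpD"]
frac = coverage(score_snps, significant_only)
```

In this toy case only `snpA` and `snpD` survive the genome-wide filter, so `frac` is 0.5: half of the weights needed to regenerate the score are missing.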

ADD REPLY • link 5.7 years ago by Sam ★ 4.8k