Question: Question about Risk Scores
9 weeks ago, lisaks wrote:

Hello everybody,

I am very new to bioinformatics and come from a quantitative economics background. I was asked to help with a project on creating a risk score consisting of multiple SNPs and since then I have been doing some research on this.

From what I've read, one option is an unweighted approach based on a simple formula (like a TGS).

I am interested in the approach of giving weights to the SNPs, as this should make the results more accurate(?). My main question is whether I understand the process correctly and whether my ideas are feasible:

  • I have predetermined SNPs of interest that are associated with a specific trait

  • calculate/get the betas for the chosen SNPs from GWAS

  • use these betas to weight and calculate a risk score for my own sample

Is this plausible?
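
To make the steps above concrete, here is a minimal Python sketch that contrasts the unweighted count with a beta-weighted score. Everything in it is an assumption for illustration: the rsIDs, the beta values, the sample names and the function names `unweighted_score` and `weighted_score` are made up, and genotypes are assumed to be coded as the number of copies (0, 1 or 2) of the same effect allele that the GWAS betas refer to.

```python
# Hypothetical GWAS effect sizes (betas) per SNP; values are invented.
betas = {"rs0001": 0.12, "rs0002": 0.05, "rs0003": 0.30}

# Hypothetical genotypes: copies of the effect allele (0, 1 or 2) per SNP.
genotypes = {
    "sample_A": {"rs0001": 2, "rs0002": 0, "rs0003": 1},
    "sample_B": {"rs0001": 0, "rs0002": 2, "rs0003": 0},
}


def unweighted_score(dosages, snps):
    """Count effect alleles across the chosen SNPs (every SNP counts equally)."""
    return sum(dosages[snp] for snp in snps)


def weighted_score(dosages, betas):
    """Sum of beta * effect-allele count across the chosen SNPs."""
    return sum(beta * dosages[snp] for snp, beta in betas.items())


for sample, dosages in genotypes.items():
    print(
        sample,
        unweighted_score(dosages, betas),
        round(weighted_score(dosages, betas), 3),
    )
```

The weighted score only makes sense if the allele counted in the sample is the same allele the beta was estimated for, so the alleles would need to be checked and harmonised against the GWAS summary statistics first.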

I've read about PLINK and tools such as lassosum, PRSice, LDpred and PRS-CS. I don't fully understand what the best way is to calculate or obtain the betas from a GWAS.

I would be really thankful for any tips and help regarding this. Thanks in advance for taking the time to read and respond to my message :)

gwas risk scores snp prs R • 158 views
9 weeks ago, Kevin Blighe (Republic of Ireland) wrote:

There is no standard in this area, but you have the general idea correct. I have seen people do the following:

  • summing / totalling the beta coefficients
  • multiplying the beta coefficients by some other weight
  • summing and scaling the beta coefficients to be between 0-1, 0-10, or 0-100, etc.
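
As a rough illustration of the last option, the sketch below rescales raw summed scores to a fixed range. It assumes min-max scaling, which is only one possible way to do the scaling; the function name `rescale`, the sample names and the raw scores are hypothetical.

```python
def rescale(scores, new_max=1.0):
    """Min-max scale raw scores to the range 0..new_max (an assumed scaling choice)."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All samples have the same score, so there is no spread to scale.
        return {name: 0.0 for name in scores}
    return {name: (s - lo) / (hi - lo) * new_max for name, s in scores.items()}


# Hypothetical raw scores, e.g. sums of beta * effect-allele count per sample.
raw = {"sample_A": 0.54, "sample_B": 0.10, "sample_C": 0.32}

print(rescale(raw))               # scaled to 0-1
print(rescale(raw, new_max=100))  # scaled to 0-100
```

Scaled this way, a score is relative to the samples it was computed with, so adding or removing samples can change every scaled value.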

In my own approach in the private sector, years ago, I used a Bayesian logistic regression and 'pre-adjusted' the beta coefficients by supplying conservation scores (log-scale) as priors - conservation score is the single best predictor of pathogenicity / functionality of a genetic variant. If a region is highly conserved, the effect would be to increase the beta coefficient.

Using a PRS may not necessarily be any more accurate than a standard model that includes, e.g., the minor alleles of the SNPs. You could also add the computed PRS to this model, which may improve accuracy. I am just not sure that any PRS can account for the complexity of how the genome works.



Thanks for your quick and helpful response!

Reply written 8 weeks ago by lisaks