Question about Risk Scores
3.4 years ago
lisaks ▴ 30

Hello everybody,

I am very new to bioinformatics and come from a quantitative economics background. I was asked to help with a project on creating a risk score consisting of multiple SNPs and since then I have been doing some research on this.

One option I have come across is a simple unweighted formula (like a TGS).

I am more interested in giving weights to the SNPs, as this should make the results more accurate(?). My main question is whether I understand the process correctly and whether my idea is feasible:

  • I have predetermined SNPs of interest that are associated with a specific trait

  • calculate/get the betas for the chosen SNPs from GWAS

  • use these betas to weight and calculate a risk score for my own sample

Is this plausible?
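
For concreteness, the three steps above amount to something like the following minimal sketch. The SNP IDs, betas and genotypes are made-up placeholders, not real GWAS results, and the sketch assumes the genotypes are coded as counts of the same effect allele that the GWAS betas refer to.

```python
# Hypothetical per-SNP effect sizes (betas), as they would be read from
# GWAS summary statistics. These values are invented for illustration.
betas = {"rs1": 0.12, "rs2": -0.05, "rs3": 0.30}

# Hypothetical genotypes: number of copies of the effect allele (0, 1 or 2)
# carried by each individual at each SNP.
genotypes = {
    "sample_A": {"rs1": 2, "rs2": 0, "rs3": 1},
    "sample_B": {"rs1": 0, "rs2": 2, "rs3": 0},
}


def unweighted_score(dosages):
    """Unweighted score: simply count effect alleles across the SNPs."""
    return sum(dosages.values())


def weighted_score(dosages, betas):
    """Weighted score: sum of beta * effect-allele count over the SNPs."""
    return sum(betas[snp] * dosages[snp] for snp in betas)


for sample, dosages in genotypes.items():
    print(sample, unweighted_score(dosages), round(weighted_score(dosages, betas), 3))
```

The step that most often goes wrong in practice is allele alignment: if the effect allele in the GWAS is not the allele being counted in the target sample, the sign of that beta has to be flipped (or the count changed to 2 minus the count).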

I've read about PLINK and tools such as lassosum, PRSice, LDpred and PRS-CS. I don't fully understand what the best way is to calculate/get the betas from GWAS.

I would be really thankful for any tips and help regarding this. Thanks in advance for taking the time to read and respond to my message :)

SNP PRS risk scores R GWAS • 1.1k views
3.4 years ago

There is no standard in this area, but you have the general idea correct. I have already seen people do:

  • summing / totalling the beta coefficients
  • multiplying the beta coefficients by some other weight
  • summing and scaling the beta coefficients to be between 0-1, 0-10, or 0-100, etc.
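
As a rough illustration of the last option, here is a minimal sketch of min-max scaling raw scores onto a fixed range such as 0-100. The function name `rescale` and the input scores are hypothetical; other scalings (e.g. z-scores or percentiles against a reference population) are just as common.

```python
def rescale(scores, lo=0.0, hi=100.0):
    """Min-max scale a dict of raw scores onto the interval [lo, hi]."""
    smin = min(scores.values())
    smax = max(scores.values())
    if smax == smin:
        # All scores identical: there is no spread to scale, so return lo.
        return {sample: lo for sample in scores}
    span = smax - smin
    return {
        sample: lo + (value - smin) * (hi - lo) / span
        for sample, value in scores.items()
    }


# Hypothetical raw weighted scores for three individuals.
raw_scores = {"sample_A": 0.54, "sample_B": -0.10, "sample_C": 0.22}
scaled = rescale(raw_scores)
```

Note that a min-max scaled score is only meaningful relative to the samples it was computed on; adding or removing individuals changes everyone's scaled value.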

In my own work in the private sector, years ago, I used a Bayesian logistic regression and 'pre-adjusted' the beta coefficients by supplying conservation scores (log-scale) as priors - conservation score is arguably the single best predictor of pathogenicity / functionality of a genetic variant. If a region is highly conserved, the effect is to increase the beta coefficient.

Using a PRS may not necessarily be any more accurate than a standard model that includes, e.g., the minor allele counts of the SNPs as predictors. You could also add the computed PRS to this model, which may improve accuracy. I am just not sure that any PRS can account for the complexity of how the genome works.

Kevin

Thanks for your quick and helpful response!
