How to estimate a polygenic risk score (PRS) for one individual using the scoring files from the PGS Catalog?
17 months ago

Hi all,

I have an annotated VCF file for one individual, and I want to estimate that individual's polygenic risk score (PRS) for a certain trait using the scoring files from the PGS Catalog. The scoring file contains the SNP ID, the reference and alternative alleles, and the weights. How can I estimate the PRS directly from the scoring file, without the classical approach that requires GWAS summary data, target data, etc.?

Thank you very much for your consideration.

prs genome snp • 2.3k views
17 months ago
zx8754 11k

The PGS Catalog scoring file has all you need to calculate the score. For each SNP, count how many copies of the effect allele the individual carries, multiply that count by the SNP's weight (beta), and then sum the results over all SNPs.


Thank you very much for your answer. One last question: since each allele represents a single-nucleotide change, should I use dummy coding to turn the nucleotides into numbers and then multiply by the betas, or is there another approach?


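Yes, code each genotype as the number of copies of the effect allele (0, 1 or 2). If the effect allele is "A" and the genotype is "A A", then the contribution is 2 * beta.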
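
A minimal sketch of that counting rule, in Python. The function name `effect_allele_dosage`, the genotype string formats and the beta value are assumptions made for illustration; they do not come from the PGS Catalog or from any particular tool.

```python
def effect_allele_dosage(genotype, effect_allele):
    """Count copies of effect_allele in a genotype such as 'A A', 'A/G' or 'A|G'."""
    # Treat space, '/' and '|' as allele separators.
    alleles = genotype.replace("/", " ").replace("|", " ").split()
    return sum(1 for allele in alleles if allele.upper() == effect_allele.upper())


beta = 0.12  # made-up weight, for illustration only

for genotype in ("A A", "A G", "G G"):
    dosage = effect_allele_dosage(genotype, "A")
    print(genotype, dosage, dosage * beta)
```

With effect allele "A", the three genotypes get dosages 2, 1 and 0, so only the first two contribute to the score.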


Does anyone have any software or script that performs these calculations?


It is a one-liner: multiply the genotypes (effect allele counts) by the coefficients and sum them.
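
As a rough illustration of the "multiply and sum" step, assuming the effect allele counts have already been worked out for every SNP. The SNP IDs, counts and weights below are made up, and `simple_prs` is a hypothetical helper name.

```python
def simple_prs(dosages, weights):
    """Sum weight * effect-allele count over SNPs present in both dicts."""
    return sum(weights[snp] * dosages[snp] for snp in weights if snp in dosages)


# Hypothetical data: effect-allele counts for one individual and per-SNP weights.
dosages = {"rs1": 2, "rs2": 1, "rs3": 0}
weights = {"rs1": 0.10, "rs2": -0.05, "rs3": 0.30}

prs = simple_prs(dosages, weights)
print(round(prs, 6))
```

SNPs that appear in the scoring file but not in `dosages` are silently skipped here, which is a simplification: for a real VCF, a missing record may mean the genotype is homozygous reference rather than unknown.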


Maybe try PRSice-2: https://www.prsice.info/

12 months ago

Hi, there's a new Nextflow module at nf-core, imputeme, that can do that. It is meant for exactly your use case, and I believe it handles the key issues raised here. I disagree that it is a "one-liner", as some comments suggest, for several reasons. A main one is that the OP has an annotated VCF file, and VCF files typically have no record at positions that are homozygous reference, whereas the effect alleles in PGS Catalog data are not necessarily matched to the REF and ALT notation. Also, don't let the name trick you: when the input is whole-genome sequencing data, no imputation takes place. That is only for microarray-based inputs. Here's the link, it should fit right into any Nextflow pipeline: https://github.com/nf-core/modules/tree/master/modules/imputeme/vcftoprs
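
To make the two complications concrete, here is a small Python sketch. It is not how the imputeme module works; it only illustrates the problem. It assumes that a position absent from the VCF is homozygous reference (true only if the site was actually covered and callable), that each scoring row carries a `ref` field holding the reference-genome allele (a hypothetical field you would have to look up yourself), and that alleles are on the same strand. All data are made up.

```python
def dosage_from_vcf(site, calls):
    """Effect-allele count for one scoring-file row, or None if the call is missing.

    site  : dict with 'chrom', 'pos', 'ref', 'effect_allele' (hypothetical layout)
    calls : {(chrom, pos): (ref, [alt, ...], 'GT string such as 0/1')}
    """
    key = (site["chrom"], site["pos"])
    if key not in calls:
        # Assumption: no VCF record means homozygous reference.
        return 2 if site["effect_allele"] == site["ref"] else 0
    ref, alts, gt = calls[key]
    alleles = [ref] + alts  # GT index 0 is REF, 1.. are the ALT alleles
    indices = gt.replace("|", "/").split("/")
    if "." in indices:
        return None  # genotype not called
    return sum(1 for i in indices if alleles[int(i)] == site["effect_allele"])


def vcf_prs(scoring, calls):
    """Sum weight * dosage over scoring rows, skipping uncalled genotypes."""
    total = 0.0
    for site in scoring:
        dosage = dosage_from_vcf(site, calls)
        if dosage is not None:
            total += site["weight"] * dosage
    return total


# Made-up scoring rows and a single VCF record.
scoring = [
    {"chrom": "1", "pos": 1000, "ref": "G", "effect_allele": "A", "weight": 0.2},
    {"chrom": "1", "pos": 2000, "ref": "C", "effect_allele": "C", "weight": 0.1},
    {"chrom": "2", "pos": 3000, "ref": "T", "effect_allele": "G", "weight": -0.3},
]
calls = {("1", 1000): ("G", ["A"], "0/1")}

print(round(vcf_prs(scoring, calls), 6))
```

The second row shows why skipping absent sites is wrong: the effect allele is the reference allele, so an individual with no record at that position contributes 2 * weight, not 0.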
