Emphasizing biological effect of candidate variants used in polygenic risk score
5 weeks ago
K Lee ▴ 20

Hello, I followed the procedure below for a polygenic risk score (PRS) analysis with whole-genome sequencing data, and I want to check whether there are any problems with it.

  1. To focus on variants likely to have a functional impact, I used only functional variants: exonic variants that alter the protein sequence, and intronic variants known to reduce or increase gene expression.

  2. I tried p-value thresholding to refine the PRS candidates, and a p-value cutoff of 5e-3 gave the best PRS performance.

  3. The variants used to calculate the PRS seem to be highly enriched in certain pathways, such as hypoxia, so I want to emphasize that the 'significant variants used in the PRS calculation' are related to hypoxia.
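For readers unfamiliar with p-value thresholding, here is a minimal sketch of step 2, assuming per-variant effect sizes and p-values from summary statistics and effect-allele dosages (0, 1, or 2) for one individual. The variant IDs, numbers, and function names are hypothetical, and missing genotypes are simply treated as dosage 0, which is a simplification.

```python
def select_variants(summary_stats, p_cutoff=5e-3):
    """Keep only variants whose association p-value is below the cutoff."""
    return {vid: s for vid, s in summary_stats.items() if s["p"] < p_cutoff}


def polygenic_score(dosages, selected):
    """Sum of (effect size x effect-allele dosage) over the selected variants.

    Variants missing from `dosages` contribute 0 (a simplifying assumption).
    """
    return sum(s["beta"] * dosages.get(vid, 0.0) for vid, s in selected.items())


# Hypothetical summary statistics: effect size (beta) and p-value per variant.
summary_stats = {
    "var1": {"beta": 0.5, "p": 1e-4},
    "var2": {"beta": -0.25, "p": 2e-3},
    "var3": {"beta": 0.75, "p": 0.2},   # fails the 5e-3 cutoff
}

# Hypothetical effect-allele dosages for one individual.
dosages = {"var1": 2, "var2": 1, "var3": 2}

selected = select_variants(summary_stats, p_cutoff=5e-3)
score = polygenic_score(dosages, selected)
print(sorted(selected), score)
```

In practice the cutoff would be tuned on one dataset and the resulting score evaluated on a separate one.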

My concerns are the following.

  1. I know there are many methods for emphasizing functional variants, such as prioritization, but I simply chose to exclude non-functional variants. Does that seem acceptable?

  2. Can I call the variants used for the PRS, which pass the 5e-3 p-value cutoff, 'significant variants' without any p-value correction? Or does it make sense to say that 'hypoxia is related to the case phenotype' based on the enrichment test result for the PRS candidates?
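To make the second concern concrete, the sketch below contrasts a nominal cutoff with two standard corrections, Bonferroni and Benjamini-Hochberg. The p-values are made up for illustration, and the function names are hypothetical rather than taken from any particular package.

```python
def bonferroni_cutoff(n_tests, alpha=0.05):
    """Per-test p-value cutoff that controls the family-wise error rate."""
    return alpha / n_tests


def benjamini_hochberg(pvalues, alpha=0.05):
    """Return one boolean per p-value: significant at FDR level alpha or not."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k such that p_(k) <= alpha * k / m.
    largest_rank = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= alpha * rank / m:
            largest_rank = rank
    significant = [False] * m
    for idx in order[:largest_rank]:
        significant[idx] = True
    return significant


# Made-up p-values for five tested variants.
pvals = [1e-6, 0.015, 0.04, 0.2, 0.5]

nominal = [p < 0.05 for p in pvals]
bonferroni = [p < bonferroni_cutoff(len(pvals)) for p in pvals]
fdr = benjamini_hochberg(pvals)
print(nominal, bonferroni, fdr)
```

With thousands or millions of tested variants, the corrected cutoffs become far stricter than a nominal 5e-3, which is why variants passing only the PRS threshold are usually described as PRS candidates rather than as significant associations.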

I would appreciate any comments or advice. Thank you in advance.

PRS GWAS polygenicriskscore • 483 views
5 weeks ago
LChart 4.6k

My first comment is with respect to your procedure. Is it true that you are only including genic variants (exonic + intronic) - or are you actually including intergenic variants that modulate gene expression (eQTLs)? The more structured way to do this is with colocalization analysis -- since you've got the data for PRS (GWAS summary data, at least) then you can combine these with eQTL study data (GTEx at a minimum) to perform colocalization directly. If your GWAS data is publicly available, then you can probably just look this up in ColocDB (https://ngdc.cncb.ac.cn/databasecommons/database/id/9324).

With regard to your specific questions:

(1) You can "exclude non-functional variants" from the PRS. If it significantly improves your best PRS correlation then it would seem to be working; you will just need to validate it with a held-out study.

(2) You haven't articulated precisely how you computed enrichment here. The appropriate way would probably be to use LD score regression, and partition regions into "functional-hypoxic" (a window around SNPs you determined to be functional and related to hypoxia genes) and "functional-nonhypoxic" (the same, but not hypoxia genes/eGenes), and retain the standard LDSC background.
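As a rough illustration of the partitioning step only, the sketch below labels each SNP by whether it falls within a window of a "functional-hypoxic" or "functional-nonhypoxic" anchor SNP. The window size, coordinates, and function name are hypothetical, and this does not reproduce LDSC's input file formats or its heritability estimation.

```python
def annotate_snps(snps, hypoxic_anchors, nonhypoxic_anchors, window=500):
    """Return (hypoxic_flag, nonhypoxic_flag) for each SNP.

    snps and anchors are lists of (chromosome, position) tuples.
    A flag is 1 when the SNP lies within `window` bp of any anchor
    on the same chromosome, else 0.
    """
    def near(snp, anchors):
        chrom, pos = snp
        return int(any(c == chrom and abs(pos - p) <= window for c, p in anchors))

    return [(near(s, hypoxic_anchors), near(s, nonhypoxic_anchors)) for s in snps]


# Hypothetical coordinates, for illustration only.
snps = [("1", 1000), ("1", 1600), ("1", 5000), ("2", 1000)]
hypoxic_anchors = [("1", 1200)]       # functional SNPs in hypoxia genes
nonhypoxic_anchors = [("1", 5100)]    # functional SNPs in other genes

annotations = annotate_snps(snps, hypoxic_anchors, nonhypoxic_anchors, window=500)
print(annotations)
```

The two resulting annotation columns would then be supplied alongside the standard background annotations, so that any hypoxia enrichment is estimated relative to other functional regions rather than to the genome as a whole.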


Thank you for your helpful comment. I actually calculated the enrichment score using the genes that contain those functional variants (the ones remaining after LD clumping).


Really, that's not detailed enough. You need to specify whether you ran GSEA or GSORA; for GSEA you need to specify the gene universe and score, and for GSORA you need to specify what the background gene set is.
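To show why the background matters for over-representation analysis, here is a minimal sketch using a one-sided hypergeometric test. All gene counts are invented, and the function name is hypothetical; the point is only that the same overlap can look enriched against a genome-wide background and unremarkable against a restricted one.

```python
from math import comb


def ora_pvalue(hits, study_size, pathway_size, background_size):
    """One-sided hypergeometric p-value: P(overlap >= hits).

    hits: study genes that are in the pathway
    study_size: number of genes in the study list
    pathway_size: pathway genes present in the background
    background_size: total number of genes in the background
    """
    upper = min(study_size, pathway_size)
    total = comb(background_size, study_size)
    tail = sum(
        comb(pathway_size, k) * comb(background_size - pathway_size, study_size - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Invented numbers: 50 study genes, 5 of them in a 200-gene pathway.
p_genome = ora_pvalue(5, 50, 200, 20000)     # background: all genes
p_restricted = ora_pvalue(5, 50, 200, 1000)  # background: only genes that could be selected
print(p_genome, p_restricted)
```

If the study list can only contain genes carrying annotated functional variants, the restricted background is the more appropriate comparison, and a pathway that is over-represented among such genes will appear enriched regardless of the phenotype.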

Where SNPs are involved, LDSC is pretty much the way to go. The older approach (MAGMA) can work in a pinch too; these tend to be much more conservative than enrichment statistics.


Thank you again for your comment, and sorry for bothering you due to my lack of knowledge. (I'm quite new to GWAS analysis and have no one around to ask for advice.) I just submitted the gene set of the PRS candidates to Enrichr for an enrichment test and saw that the hypoxia pathway is highly enriched. Therefore, I wanted to say that since most of the significant variants (or at least those that satisfy the p-value threshold) are involved in hypoxia, hypoxia may be associated with the phenotype. Any advice or comment would be appreciated. Thank you so much.
