Hi,
Hope someone may be able to point me in the right direction!
I've generated some polygenic risk scores for schizophrenia in a cohort of individuals with borderline intellectual disability. Trying to figure out a way to validate the scores I'm getting out of PRSice. Is there a way to do this?
My current results (with --all-score) look something like this: https://ibb.co/jv9YQP5 (sorry couldn't place the table here in an easily legible format)
There are some patterns in my results, with some individuals having identical scores at particular p-values (is this just due to the stats involved in generating the scores?), and the scores are pretty low (clustering around 0) with some changing from positive to negative depending on p-value threshold. Is this typical of polygenic risk scores?
I've followed Marees et al 2018 (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6001694/) and Sam Choi's tutorial (https://choishingwan.github.io/PRS-Tutorial/) to generate this, but I'm thinking I may have made errors somewhere along the line and would like to check my results somehow!
Any advice would be much appreciated!
Could you give more information about your sample and the command you used for PRSice? If you are only giving a few SNPs to PRSice, then you are very likely to get scores of zero or close to zero (because there isn't much information). As for the positive and negative results, that is normal for PRS analysis, as your beta / log(OR) values are not all positive. When you add those values up, you shouldn't necessarily expect to see all positive numbers.
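To make the point about mixed signs concrete, here is a minimal sketch of an additive score at several p-value thresholds. It assumes a plain sum of beta times effect-allele count, which is a simplification and not necessarily the exact scoring PRSice applies. The effect sizes, p-values, and genotype counts are made up for illustration.

```python
def score_at_threshold(snps, dosages, p_threshold):
    """Sum beta * effect-allele count over SNPs with p <= p_threshold.

    snps    : list of (beta, p_value) pairs (hypothetical values)
    dosages : effect-allele counts (0, 1 or 2) for one individual
    """
    return sum(
        beta * dosage
        for (beta, p_value), dosage in zip(snps, dosages)
        if p_value <= p_threshold
    )

# Hypothetical base data: (log(OR), p-value) for three SNPs
snps = [(0.5, 0.001), (-0.25, 0.04), (-0.5, 0.3)]
# One hypothetical individual carrying one effect allele at each SNP
dosages = [1, 1, 1]

scores = {t: score_at_threshold(snps, dosages, t) for t in (0.01, 0.05, 0.5)}
for threshold, value in scores.items():
    print(threshold, value)
```

With so few SNPs, each additional variant that enters at a looser threshold can move the total across zero, so the same individual can go from a positive to a negative score. Individuals with the same genotypes at the few included SNPs would also get identical scores.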
Hi Sam,
Thanks for your reply- great to get advice from you!
The cohort I'm using has about 80 cases and 150 controls. I removed the controls who were related to cases, and then after other QC, I ended up with 49 cases and 56 controls. Perhaps this is just too small? It's a wholly white (Northern European) sample. I've used the PGC SCZ2 GWAS as the base data (https://www.med.unc.edu/pgc/download-results/scz/)
I've used the following command for PRSice-
I think you are right in that perhaps I haven't given enough SNPs to PRSice. The log file indicates there are 265 variants after clumping. Is this typical?
Thanks again for your help,
Mo
Hello, I was hoping to learn more about how to fix negative betas. I think a negative beta means that the coded allele is reversed. For example, if you have some negative betas in your base GWAS data, and the reference allele is the opposite one in your target population, maybe you need to fix this before constructing the PRS? How can we do it? Thank you in advance.
A negative beta isn't a problem. You can view an effect allele with a negative beta as "protective", which is fine.
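As a rough illustration of why the sign of beta is only a matter of which allele is counted, the sketch below re-expresses every SNP in terms of the other allele (negate beta, count 2 minus the original allele count) and compares the scores. It assumes a simple additive sum and biallelic SNPs; the betas and genotypes are invented, and this is not meant to describe how PRSice matches alleles internally.

```python
def score(betas, dosages):
    """Additive score: sum of beta * effect-allele count."""
    return sum(b * d for b, d in zip(betas, dosages))

def flip(betas, dosages):
    """Re-code each SNP on the other allele: negate beta, count 2 - dosage."""
    return [-b for b in betas], [2 - d for d in dosages]

# Hypothetical effect sizes, one risk-increasing and one protective
betas = [0.5, -0.25]
# Hypothetical effect-allele counts for three individuals
people = {"p1": [2, 0], "p2": [1, 1], "p3": [0, 2]}

original = {name: score(betas, d) for name, d in people.items()}
flipped = {name: score(*flip(betas, d)) for name, d in people.items()}

# Difference between the two codings for each individual
offsets = {name: original[name] - flipped[name] for name in people}
print(original)
print(flipped)
print(offsets)
```

The two codings differ by the same constant for everyone (twice the sum of the betas), so the ordering of individuals is unchanged. What does matter is that the beta is paired with the correct allele in the target data.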
As Sam states - a negative value would suggest a protective score. I'd just been querying likely validity in scores which move from increasing risk to being protective at differing p-values (for the same individual). I think I just don't have enough data for it to be meaningful.
If you know that your GWAS and target datasets have the reference allele in different columns, I think you can specify the column headers with the --A1 and --A2 options.