Hi! I have a PGS file with the weights for each variant for a specific disease, CAD. The goal is now to calculate PRS for individuals in the UK Biobank genetic data. I have the PLINK bed, bim and fam files from the UK Biobank data. What would be the steps to prepare the "target data" from the UK Biobank data?
I understand that using the PGS file as the "base data" requires adding a "fake" p-value column. I have tried running PRSice with the bed, bim and fam files from UK Biobank as the target and a GWAS file as the base data, and it fails with "Error: All sample has invalid phenotypes!" and "Error: No sample left". What am I missing here?
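For reference, this is roughly what I mean by the "fake" p-value column: a minimal sketch, assuming the PGS file is tab-delimited with optional `#` comment lines above a single header row. The function name, the column name `P`, the constant value `1` and the demo rows are all placeholders I made up for illustration, not anything PRSice prescribes.

```python
import io


def add_placeholder_pvalue(lines, column="P", value="1"):
    """Yield each line with a constant p-value column appended.

    Lines starting with '#' (and blank lines) are passed through unchanged.
    The first remaining line is treated as the header row.
    """
    header_seen = False
    for raw in lines:
        line = raw.rstrip("\n")
        if not line or line.startswith("#"):
            yield line
            continue
        if not header_seen:
            header_seen = True
            yield f"{line}\t{column}"   # extend the header with the new column name
        else:
            yield f"{line}\t{value}"    # every variant gets the same placeholder


# Made-up scoring file contents, only to show the transformation.
demo = io.StringIO(
    "# hypothetical header comment\n"
    "rsID\teffect_allele\teffect_weight\n"
    "rs1\tA\t0.12\n"
    "rs2\tG\t-0.05\n"
)
result = list(add_placeholder_pvalue(demo))
for row in result:
    print(row)
```

With a constant p-value, every variant passes or fails a given threshold together, so I would expect to need a single threshold that includes the chosen value (for example `--bar-levels 1` with `--pvalue P`).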
> PRSice 2.3.5 (2021-04-06)
> https://github.com/choishingwan/PRSice
> (C) 2016-2020 Shing Wan (Sam) Choi and Paul F. O'Reilly
> GNU General Public License v3
> If you use PRSice in any published work, please cite:
> Choi SW, O'Reilly PF.
> PRSice-2: Polygenic Risk Score Software for Biobank-Scale Data.
> GigaScience 8, no. 7 (July 1, 2019)
> 2021-06-02 18:41:16
> ./bin/PRSice \
>     --a1 a1 \
>     --a2 a2 \
>     --bar-levels 0.001,0.05,0.1,0.2,0.3,0.4,0.5,1 \
>     --base CAD_UKBIOBANK.gz \
>     --beta \
>     --binary-target F \
>     --bp bp \
>     --chr chr \
>     --extract PRSice.valid \
>     --interval 5e-05 \
>     --lower 5e-08 \
>     --no-clump \
>     --num-auto 22 \
>     --out PRSice \
>     --pvalue pval \
>     --seed 1999345467 \
>     --snp oldID \
>     --stat beta \
>     --target chr# \
>     --thread 4 \
>     --upper 0.5
>
> Initializing Genotype file: chr# (bed)
>
> Start processing CAD_UKBIOBANK
> ==================================================
>
> SNP extraction/exclusion list contains 5 columns, will assume first column contains the SNP ID
>
> Base file: /shared/Jenish/CAD_UKBIOBANK.gz
> GZ file detected. Header of file is:
> uniqid chr bp a1 a2 beta se pval N af oldID info zval
>
> Reading 100.00%
> 7947837 variant(s) observed in base file, with:
> 1202749 variant(s) excluded based on user input
> 6745088 total variant(s) included from base file
>
> Loading Genotype info from target
> ==================================================
>
> 488377 people (223459 male(s), 264780 female(s)) observed
> 488377 founder(s) included
>
> 181798 variant(s) not found in previous data
> 602458 variant(s) included
>
> There are a total of 1 phenotype to process
>
> Processing the 1 th phenotype
>
> Error: All sample has invalid phenotypes!
> Error: No sample left
> Error: Execution halted
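
Since the run above has no separate phenotype file, my guess is that PRSice is reading the phenotype from the sixth column of the .fam file, so one thing worth checking is what that column actually contains. Below is a minimal sketch of such a check, assuming a standard whitespace-delimited PLINK .fam layout (FID, IID, father, mother, sex, phenotype). The function names, the category labels and the demo rows are made up for illustration; they are not real UK Biobank values.

```python
from collections import Counter


def classify_phenotype(value):
    """Put a .fam phenotype entry into a coarse category."""
    if value == "-9":
        return "-9"                 # common PLINK missing code
    if value.upper() == "NA":
        return "NA"
    try:
        float(value)
    except ValueError:
        return "non-numeric"        # e.g. a text label stored in the column
    return "numeric"


def summarize_fam_phenotypes(lines):
    """Count phenotype categories over the lines of a .fam file."""
    counts = Counter()
    for raw in lines:
        fields = raw.split()
        if len(fields) < 6:
            continue                # skip blank or malformed lines
        counts[classify_phenotype(fields[5])] += 1
    return counts


# Made-up rows, only to show the output shape.
demo_fam = [
    "1001 1001 0 0 1 -9",
    "1002 1002 0 0 2 some_label",
    "1003 1003 0 0 1 1.37",
]
counts = summarize_fam_phenotypes(demo_fam)
print(dict(counts))
```

On a real file this could be run as `summarize_fam_phenotypes(open(path))`, where `path` is the .fam file for one chromosome. If nearly every sample falls under "-9", "NA" or "non-numeric", that would be consistent with the "All sample has invalid phenotypes!" error, and the CAD phenotype would probably have to be supplied separately.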