I have been trying to develop a genetic risk score (GRS).
I merged all the per-chromosome files from the UK Biobank into a single .bed file, using the following filters:
--maf 0.01 \
--hwe 1e-6 \
--geno 0.1 \
PLINK v1.90p 64-bit (8 Nov 2021)
Options in effect:
  --bed chr_merged.bed
  --bim chr_merged.bim
  --clump park_updated.score
  --clump-field P
  --clump-kb 250
  --clump-p1 1
  --clump-r2 0.1
  --clump-snp-field SNP
  --fam chr1.fam
  --out chr.qc
  --threads 64

1031886 MB RAM detected; reserving 515943 MB for main workspace.
4113097 variants loaded from .bim file.
487409 people (223038 males, 264368 females, 3 ambiguous) loaded from .fam.
Ambiguous sex IDs written to chr.qc.nosex .
Using 1 thread (no multithreaded calculations invoked).
Before main variant filters, 487409 founders and 0 nonfounders present.
Calculating allele frequencies... done.
Total genotyping rate is 0.99122.
4113097 variants and 487409 people pass filters and QC.
Note: No phenotypes present.
I got the following message:
Warning: 'rs356203' is missing from the main dataset, and is a top variant.
Warning: 'rs356219' is missing from the main dataset, and is a top variant.
Warning: 'rs356215' is missing from the main dataset, and is a top variant.
2357669 more top variant IDs missing; see log file.
I posted on the PLINK forum and was told that my SNPs are not in sync, but I do not understand what I have done wrong here.
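
One way to see what "not in sync" could mean here is to compare the variant IDs in the SNP column of park_updated.score with the IDs in column 2 of chr_merged.bim. The following is only a minimal sketch, not a confirmed diagnosis. It assumes both files are whitespace-delimited and that the score file has a header line containing a column named SNP (as implied by --clump-snp-field SNP). The helper names and the toy rows are hypothetical; the positions, alleles, and the chr:pos style ID are invented placeholders used only to illustrate one possible kind of ID mismatch.

```python
import io


def bim_ids(handle):
    """Collect variant IDs (second column) from a PLINK .bim file handle."""
    return {line.split()[1] for line in handle if line.strip()}


def score_ids(handle, snp_field="SNP"):
    """Collect IDs from the named column of a whitespace-delimited file with a header."""
    header = handle.readline().split()
    col = header.index(snp_field)  # raises ValueError if the column is absent
    ids = set()
    for line in handle:
        fields = line.split()
        if len(fields) > col:
            ids.add(fields[col])
    return ids


def overlap_report(bim_handle, score_handle, snp_field="SNP"):
    """Count score-file IDs that are present in, or missing from, the .bim file."""
    in_bim = bim_ids(bim_handle)
    in_score = score_ids(score_handle, snp_field)
    missing = in_score - in_bim
    return {
        "shared": len(in_score & in_bim),
        "missing": len(missing),
        "examples": sorted(missing)[:5],
    }


# Toy data only: one .bim row uses a hypothetical chr:pos_ref_alt style ID,
# so the rsID in the score file cannot match it.
toy_bim = io.StringIO(
    "4\t4:1000_A_G\t0\t1000\tA\tG\n"
    "4\trs356219\t0\t2000\tG\tA\n"
)
toy_score = io.StringIO(
    "SNP P\n"
    "rs356203 0.01\n"
    "rs356219 0.02\n"
)

report = overlap_report(toy_bim, toy_score)
print(report)
```

On the real data, the two io.StringIO objects would be replaced by open("chr_merged.bim") and open("park_updated.score"). If almost every score-file ID shows up under "missing", looking at a few IDs from each file side by side should show whether the two files use different naming schemes.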