I have SNPs (rsIDs) from an imputation done in 2011 with MACH (http://csg.sph.umich.edu/abecasis/MACH/tour/; call it the 2011 data), and I re-imputed the same genotype files on the Michigan Imputation Server, Genotype Imputation (Minimac4) 1.2.4 (call it the 2020 data).
Using the same QC steps for both, I performed a GWAS using PLINK.
In the 2011 data I have ~2.5 million SNPs and in the 2020 data ~2.7 million SNPs. The issue is that only ~900,000 SNPs match between the two data sets. Can someone please explain why? Did the rs names change in the meantime? I put both genotype files on Build 37. Here I am presenting the number of SNPs per chromosome for the old (2011) data and the new (2020) data. I am also comparing snps_that_can_be_found_in_old_but_not_in_new and snps_that_can_be_found_in_new_but_not_in_old.
Can someone please explain what the issue might be and why only ~900,000 SNPs match?
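
For reference, here is a minimal Python sketch of the kind of comparison described above, extended with a match on chromosome and position so that I can check whether the mismatch comes from the names rather than from the variants themselves. It assumes whitespace-delimited result files with a header row containing `CHR`, `SNP` and `BP` columns; the function names (`parse_snps`, `compare_snp_sets`, `per_chromosome_counts`) are made up for illustration and are not part of PLINK or the imputation tools.

```python
from collections import Counter


def parse_snps(lines):
    """Parse whitespace-delimited rows into (chrom, snp_id, position) tuples.

    Assumes the first line is a header that contains CHR, SNP and BP columns.
    """
    rows = iter(lines)
    header = next(rows).split()
    i_chr, i_snp, i_bp = (header.index(col) for col in ("CHR", "SNP", "BP"))
    records = []
    for line in rows:
        fields = line.split()
        if not fields:  # skip blank lines
            continue
        records.append((fields[i_chr], fields[i_snp], int(fields[i_bp])))
    return records


def per_chromosome_counts(records):
    """Count SNPs per chromosome."""
    return Counter(chrom for chrom, _snp_id, _pos in records)


def compare_snp_sets(old, new):
    """Compare two SNP lists by identifier and by (chromosome, position)."""
    old_ids = {snp_id for _chrom, snp_id, _pos in old}
    new_ids = {snp_id for _chrom, snp_id, _pos in new}
    old_pos = {(chrom, pos) for chrom, _snp_id, pos in old}
    new_pos = {(chrom, pos) for chrom, _snp_id, pos in new}
    return {
        "shared_by_id": len(old_ids & new_ids),
        "shared_by_position": len(old_pos & new_pos),
        "snps_that_can_be_found_in_old_but_not_in_new": sorted(old_ids - new_ids),
        "snps_that_can_be_found_in_new_but_not_in_old": sorted(new_ids - old_ids),
    }
```

If `shared_by_position` turns out to be much larger than `shared_by_id`, the two data sets would mostly contain the same variants under different identifiers; if both numbers are similar, the variant sets themselves differ.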