I imputed genotype data with the Michigan Imputation Server (MIS) using the 1000 Genomes Phase 1 panel (MIS reported few errors). Both before and after imputation, I ran checkVCF.py [https://github.com/zhanxw/checkVCF] as a sanity check, to make sure the ref/alt alleles in my data were consistent with the 1000G Phase 1 data. The check flagged several inconsistent reference sites when compared against this fasta file from 1000G Phase 1. On closer inspection, I noticed that the reference alleles for several of the SNPs flagged as inconsistent in my VCF actually match the data in the UCSC Browser, which suggests that I was using the wrong fasta file as the reference for checkVCF.py. I saw that this person had a similar issue, but I could not find an answer on which fasta file I should use as the reference for checkVCF.py (or for other tools, such as "bcftools norm --check-ref").
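To make it concrete, the kind of reference check I mean can be sketched roughly as below. This is only a minimal pure-Python illustration of comparing the VCF REF column against a fasta; it is not what checkVCF.py or bcftools do internally. The function names (`load_fasta`, `find_ref_mismatches`) and the toy contig and records are hypothetical and made up for the example.

```python
import io


def load_fasta(handle):
    """Read a FASTA stream into {contig_name: uppercase_sequence}."""
    seqs, name, chunks = {}, None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                seqs[name] = "".join(chunks).upper()
            # Contig name is the first token of the header line.
            name = line[1:].split()[0]
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        seqs[name] = "".join(chunks).upper()
    return seqs


def find_ref_mismatches(vcf_handle, genome):
    """Return (chrom, pos, vcf_ref, fasta_ref) for records whose REF differs from the FASTA."""
    mismatches = []
    for line in vcf_handle:
        if line.startswith("#") or not line.strip():
            continue
        chrom, pos, _id, ref = line.rstrip("\n").split("\t")[:4]
        pos = int(pos)
        seq = genome.get(chrom)
        # A missing contig (e.g. "chr1" in the VCF vs "1" in the FASTA) is reported as None.
        fasta_ref = None if seq is None else seq[pos - 1:pos - 1 + len(ref)]
        if fasta_ref != ref.upper():
            mismatches.append((chrom, pos, ref, fasta_ref))
    return mismatches


# Toy data (made up): one 12-base contig and three VCF records.
toy_fasta = io.StringIO(">1 toy contig\nACGTACGT\nTTGA\n")
toy_vcf = io.StringIO(
    "##fileformat=VCFv4.1\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "1\t2\trs_a\tC\tT\t.\tPASS\t.\n"
    "1\t5\trs_b\tG\tA\t.\tPASS\t.\n"
    "1\t9\trs_c\tTT\tT\t.\tPASS\t.\n"
)

genome = load_fasta(toy_fasta)
mismatches = find_ref_mismatches(toy_vcf, genome)
print(mismatches)
```

In the toy data only the second record is reported, because the fasta has A at position 5 while the VCF claims G. A check like this is only meaningful if the fasta is the same genome build, with the same contig naming, as the VCF coordinates.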
I found this link, which says that 1000 Genomes doesn't provide fasta files containing variant information, so the file I used as the reference for checkVCF.py was not the right one in the first place. Does anyone know which fasta file I should use instead?