I am trying to convert the 1000 Genomes (1000G) genotypes into PLINK format so that I can run a PCA.
I used PLINK 1.9 to convert each per-chromosome vcf.gz file to a binary bed fileset. Now I am using --merge-list to merge the chromosomes into a single fileset. Should I be worried about the warnings that multiple positions were seen for some variants? If this is a problem, why was it not reported during the VCF-to-PLINK conversion? And how can an rsID have more than one position, unless the warning means the variant spans more than one base pair, as a structural variant would? I am also not sure what the "multiple chromosomes seen" warning means, unless it points to an error.
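To see which IDs are behind the warnings, I put together a minimal sketch that scans the per-chromosome .bim files and lists every variant ID that appears at more than one (chromosome, position) pair. The function names are my own, and skipping the "." placeholder ID is an assumption on my part, so this may not match PLINK's own bookkeeping exactly.

```python
import sys
from collections import defaultdict


def read_bim_lines(paths):
    """Yield every line from each .bim file in turn."""
    for path in paths:
        with open(path) as handle:
            yield from handle


def conflicting_ids(bim_lines):
    """Map each variant ID to its sorted (chrom, pos) pairs, keeping only
    IDs that were seen at more than one location."""
    locations = defaultdict(set)
    for line in bim_lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        # .bim columns: chromosome, variant ID, cM position, bp position, alleles
        chrom, variant_id, _cm, pos = fields[:4]
        if variant_id == ".":
            continue  # assumption: ignore the missing-ID placeholder
        locations[variant_id].add((chrom, int(pos)))
    return {vid: sorted(locs) for vid, locs in locations.items() if len(locs) > 1}


if __name__ == "__main__":
    # Hypothetical usage: python check_bim.py chr1.bim chr2.bim ...
    for vid, locs in sorted(conflicting_ids(read_bim_lines(sys.argv[1:])).items()):
        print(vid, locs)
```

An ID that is repeated at the same chromosome and position is not reported by this sketch, because identical locations collapse into one set entry.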
Also, I assume the next step is to merge my case population with the 1kG dataset and then prune the merged data by LD. After that, can I use PLINK to make an MDS plot, or should I use GCTA?
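For concreteness, here is a rough sketch of the PLINK 1.9 calls I have in mind for that workflow, built as argument lists in Python and only printed, not executed. The file prefixes, the pruning window and threshold (50 5 0.2), and the number of MDS dimensions (10) are placeholder assumptions, not recommended values.

```python
import shlex


def build_workflow_commands(ref_prefix, case_prefix, out_prefix):
    """Return PLINK 1.9 argument lists for merge, LD pruning, and MDS."""
    merged = f"{out_prefix}_merged"
    pruned = f"{out_prefix}_pruned"
    return [
        # 1. Merge the case fileset into the 1kG reference fileset.
        ["plink", "--bfile", ref_prefix,
         "--bmerge", f"{case_prefix}.bed", f"{case_prefix}.bim", f"{case_prefix}.fam",
         "--make-bed", "--out", merged],
        # 2. LD pruning; writes <pruned>.prune.in (window and r^2 cutoff are assumed).
        ["plink", "--bfile", merged,
         "--indep-pairwise", "50", "5", "0.2", "--out", pruned],
        # 3. MDS on the pruned variants only (10 dimensions is an assumption).
        ["plink", "--bfile", merged,
         "--extract", f"{pruned}.prune.in",
         "--cluster", "--mds-plot", "10", "--out", f"{out_prefix}_mds"],
    ]


if __name__ == "__main__":
    # Hypothetical prefixes, for illustration only.
    for cmd in build_workflow_commands("1kg_all", "my_cases", "study"):
        print(shlex.join(cmd))
```

Swapping the last step for PLINK's --pca, or for GCTA, would leave the merge and pruning steps unchanged.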
I guess the SNPs that trigger these warnings are multi-allelic sites in 1kG?