I'm analyzing more than 200K SNPs across hundreds of diverse plant accessions. The data come from genotyping-by-sequencing (GBS), so the raw SNP dataset is in HapMap format. The first 11 columns are metadata: column 1: rs# (SNP name); column 2: alleles; column 3: chrom; column 4: physical position in bp; column 5: strand; column 6: assembly; column 7: center; column 8: protLSID; column 9: assayLSID; column 10: panelLSID; column 11: QCcode.
The remaining columns hold the genotypes, one column per accession, coded with single-letter IUPAC nucleotide codes (A, C, G, T, R, S, Y, W, K, M).
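To make the layout concrete, here is a minimal Python sketch of how I understand a single row to break down. The function name, the example row, the tab delimiter, and the choice to treat any code outside the list (such as N) as a missing call are my own assumptions, not something taken from a specific tool.

```python
# Single-letter IUPAC codes and the two alleles each one stands for.
IUPAC_PAIRS = {
    "A": ("A", "A"), "C": ("C", "C"), "G": ("G", "G"), "T": ("T", "T"),
    "R": ("A", "G"), "Y": ("C", "T"), "S": ("C", "G"),
    "W": ("A", "T"), "K": ("G", "T"), "M": ("A", "C"),
}

N_META_COLUMNS = 11  # rs# through QCcode, as listed above


def parse_hapmap_row(line):
    """Split one HapMap row into (snp_id, chrom, pos, calls).

    Assumes tab-delimited fields. `calls` has one entry per accession:
    a pair of alleles, or None for a code not in IUPAC_PAIRS.
    """
    fields = line.rstrip("\n").split("\t")
    snp_id, chrom, pos = fields[0], fields[2], int(fields[3])
    # Assumption: anything outside the table (e.g. N) is a missing call.
    calls = [IUPAC_PAIRS.get(code.upper()) for code in fields[N_META_COLUMNS:]]
    return snp_id, chrom, pos, calls


# Hypothetical row: 11 metadata columns followed by four accessions.
row = "S1_1001\tA/G\t1\t1001\t+\tNA\tNA\tNA\tNA\tNA\tNA\tA\tR\tG\tN"
print(parse_hapmap_row(row))
```

So a heterozygous call such as R would be read as the allele pair A/G, while A would be read as homozygous A/A.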
I want to start with a population structure analysis and plan to use ADMIXTURE, which requires input in binary PLINK (.bed), ordinary PLINK (.ped), or EIGENSTRAT (.geno) format.
Does anyone know how to convert a GBS HapMap file into PLINK .bed or .ped format?