Question: Convert GBS hapmap file into Plink bed file
Dear all,

I'm analyzing more than 200K SNPs across hundreds of diverse plant accessions. The data come from genotyping-by-sequencing (GBS), so the raw SNP dataset is in hapmap format. The first 11 columns are: column 1: rs# (SNP name); column 2: alleles; column 3: chrom; column 4: physical position in bp; column 5: strand; column 6: assembly; column 7: center; column 8: protLSID; column 9: assayLSID; column 10: panelLSID; column 11: QCcode.

The remaining columns hold the genotypes, one column per accession, coded with IUPAC nucleotide codes (A, C, G, T, R, S, Y, W, K, M).
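For anyone unsure what those codes mean in PLINK terms, here is a minimal Python sketch of how one hapmap row could be decoded into the two-allele form that a .ped file uses. The function names and the sample row are made up for illustration, and it assumes a tab-delimited file, the 11 metadata columns listed above, and that any code outside the ten listed (for example N) should be treated as missing.

```python
# Each single-letter IUPAC code stands for a diploid genotype:
# A/C/G/T are homozygous calls, the other six are heterozygous calls.
IUPAC_TO_ALLELES = {
    "A": ("A", "A"), "C": ("C", "C"), "G": ("G", "G"), "T": ("T", "T"),
    "R": ("A", "G"), "Y": ("C", "T"), "S": ("C", "G"),
    "W": ("A", "T"), "K": ("G", "T"), "M": ("A", "C"),
}

# PLINK writes a missing genotype as "0 0" in a .ped file.
MISSING = ("0", "0")

# Assumption: genotype columns start after the 11 metadata columns.
N_META_COLUMNS = 11


def decode_genotype(code):
    """Return the allele pair for one IUPAC code (missing if unrecognized)."""
    return IUPAC_TO_ALLELES.get(code.strip().upper(), MISSING)


def decode_hapmap_row(line):
    """Split one tab-delimited hapmap data row into SNP info and allele pairs."""
    fields = line.rstrip("\n").split("\t")
    snp_id, chrom, pos = fields[0], fields[2], int(fields[3])
    genotypes = [decode_genotype(code) for code in fields[N_META_COLUMNS:]]
    return snp_id, chrom, pos, genotypes
```

This only illustrates the encoding. It is not a replacement for a real converter, which also has to handle the header line, sample names, and multi-allelic sites.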

I want to start with a population structure analysis and plan to use ADMIXTURE, which takes input in binary PLINK (.bed), ordinary PLINK (.ped), or EIGENSTRAT (.geno) format.

Does anyone know how to convert GBS hapmap format into PLINK bed/ped format?


OK, in case anyone has a similar question: I figured it out.

  1. Convert the GBS hapmap file to VCF. I did it in TASSEL: ./ -fork1 -h my_hapmap.hmp.txt -export -exportType VCF -runfork1

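  2. Convert the VCF file to PLINK format: vcftools --vcf input_data.vcf --plink --out out_in_plink

     This writes out_in_plink.ped and out_in_plink.map. If you need the binary .bed format instead, PLINK can convert them: plink --file out_in_plink --make-bed --out out_in_plink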
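
Before feeding the result to ADMIXTURE, it may be worth a quick sanity check that the .ped and .map files agree with each other. Below is a rough Python sketch, not part of either tool. The function names are made up for illustration, and it relies only on the standard PLINK text layout: one marker per line in the .map file, and 6 leading columns plus 2 allele columns per marker in each .ped line.

```python
def count_map_markers(map_path):
    """Count markers in a PLINK .map file (one non-empty line per marker)."""
    with open(map_path) as handle:
        return sum(1 for line in handle if line.strip())


def check_ped_against_map(ped_path, map_path):
    """Check every .ped line has 6 + 2 * n_markers fields.

    Returns (n_samples, n_markers); raises ValueError on a mismatch.
    """
    n_markers = count_map_markers(map_path)
    expected_fields = 6 + 2 * n_markers
    n_samples = 0
    with open(ped_path) as handle:
        for line_no, line in enumerate(handle, start=1):
            if not line.strip():
                continue  # skip blank lines
            n_fields = len(line.split())
            if n_fields != expected_fields:
                raise ValueError(
                    f"{ped_path} line {line_no}: expected "
                    f"{expected_fields} fields, found {n_fields}"
                )
            n_samples += 1
    return n_samples, n_markers
```

With the file names from step 2, the call would be check_ped_against_map("out_in_plink.ped", "out_in_plink.map"), and the two numbers returned should match the accession and SNP counts you expect.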

