6.5 years ago

My goal is to phase genotypes with SHAPEIT2 (I don't have a haplotype reference panel). I have .ped and .map files for some populations, with genotypes in A/C/G/T format. AFAIK the .ped file contains only genotypes, with no reference allele information. I ran:

shapeit.v2.r644.linux.x86_64 \
    -B population_A.unphased \
    -O population_A.PHASED \
    -T 8

SHAPEIT2 generated 2 files:

  • .haps with inferred haplotypes in 0/1 format (each line is a SNP)
  • .sample (each line is an individual)
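
If the .haps layout is what I understand it to be (five leading columns per SNP: chromosome or ID, SNP ID, position, the allele coded as 0, the allele coded as 1, followed by the 0/1 haplotypes), the coding could be read off the file directly. Below is a minimal sketch under that assumption. The function names and the two example rows are made up for illustration and are not part of SHAPEIT2.

```python
import io

# Hypothetical two-SNP excerpt in the layout assumed above:
# chrom/ID, SNP ID, position, allele coded 0, allele coded 1, then haplotypes.
EXAMPLE_HAPS = """\
1 rs0001 10583 G A 0 1 0 0
1 rs0002 10611 C G 1 1 0 1
"""


def read_allele_coding(handle):
    """Map each SNP ID to (allele coded as 0, allele coded as 1)."""
    coding = {}
    for line in handle:
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or truncated lines
        snp_id, allele0, allele1 = fields[1], fields[3], fields[4]
        coding[snp_id] = (allele0, allele1)
    return coding


def coding_mismatches(coding_a, coding_b):
    """Return the shared SNP IDs whose 0/1 allele coding differs."""
    shared = coding_a.keys() & coding_b.keys()
    return sorted(snp for snp in shared if coding_a[snp] != coding_b[snp])


coding = read_allele_coding(io.StringIO(EXAMPLE_HAPS))
for snp_id, (allele0, allele1) in coding.items():
    print(snp_id, "0 =", allele0, "1 =", allele1)
```

For a real run, the same function could be given an open file instead of the example string, e.g. `open("population_A.PHASED.haps")`, assuming the output prefix from the command above. Running it on the .haps files of two populations and passing both results to `coding_mismatches` would list the SNPs where the 0/1 coding is swapped or otherwise different.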

Now I have to read a .ped from another population (from the same study), and I want to keep the allele coding consistent between the two to maintain inference accuracy.

Is there a way to configure allele encoding in SHAPEIT2?
How do I find which allele was chosen as the reference allele in the SHAPEIT2 output?

