How to generate haplotypes from WGS SNPs?
Hi,

I have a .vcf file with 150 000 SNPs on the X chromosome and genotypes for 5 samples from WGS.

I tried using plink. I exported the data in plink format:

Then I manually modified the .ped file so that it takes my pedigree into account: 3 males, 2 females; 2 cases, 3 controls. Then I ran plink's r2 function.

plink --bfile plink_X --ld-xchr 1 --r2 --ld-window-kb 1 --ld-window 1000 --ld-window-r2 0 --out plink_X

The results are r2 = 1 everywhere, regardless of the window size.

I also tried the blocks function.

plink --bfile plink_X --ld-xchr 1 --blocks --ld-window-kb 1 --ld-window 1000 --ld-window-r2 0 --out plink_X

But no haplotype blocks were found.

What I want to know is which regions of the X chromosome are shared by the 2 case samples. Does anyone know what I am doing wrong? Or what else I can try?

You can't compute meaningful LD stats with only 5 samples. (You also shouldn't be using vcftools to "export to plink"; plink --vcf is far better at that operation.)
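To illustrate the sample-size point, here is a toy Python simulation. It is only a sketch and not plink's own computation: the helper names, the allele frequency of 0.3 and the sample sizes are assumptions chosen for illustration, and every sample is treated as diploid (so male X haploidy is ignored). It draws pairs of independent variants, for which the true r2 is 0, and compares the average estimated r2 with 5 versus 500 samples.

```python
import random


def r2(x, y):
    """Squared Pearson correlation of two genotype-dosage vectors.

    Returns None when either variant is monomorphic in the sample,
    because the correlation is undefined in that case.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy * sxy / (sxx * syy)


def dosage(rng, freq):
    """Alt-allele count (0, 1 or 2) for one diploid sample."""
    return int(rng.random() < freq) + int(rng.random() < freq)


def mean_r2(n_samples, n_pairs=500, freq=0.3, seed=1):
    """Average estimated r2 over pairs of independently drawn variants."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_pairs):
        x = [dosage(rng, freq) for _ in range(n_samples)]
        y = [dosage(rng, freq) for _ in range(n_samples)]
        value = r2(x, y)
        if value is not None:  # skip pairs with a monomorphic variant
            values.append(value)
    return sum(values) / len(values)


if __name__ == "__main__":
    print("mean r2 with   5 samples:", round(mean_r2(5), 3))
    print("mean r2 with 500 samples:", round(mean_r2(500), 3))
```

Because the variants are generated independently, a useful estimate should sit close to 0. With five samples the average lands far from that, so r2 values computed on such a dataset say very little about real LD.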

A standard approach for estimating human haplotypes is to use a public reference dataset, such as 1000 Genomes, as an additional input to a phasing program like Eagle or SHAPEIT4.
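As a rough sketch of what reference-based phasing could look like with Eagle, the Python helper below only builds and prints a command line; it does not run anything. The function name and all file names are hypothetical placeholders, and the flag names are the ones I recall from the Eagle v2 documentation, so check them against `eagle --help` for your version. X-chromosome specifics, such as how male haploid genotypes must be encoded and how the chromosome is named in your files, are not handled here.

```python
import shlex


def eagle_command(target_vcf, ref_panel, genetic_map, out_prefix,
                  chrom="X", threads=4):
    """Build (but do not run) an Eagle reference-based phasing command.

    All arguments are user-supplied paths or values; nothing is validated.
    """
    return [
        "eagle",
        "--vcfTarget", target_vcf,        # your 5-sample VCF (bgzipped, indexed)
        "--vcfRef", ref_panel,            # reference panel, e.g. 1000 Genomes
        "--geneticMapFile", genetic_map,  # genetic map matching the genome build
        "--chrom", chrom,                 # must match the naming in the inputs
        "--outPrefix", out_prefix,
        "--numThreads", str(threads),
    ]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    cmd = eagle_command(
        target_vcf="my_samples.chrX.vcf.gz",
        ref_panel="reference_panel.chrX.bcf",
        genetic_map="genetic_map_with_X.txt.gz",
        out_prefix="my_samples.chrX.phased",
    )
    print(" ".join(shlex.quote(arg) for arg in cmd))
    # To actually run it: subprocess.run(cmd, check=True)
```

Once the samples are phased, the haplotypes of the two cases can be compared directly, which is closer to the original goal than LD statistics computed on five samples.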

Thank you for your help! I'll try this!