Hello everyone! I'm a beginner in bioinformatics, and I would like to know which program(s) I should use to build a haplotype family from VCF files, and whether there is a good program to visualize the results. I really appreciate your help! Carlos
First, convert your VCF data into PLINK format (look up the PLINK documentation). If you have a single VCF containing all samples, then I recommend splitting it by the groups in which you want to identify haploblocks, and creating a separate dataset for each group.
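As a rough sketch of that conversion step, assuming PLINK 1.9 (which can read VCF directly via --vcf): the file names all_samples.vcf.gz and groups.txt, and the helper functions make_keep_files and convert_group, are hypothetical and only for illustration. groups.txt is assumed to be a two-column table of sample ID and group name.

```shell
# Sketch only: split one multi-sample VCF into per-group PLINK datasets.
# Hypothetical inputs: all_samples.vcf.gz and groups.txt ("sampleID group").

# Write one keep file per group (keep_<group>.txt) in PLINK's "FID IID" format.
# The sample ID is used for both columns, to match --double-id below.
make_keep_files() {
    awk 'NF >= 2 { print $1, $1 > ("keep_" $2 ".txt") }' "$1"
}

# Convert the VCF to a binary PLINK fileset, keeping only one group's samples.
convert_group() {
    plink --vcf "$1" --double-id --keep "keep_$2.txt" \
          --make-bed --out "MyPlinkDataset_$2"
}

# Example run; skipped if the inputs or plink are not available.
if [ -f groups.txt ] && [ -f all_samples.vcf.gz ] && command -v plink >/dev/null; then
    make_keep_files groups.txt
    for group in $(awk 'NF >= 2 { print $2 }' groups.txt | sort -u); do
        convert_group all_samples.vcf.gz "$group"
    done
fi
```

Each resulting MyPlinkDataset_<group> fileset can then be used wherever MyPlinkDataset appears in the commands that follow.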
Within PLINK, you can do things like identifying 'tag' SNPs and calculating linkage disequilibrium, amongst other things. However, I recommend exporting your data from PLINK for input to Haploview with the following command:
plink --noweb --bfile MyPlinkDataset --chr 5 --from-bp 1000 --to-bp 100000 --snps-only no-DI --recodeHV --out MyPlinkDataset.Haploview ;
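If you want a quick look at linkage disequilibrium in PLINK itself before moving to Haploview, something like the following may help. It is a sketch assuming PLINK 1.9 and the same region as the export command; the helper names compute_ld and high_ld_pairs and the 0.8 cut-off are hypothetical choices, not part of PLINK.

```shell
# Sketch only: pairwise r^2 for the same region as the export command.
# --ld-window-r2 0 reports pairs regardless of r^2 instead of only those
# with r^2 >= 0.2 (PLINK's default window limits still apply).
compute_ld() {
    plink --bfile "$1" --chr 5 --from-bp 1000 --to-bp 100000 \
          --r2 --ld-window-r2 0 --out "$1.region"
}

# List SNP pairs whose r^2 (7th column of the .ld report) meets a threshold.
# Prints: SNP_A SNP_B R2
high_ld_pairs() {
    awk -v min="$2" 'NR > 1 && $7 + 0 >= min + 0 { print $3, $6, $7 }' "$1"
}

# Example run; skipped if the dataset or plink is not available.
if [ -f MyPlinkDataset.bed ] && command -v plink >/dev/null; then
    compute_ld MyPlinkDataset
    high_ld_pairs MyPlinkDataset.region.ld 0.8
fi
```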
Within Haploview, you can run the popular program 'Tagger' (which identifies tag SNPs), and also identify haploblocks and their individual haplotypes (the sequences that make up the haploblocks).