Question: Report haplotype frequencies from a region within a phased VCF file
3.5 years ago by
United States
Krisr460 wrote:


I am interested in obtaining haplotypes (and their frequencies) from a region of the human genome for a particular population in the Phase 3 1000 Genomes data. I have downloaded the corresponding chromosome and used tabix and VCFtools to extract the genotype data for the region of interest from a subset of subjects corresponding to CEU. Does anyone know of a workflow that uses data of this sort to infer haplotypes and their frequencies across the region specified in the subsetted VCF?


Does this prior post help you: Haplotype frequencies from 1000 genomes

Thanks for the reply, it is a solution I may try. I was wondering if there was a workflow using pre-existing tools already coded and available.

Not that I see from a quick search of the net, but you are right to post it on a blog like this because people have almost certainly done it.

I would:
1) download the 1000 Genomes (1kg) phased VCF file
2) trim it to just the SNPs I wanted
3) convert it into an input format that Haploview accepts
4) offload the haplotype frequency calculation to Haploview (it is easy to code yourself, but this way you also get everything else in the GUI).
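If you only need the frequencies and want to skip the Haploview step, the counting itself can be sketched in a few lines of Python. This is a minimal sketch, not an existing tool: the function name `haplotype_frequencies` and the file name in the usage comment are made up for illustration. It assumes the VCF has already been subset to the region and samples of interest, and that every genotype is diploid and phased (`0|1` style), as in the autosomal Phase 3 files.

```python
from collections import Counter


def haplotype_frequencies(vcf_lines):
    """Return {haplotype: (count, frequency)} from phased, diploid VCF lines.

    Hypothetical helper: every site in the input is treated as part of one
    haplotype, so subset the VCF to the region of interest first.
    """
    haps = None  # one allele list per chromosome copy, two per sample
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines, meta lines and the column header
        fields = line.split("\t")
        alleles = [fields[3]] + fields[4].split(",")  # REF, then each ALT
        gt_index = fields[8].split(":").index("GT")
        genotypes = [sample.split(":")[gt_index] for sample in fields[9:]]
        if haps is None:
            haps = [[] for _ in range(2 * len(genotypes))]
        for i, gt in enumerate(genotypes):
            calls = gt.split("|")
            if len(calls) != 2:
                raise ValueError(
                    f"expected a phased diploid genotype, got {gt!r} "
                    f"at {fields[0]}:{fields[1]}"
                )
            for copy, call in enumerate(calls):
                # Keep a missing call as "." rather than guessing an allele.
                allele = "." if call == "." else alleles[int(call)]
                haps[2 * i + copy].append(allele)
    if not haps:
        return {}
    counts = Counter("-".join(h) for h in haps)
    total = sum(counts.values())
    return {hap: (n, n / total) for hap, n in counts.most_common()}


# Usage sketch (the file name is hypothetical):
# import gzip
# with gzip.open("CEU.region.vcf.gz", "rt") as handle:
#     for hap, (n, freq) in haplotype_frequencies(handle).items():
#         print(hap, n, round(freq, 4))
```

Keep in mind that this treats the whole input as a single haplotype block. With more than a handful of SNPs, nearly every chromosome ends up with its own unique haplotype, so trim to the SNPs you care about (step 2) before counting.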

