Question: Haplotype Frequency Calculator For VCF Files
diviya.smith (United States) wrote, 6.0 years ago:

Aim: download public data for a given region and calculate the haplotype frequencies of the SNPs in that region for each ethnic population.

I want to compute haplotype frequencies for several markers in a region, separately for each 1000 Genomes ethnic population. Is there a tool, such as vcftools, that can do this? Specifically, for a set of regions, I want to extract the genotypes for all markers in each region and compute the haplotype frequencies.

Right now I am doing this manually, extracting each region from the VCF file with tabix:

    tabix -h ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/release/20100804/ALL.2of4intersection.20100804.genotypes.vcf.gz 17:1471000-1472000 | vcf-subset -c CEU.list >CEU_region1.vcf

I then compute the frequency of every haplotype for each pair of SNPs in the region. The regions I am considering are small, typically containing only 2-3 SNPs, and most of the 1000 Genomes data is phased, so this is not computationally expensive, just a little cumbersome.
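For illustration, here is a minimal sketch of how the counting step could be scripted rather than done by hand. It is not an existing tool: the function names (`read_phased_haplotypes`, `haplotype_frequencies`, `pairwise_frequencies`) and the two-sample demo data are made up for this example. It assumes diploid, phased genotypes written as `0|1`, and it simply skips any site where a selected sample is unphased or missing.

```python
from collections import Counter
from itertools import combinations


def read_phased_haplotypes(vcf_lines, samples=None):
    """Return (positions, haplotypes) from the text lines of a phased VCF.

    Assumes diploid genotypes. Each sample contributes two haplotypes,
    each a tuple with one allele per retained site. Sites where any
    selected sample is unphased or missing are skipped.
    """
    positions = []
    columns = None   # column indices of the selected samples
    per_site = []    # per site: one (allele_a, allele_b) pair per sample
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            names = fields[9:]
            columns = [9 + i for i, name in enumerate(names)
                       if samples is None or name in samples]
            continue
        if columns is None:
            raise ValueError("no #CHROM header line before the first record")
        alleles = [fields[3]] + fields[4].split(",")   # REF followed by ALT(s)
        calls = []
        for col in columns:
            gt = fields[col].split(":")[0]   # GT comes first when present
            if "|" not in gt or "." in gt:
                calls = None                 # unphased or missing call
                break
            a, b = gt.split("|")
            calls.append((alleles[int(a)], alleles[int(b)]))
        if calls is None:
            continue
        positions.append(int(fields[1]))
        per_site.append(calls)

    haplotypes = []
    for s in range(len(columns) if columns else 0):
        haplotypes.append(tuple(site[s][0] for site in per_site))
        haplotypes.append(tuple(site[s][1] for site in per_site))
    return positions, haplotypes


def haplotype_frequencies(haplotypes, site_indices=None):
    """Frequency of each distinct haplotype, optionally over a subset of sites."""
    if site_indices is not None:
        haplotypes = [tuple(h[i] for i in site_indices) for h in haplotypes]
    counts = Counter(haplotypes)
    total = sum(counts.values())
    return {hap: n / total for hap, n in counts.items()}


def pairwise_frequencies(positions, haplotypes):
    """Two-SNP haplotype frequencies for every pair of sites in the region."""
    return {(positions[i], positions[j]): haplotype_frequencies(haplotypes, (i, j))
            for i, j in combinations(range(len(positions)), 2)}


# Made-up two-sample, two-SNP example (not real 1000 Genomes data).
DEMO_VCF = """##fileformat=VCFv4.0
#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2
17\t1471100\t.\tA\tG\t.\tPASS\t.\tGT\t0|1\t1|1
17\t1471500\t.\tC\tT\t.\tPASS\t.\tGT\t0|1\t0|1
"""

positions, haps = read_phased_haplotypes(DEMO_VCF.splitlines())
for pair, freqs in pairwise_frequencies(positions, haps).items():
    print(pair, freqs)
```

To run this on real data, the lines of a per-population file such as `CEU_region1.vcf` from the command above could be passed in place of `DEMO_VCF.splitlines()`. Alternatively, the `samples` argument accepts a set of sample IDs (for example, read from `CEU.list`), so one region extract can be reused for several populations without calling `vcf-subset` each time.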

Can anyone suggest a better solution to this problem?

Tags: vcf, haplotype
Anne (United States) wrote, 5.0 years ago:

Hi Diviya,

I have the exact same question, so if you managed to find a good solution, it would be great if you could let me know!

Best,

Anne

 

 
