Haplotype Frequency Calculator For VCF Files
10.3 years ago
diviya.smith ▴ 60

Aim: Download public data for a set of regions and calculate the haplotype frequencies of the SNPs in each region for each ethnic population.

I want to compute haplotype frequencies for several markers in a region for each 1000 Genomes ethnic population. Is there a tool, such as vcftools, that can be used for this purpose? Specifically, for a set of regions, I want to extract the genotypes of all markers in each region and compute the haplotype frequencies.

Right now, I am doing this manually by extracting each region from the VCF file with tabix and keeping only the samples of one population with vcf-subset:

    tabix -h ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/release/20100804/ALL.2of4intersection.20100804.genotypes.vcf.gz 17:1471000-1472000 | vcf-subset -c CEU.list >CEU_region1.vcf

I then compute the frequency of every haplotype for each pair of SNPs in the region. The regions I am considering are small, typically containing only 2-3 SNPs, and most of the 1000 Genomes data is phased, so this is not computationally expensive, just a little cumbersome.
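For reference, the counting step could be scripted along these lines. This is only a minimal Python sketch, assuming the extracted VCF is diploid and fully phased (genotypes written like `0|1`). The function names `read_phased_haplotypes` and `haplotype_frequencies` and the three-sample demo records are made up for illustration; they are not part of vcftools or tabix.

```python
from collections import Counter
from itertools import combinations


def read_phased_haplotypes(vcf_lines):
    """Parse phased VCF text into (site_ids, haplotypes).

    haplotypes holds one tuple of alleles per chromosome copy (two per sample),
    ordered by site. A missing allele (".") is stored as None.
    """
    site_ids = []
    columns = []  # columns[i] = alleles at site i, two entries per sample
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.split("\t")
        chrom, pos, ref, alt = fields[0], fields[1], fields[3], fields[4]
        alleles = [ref] + alt.split(",")
        gt_index = fields[8].split(":").index("GT")
        site_alleles = []
        for sample in fields[9:]:
            gt = sample.split(":")[gt_index]
            if "|" not in gt:
                # Assumption of this sketch: every genotype is phased.
                raise ValueError(f"unphased genotype {gt!r} at {chrom}:{pos}")
            for code in gt.split("|"):
                site_alleles.append(None if code == "." else alleles[int(code)])
        site_ids.append(f"{chrom}:{pos}")
        columns.append(site_alleles)
    # Transpose: from per-site allele lists to per-chromosome haplotypes.
    return site_ids, list(zip(*columns))


def haplotype_frequencies(haplotypes, site_indices):
    """Return {allele tuple: frequency} over the chosen sites."""
    counts = Counter()
    for hap in haplotypes:
        picked = tuple(hap[i] for i in site_indices)
        if None in picked:
            continue  # drop copies with a missing allele at a chosen site
        counts[picked] += 1
    total = sum(counts.values())
    return {hap: n / total for hap, n in counts.items()} if total else {}


# Hypothetical demo data: two SNPs, three samples (not real 1000 Genomes records).
demo_vcf = [
    "##fileformat=VCFv4.0",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\tS3",
    "17\t1471100\t.\tA\tG\t.\tPASS\t.\tGT\t0|1\t0|0\t1|1",
    "17\t1471500\t.\tC\tT\t.\tPASS\t.\tGT\t0|1\t0|1\t1|1",
]

site_ids, haplotypes = read_phased_haplotypes(demo_vcf)

# Frequencies for every pair of SNPs in the region.
pair_freqs = {}
for i, j in combinations(range(len(site_ids)), 2):
    pair_freqs[(site_ids[i], site_ids[j])] = haplotype_frequencies(haplotypes, (i, j))

for pair, freqs in pair_freqs.items():
    print(pair, freqs)
```

To use something like this on real data, the demo list could be replaced by the lines of a per-population file such as `CEU_region1.vcf`, for example `read_phased_haplotypes(open("CEU_region1.vcf"))`, and the whole thing repeated once per region and population.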

Can anyone suggest a better solution to this problem?


Hi Diviya,

I have exactly the same question. If you managed to find a good solution, it would be great if you could let me know!

Best,
Anne
