11.0 years ago
jvijai
Aim: Download public data for a genomic region, then calculate the haplotype frequencies in that region, both overall and for each ethnic population.
I want to download the region around BRCA1 from the 1000 Genomes data.
tabix -fh http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20110521/ALL.chr17.phase1_release_v3.20101123.snps_indels_svs.genotypes.vcf.gz 17:41,196,312-41,277,340 >BRCA1_1000g_20101123.vcf
So I have my BRCA1 genotype data and I want to check the frequency just as a QC measure.
vcftools --vcf BRCA1_1000g_20101123.vcf \
--freq \
--out BRCA1Copy_1000g_20101123.vcf.freq
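For a quick cross-check of the per-site allele frequencies, the same numbers can be recomputed directly from the genotype columns. The following is only a minimal sketch and not how vcftools itself is implemented; the function name `site_allele_frequencies` is hypothetical, and it assumes an uncompressed, tab-delimited VCF with a `GT` entry in the FORMAT column.

```python
from collections import Counter


def site_allele_frequencies(vcf_lines):
    """Yield (chrom, pos, {allele_index: frequency}) for each variant line.

    Allele indices are kept as strings as they appear in GT
    ("0" = REF, "1" = first ALT, ...). Missing alleles (".") are skipped.
    """
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip meta lines and the #CHROM header
        fields = line.split("\t")
        gt_idx = fields[8].split(":").index("GT")
        counts = Counter()
        for cell in fields[9:]:
            gt = cell.split(":")[gt_idx]
            # Treat phased (|) and unphased (/) genotypes the same way here.
            for allele in gt.replace("|", "/").split("/"):
                if allele != ".":
                    counts[allele] += 1
        total = sum(counts.values())
        freqs = {a: n / total for a, n in counts.items()} if total else {}
        yield fields[0], int(fields[1]), freqs
```

It could be run as `site_allele_frequencies(open("BRCA1_1000g_20101123.vcf"))` and the result compared by eye against the frequency table that vcftools writes.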
Next, I want to find the haplotype blocks in this region (both the common ones and all of them) and the frequency of each haplotype.
What filters should be applied to the allele frequencies?
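To make the haplotype-frequency part of the aim concrete, here is a rough sketch of the kind of calculation I have in mind. It assumes the genotypes are phased (`0|1` style) and diploid, that the sites are biallelic so each allele is a single character, and that a dict `sample_to_pop` mapping sample IDs to population labels is available (that dict and the function name `haplotype_frequencies` are hypothetical; the mapping would have to be built from the project's sample panel). It treats the whole extracted region as one haplotype and does not detect haplotype blocks.

```python
from collections import Counter, defaultdict


def haplotype_frequencies(vcf_lines, sample_to_pop=None):
    """Return {group: {haplotype: frequency}} from phased VCF lines.

    The group "ALL" covers every sample; one extra group is added per
    population label found in sample_to_pop (an assumed sample -> label dict).
    A haplotype is the string of allele indices along one chromosome copy.
    """
    samples = []
    haps = {}  # sample -> (alleles on copy 1, alleles on copy 2)
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]
            haps = {s: ([], []) for s in samples}
            continue
        gt_idx = fields[8].split(":").index("GT")
        for sample, cell in zip(samples, fields[9:]):
            gt = cell.split(":")[gt_idx]
            if "|" not in gt:
                raise ValueError(f"unphased genotype {gt!r} for {sample}")
            first, second = gt.split("|")
            haps[sample][0].append(first)
            haps[sample][1].append(second)

    counts = defaultdict(Counter)
    for sample in samples:
        for copy in haps[sample]:
            hap = "".join(copy)
            counts["ALL"][hap] += 1
            if sample_to_pop and sample in sample_to_pop:
                counts[sample_to_pop[sample]][hap] += 1

    return {
        group: {hap: n / sum(c.values()) for hap, n in c.items()}
        for group, c in counts.items()
    }
```

With many sites in the region, almost every chromosome will carry a unique string, so in practice the sites would first need to be restricted (for example to common variants or to one block) before the counts become informative.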
Any help is very much appreciated.
Can you give an example of the output you would expect to see? Do you want the frequency of all possible haplotypes, or only of the haplotypes made up of ancestral alleles?