Hi all, I would like to query the 1000genomes database to calculate the frequency of mutations in some genes that are NOT listed in dbsnp. How can I do this?
Thanks
You can use the tabix tool (http://www.1000genomes.org/category/tabix ) to download the SNP data from 1000 Genomes. Then you can use vcftools (http://vcftools.sourceforge.net/options.html ) to get the allele frequencies.
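For illustration, here is a minimal Python sketch of what the frequency step amounts to once you have a region of the VCF as text. It rests on two assumptions that are not from the posts above: that a variant with no dbSNP entry has "." in the ID column of the VCF, and that counting every non-reference allele together is acceptable. The function name `novel_variant_frequencies` is made up for this example; for real work vcftools is the better choice.

```python
def novel_variant_frequencies(vcf_lines):
    """Yield (chrom, pos, ref, alt, alt_freq) for records whose ID column is '.'.

    Assumption: a '.' in the ID column means the variant has no dbSNP rs number.
    All non-reference alleles are counted together, so multi-allelic sites
    get a single combined frequency.
    """
    for line in vcf_lines:
        if line.startswith("#"):
            continue  # skip meta-information and header lines
        fields = line.rstrip("\n").split("\t")
        chrom, pos, var_id, ref, alt = fields[:5]
        if var_id != ".":
            continue  # record already has an identifier, so skip it
        gt_index = fields[8].split(":").index("GT")
        alt_count = 0
        total = 0
        for sample in fields[9:]:
            genotype = sample.split(":")[gt_index]
            for allele in genotype.replace("|", "/").split("/"):
                if allele == ".":
                    continue  # missing call, not counted
                total += 1
                if allele != "0":
                    alt_count += 1
        freq = alt_count / total if total else None
        yield chrom, int(pos), ref, alt, freq


# Tiny made-up example: two samples, one known and one unlisted variant.
example = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2",
    "1\t100\trs123\tA\tG\t.\tPASS\t.\tGT\t0|1\t1|1",
    "1\t200\t.\tC\tT\t.\tPASS\t.\tGT\t0|1\t0|0",
]
results = list(novel_variant_frequencies(example))
print(results)  # [('1', 200, 'C', 'T', 0.25)]
```

The lines passed in could come from the output of a tabix region query saved to a file.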
The 1000 Genomes Project release VCF files contain both global and super-population allele frequencies for all variants, in the INFO column.
The most current release is found here:
ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/phase1/analysis_results/integrated_call_sets/
As Giovanni mentioned, you can use tabix to get specific subsections of these files, and vcftools if you want to see the frequency for a subpopulation.
There is more info about that in our FAQ:
http://www.1000genomes.org/faq/can-i-get-genotypes-specific-individualpopulation-your-vcf-files
http://www.1000genomes.org/faq/how-can-i-get-allele-frequency-my-variant
http://www.1000genomes.org/faq/how-do-i-get-sub-section-vcf-file
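If the precomputed frequencies in the INFO column are enough, they can be read directly without recomputing from genotypes. The sketch below assumes the keys are named `AF` for the global frequency and `<POP>_AF` (for example `AFR_AF`, `EUR_AF`) for the super populations; check the `##INFO` header lines of the file you download, since the key names are an assumption here. The function name `info_frequencies` and the example INFO string are made up for illustration.

```python
def info_frequencies(info):
    """Return {key: [floats]} for INFO entries named 'AF' or ending in '_AF'.

    Assumption: frequency keys follow the 'AF' / '<POP>_AF' naming pattern.
    Values are lists because a site can carry one value per alternate allele.
    """
    freqs = {}
    for entry in info.split(";"):
        if "=" not in entry:
            continue  # flag-style entries carry no value
        key, value = entry.split("=", 1)
        if key == "AF" or key.endswith("_AF"):
            freqs[key] = [float(v) for v in value.split(",")]
    return freqs


# Made-up INFO string in the general shape of a release VCF record.
info = "AA=T;AN=2184;AC=20;AF=0.01;AFR_AF=0.02;EUR_AF=0.005;VT=SNP"
parsed = info_frequencies(info)
print(parsed)  # {'AF': [0.01], 'AFR_AF': [0.02], 'EUR_AF': [0.005]}
```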
See RyanD's answer to this question - 1000 genomes LD calculation - it is about LD calculation, but once the data are in PLINK format you can run plink with --freq to get allele frequencies.