Question: Allele frequencies for ALL of the variants in 1000Genomes: direct download?
4.8 years ago by
lillo.sim40
United Kingdom
lillo.sim40 wrote:

Hi,

I am trying to find the minor allele frequency of all of the SNPs and INDELs from the 1000Genomes EUR reference.

It was taking too long to compute the frequencies of all the variants using vcftools, as outlined in "Getting Allele Frequencies From 1000 Genomes", so I tried extracting the allele frequency reported in the VCF files under "EUR_AF" to get, for each variant, a list of: chromosome, position, EUR_AF. However, I don't think this is the minor allele frequency.

The vcf file reports: 

##INFO=<ID=EUR_AF,Number=1,Type=Float,Description="Allele Frequency for samples from EUR based on AC/AN">

with the AC being the "Alternate Allele Count" and AN the "Total Allele Count".

Would I need to also extract the AC and AN columns (and also maybe the AA, i.e. "ancestral allele"?) to derive which allele the vcf is referring to, or does anyone know of a better way to do this? 
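For what it is worth, since the header defines EUR_AF as AC/AN and AC counts the alternate allele, EUR_AF should be the ALT allele frequency, so for a biallelic site the minor allele frequency would be min(EUR_AF, 1 - EUR_AF) and the AA field would not be needed. Below is a minimal Python sketch of that idea. The function names and the example record are made up for illustration, and it assumes a standard tab-separated VCF line with EUR_AF present in the INFO column; multiallelic records are skipped rather than handled.

```python
def parse_info(info_field):
    """Turn a VCF INFO string such as 'AC=3;AN=10;EUR_AF=0.3' into a dict."""
    info = {}
    for entry in info_field.split(";"):
        key, _, value = entry.partition("=")
        info[key] = value  # flag entries without '=' get an empty string
    return info


def eur_maf(vcf_line, key="EUR_AF"):
    """Return (chrom, pos, maf) for a biallelic record, or None if it cannot be derived."""
    fields = vcf_line.rstrip("\n").split("\t")
    chrom, pos, alt, info_field = fields[0], fields[1], fields[4], fields[7]
    if "," in alt:
        return None  # multiallelic site: min(AF, 1 - AF) does not apply
    value = parse_info(info_field).get(key)
    if value is None:
        return None  # no EUR_AF annotation on this record
    try:
        alt_freq = float(value)
    except ValueError:
        return None  # EUR_AF present but not a single number
    # EUR_AF is the ALT allele frequency; the minor allele is whichever is rarer.
    return chrom, int(pos), min(alt_freq, 1.0 - alt_freq)


# Hypothetical record (not real 1000 Genomes data): ALT frequency 0.75, so MAF 0.25.
example = "1\t1000\trs_example\tA\tG\t100\tPASS\tAC=6;AN=8;EUR_AF=0.75"
print(eur_maf(example))
```

To run this over a whole file, one would loop over the lines of the (decompressed) VCF, skip those starting with "#", and write out the tuples that are not None.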

 

Thank you for any suggestions!

sequencing snp • 4.4k views
modified 17 months ago by Biostar ♦♦ 20 • written 4.8 years ago by lillo.sim40
4.8 years ago by
Charles Warden6.5k
Duarte, CA
Charles Warden6.5k wrote:

If the goal is to eventually annotate a set of SNPs and indels, it will probably be easiest to just use a separate tool (like ANNOVAR) to get variant frequencies. In fact, I believe SeattleSeq specifically provides CEU frequencies (if that is close enough) in its report:

http://snp.gs.washington.edu/SeattleSeqAnnotation138/

I know that you can view this information on a case-by-case basis in the 1000 Genomes browser when you select a variant of interest and view the "population genetics" information. Here is one such example:

http://browser.1000genomes.org/Homo_sapiens/Variation/Population?db=core;g=ENSG00000107949;r=10:127526565-127527814;source=dbSNP;v=rs137962051;vdb=variation;vf=30107459

So, I assume these statistics are tabulated somewhere. Have you tried contacting 1000 Genomes support via info@1000genomes.org?

written 4.8 years ago by Charles Warden6.5k

Thank you for your reply and suggestion, Charles. I have written to 1000 Genomes to ask for a direct link to the frequencies for all of the variants. I will post their answer here as soon as I get one, in case it helps someone else.

written 4.8 years ago by lillo.sim40

hey Lillo - did you hear anything back from 1000 Genomes? Trying to do the same thing here!

written 2.4 years ago by belalc800
4.8 years ago by
London, UK
Giovanni M Dall'Olio26k wrote:

In our lab we calculated the MAF for CEU, CHB and YRI, and put them in our HSB browser. You can download them from here.

This is the link to the main article: Pybus et al, 2014

written 4.8 years ago by Giovanni M Dall'Olio26k

Thank you Giovanni! This is still a limited set of SNPs (N= 10,836,459), so I will still keep trying to find another way, but in the meantime this helps!

written 4.8 years ago by lillo.sim40