Question: Allele frequencies for ALL of the variants in 1000Genomes: direct download?
1
gravatar for lillo.sim
5.9 years ago by
lillo.sim50
United Kingdom
lillo.sim50 wrote:

Hi,

I am trying to find the minor allele frequency of all of the SNPs and INDELs from the 1000Genomes EUR reference.

It was taking too long to compute the frequencies of all the variants using vcftools, as outlined in "Getting Allele Frequencies From 1000 Genomes". So I tried extracting the allele frequency reported in the VCF files under "EUR_AF" to get, for each variant, a list of: chromosome, position, EUR_AF. However, I don't think this is the minor allele frequency.

The vcf file reports: 

##INFO=<ID=EUR_AF,Number=1,Type=Float,Description="Allele Frequency for samples from EUR based on AC/AN">

with the AC being the "Alternate Allele Count" and AN the "Total Allele Count".

Would I also need to extract the AC and AN fields (and maybe also AA, i.e. the "ancestral allele") to work out which allele the VCF is referring to, or does anyone know of a better way to do this?
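For what it's worth, here is a minimal sketch of one way to get from EUR_AF to a minor allele frequency, assuming biallelic sites only. Since AC is the alternate allele count, AC/AN is the frequency of the ALT allele, so the minor allele frequency is the smaller of EUR_AF and 1 - EUR_AF; the ancestral allele is not needed for that. The helper names `parse_info` and `eur_maf` and the sample record are made up for illustration and are not part of any 1000 Genomes tooling. Multi-allelic records (comma-separated EUR_AF values) are simply skipped here.

```python
def parse_info(info):
    """Turn a VCF INFO string such as 'AC=2;AN=10;EUR_AF=0.2' into a dict."""
    fields = {}
    for item in info.split(";"):
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True  # flag fields carry no value
    return fields


def eur_maf(vcf_line):
    """Return (chrom, pos, maf) for a biallelic record, else None."""
    cols = vcf_line.rstrip("\n").split("\t")
    chrom, pos, info = cols[0], int(cols[1]), parse_info(cols[7])
    af_text = info.get("EUR_AF")
    if af_text is None or af_text is True or "," in af_text:
        return None  # EUR_AF missing, or multi-allelic site (not handled)
    alt_af = float(af_text)  # frequency of the ALT allele in EUR
    return chrom, pos, min(alt_af, 1.0 - alt_af)


# Hypothetical record, not taken from a real 1000 Genomes file.
record = "1\t1000\t.\tG\tA\t100\tPASS\tAC=15;AN=20;EUR_AF=0.75"
print(eur_maf(record))  # ('1', 1000, 0.25)
```

To run this over a whole file, one would skip the header lines starting with `#` and call `eur_maf` on each remaining line; which allele is the minor one follows from whether EUR_AF is above or below 0.5.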

 

Thank you for any suggestions!

5.9 years ago, Charles Warden (Duarte, CA) wrote:

If the goal is to eventually annotate a set of SNPs and indels, it will probably be easiest to just use a separate tool (like ANNOVAR) to get variant frequencies. In fact, I believe SeattleSNP specifically provides CEU frequencies (if that is close enough) in its report:

http://snp.gs.washington.edu/SeattleSeqAnnotation138/

I know that you can view this information on a case-by-case basis in the 1000 genomes browser when you select a variant of interest and view the "population genetics" information. Here is one such example:

http://browser.1000genomes.org/Homo_sapiens/Variation/Population?db=core;g=ENSG00000107949;r=10:127526565-127527814;source=dbSNP;v=rs137962051;vdb=variation;vf=30107459

So, I assume these statistics are tabulated somewhere. Have you tried contacting 1000 genomes support via info@1000genomes.org?


Thank you for your reply and suggestion, Charles. I have written to 1000 Genomes to ask for a direct link to the frequencies for all of the variants. I will post their answer here as soon as I get one, in case it helps someone else.

Reply written 5.9 years ago by lillo.sim

hey Lillo - did you hear anything back from 1000 Genomes? Trying to do the same thing here!

Reply written 3.4 years ago by belalc

Hey, 6 years later have you heard any feedback from 1000 Genomes? Thanks!

Reply written 25 days ago by andy.wang
5.9 years ago, Giovanni M Dall'Olio (London, UK) wrote:

In our lab we calculated the MAF for CEU, CHB and YRI, and put them in our HSB browser. You can download them from here.

This is the link to the main article: Pybus et al, 2014


Thank you, Giovanni! This is still a limited set of SNPs (N = 10,836,459), so I will keep trying to find another way, but in the meantime this helps!

Reply written 5.9 years ago by lillo.sim