Allele frequencies for ALL of the variants in 1000Genomes: direct download?
9.9 years ago
lillo.sim ▴ 50

Hi,

I am trying to find the minor allele frequency of all of the SNPs and INDELs from the 1000Genomes EUR reference.

Computing the frequencies of ALL the variants with vcftools, as outlined here, was taking too long. Instead, I tried extracting the allele frequency reported in the VCF files under EUR_AF to get, for each variant, a list of: chromosome, position, EUR_AF. However, I don't think this is the minor allele frequency.

The vcf file reports:

##INFO=<ID=EUR_AF,Number=1,Type=Float,Description="Allele Frequency for samples from EUR based on AC/AN">

with the AC being the "Alternate Allele Count" and AN the "Total Allele Count".

Would I need to also extract the AC and AN columns (and also maybe the AA, i.e. "ancestral allele"?) to derive which allele the vcf is referring to, or does anyone know of a better way to do this?

Thank you for any suggestions!

sequencing SNP • 6.6k views
9.9 years ago

If the goal is to eventually annotate a set of SNPs and indels, it will probably be easiest to just use a separate tool (like ANNOVAR) to get variant frequencies. In fact, I believe SeattleSNP specifically provides CEU frequencies (if that is close enough) in its report:

http://snp.gs.washington.edu/SeattleSeqAnnotation138/

I know that you can view this information on a case-by-case basis in the 1000 genomes browser when you select a variant of interest and view the "population genetics" information. Here is one such example:

http://browser.1000genomes.org/Homo_sapiens/Variation/Population?db=core;g=ENSG00000107949;r=10:127526565-127527814;source=dbSNP;v=rs137962051;vdb=variation;vf=30107459

So, I assume these statistics are tabulated somewhere. Have you tried contacting 1000 genomes support via info@1000genomes.org?
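On the EUR_AF question itself: since EUR_AF is described as AC/AN with AC being the alternate allele count, it is the frequency of the ALT allele, not necessarily the minor one. At a biallelic site the minor allele frequency is simply min(EUR_AF, 1 - EUR_AF), so AC, AN and AA should not be needed for that (AA would matter for derived allele frequency instead). Below is a minimal sketch in Python, assuming tab-separated VCF data lines with the standard column order; the function names `parse_info` and `eur_maf` are made up for illustration, and multiallelic records are simply skipped because "the" minor allele is ambiguous there.

```python
def parse_info(info):
    """Turn a VCF INFO string such as 'AC=2;EUR_AF=0.3;VT=SNP' into a dict."""
    fields = {}
    for item in info.split(";"):
        key, _, value = item.partition("=")
        fields[key] = value  # flag entries without '=' map to an empty string
    return fields


def eur_maf(vcf_line):
    """Return (chrom, pos, maf) for a biallelic record, or None otherwise.

    Hypothetical helper: assumes EUR_AF holds the ALT allele frequency.
    """
    cols = vcf_line.rstrip("\n").split("\t")
    chrom, pos, alt, info = cols[0], int(cols[1]), cols[4], cols[7]
    if "," in alt:
        return None  # multiallelic site: skipped in this sketch
    value = parse_info(info).get("EUR_AF")
    if value is None:
        return None  # record carries no EUR_AF field
    af = float(value)
    # The minor allele is whichever of REF/ALT is rarer.
    return chrom, pos, min(af, 1.0 - af)


if __name__ == "__main__":
    example = "10\t127526565\trs1\tA\tG\t100\tPASS\tAC=15;AN=20;EUR_AF=0.75;VT=SNP"
    print(eur_maf(example))
```

To run this over a whole chromosome file, one could loop over the lines of the (gzip-compressed) VCF, skip those starting with `#`, and write out the non-None results.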


Thank you for your reply and suggestion, Charles. I have written to 1000 Genomes to ask for a direct link to the frequencies for all of the variants. I will post their answer here as soon as I get one, in case it helps someone else.


hey Lillo - did you hear anything back from 1000 Genomes? Trying to do the same thing here!


Hey, 6 years later have you heard any feedback from 1000 Genomes? Thanks!

9.9 years ago

In our lab we calculated the MAF for CEU, CHB and YRI, and put them in our HSB browser. You can download them from here.

This is the link to the main article: Pybus et al, 2014


Thank you, Giovanni! This is still a limited set of SNPs (N = 10,836,459), so I will keep trying to find another way, but in the meantime this helps!
