Minor Allele Frequency (MAF) in dbSNP and the 1000 Genomes Project
9.0 years ago
michealsmith ▴ 760

I'm using ANNOVAR to filter out common variants against several databases in order to find rare variants, so I pay special attention to the frequency of the alternative allele.

Here is an example of the same SNP from both dbSNP and the 1000 Genomes Project:

From dbSNP:

1    1282270    rs307356    T    C    .    PASS    G5;GCF;GENEINFO=DVL1:1855;GMAF=0.0406764168190128;GNO;KGPilot123;RSPOS=1282270;SAO=0;SLO;SSR=0;VC=SNV;VLD;VP=050100000000050110000101;WGT=0;dbSNPBuildID=79

From 1000G (more precisely, the ANNOVAR-format file converted from 1000G):

1    1282270    T    C    0.96    rs307356

This example made me realize that I had long confused "minor allele frequency" with "alternative allele frequency". Here it seems that the reference allele T is the minor allele, while the alternative allele C is the major one.
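The reading above can be checked with a small sketch. It assumes the site is biallelic, that GMAF is the frequency of whichever of the two alleles is less common, and that the fifth column of the ANNOVAR line is the alternative allele frequency; none of this is confirmed by the files themselves. The helper names `parse_info` and `alt_freq_from_gmaf` are made up for illustration.

```python
def parse_info(info):
    """Turn a VCF INFO string into a dict; bare flags map to True."""
    fields = {}
    for item in info.split(";"):
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


def alt_freq_from_gmaf(gmaf, minor_is_ref):
    """Alt allele frequency at a biallelic site (assumption: GMAF is the
    frequency of the less common of the two alleles)."""
    return 1.0 - gmaf if minor_is_ref else gmaf


# The two records from the post, whitespace-separated.
dbsnp_line = (
    "1 1282270 rs307356 T C . PASS "
    "G5;GCF;GENEINFO=DVL1:1855;GMAF=0.0406764168190128;GNO;KGPilot123;"
    "RSPOS=1282270;SAO=0;SLO;SSR=0;VC=SNV;VLD;"
    "VP=050100000000050110000101;WGT=0;dbSNPBuildID=79"
)
annovar_line = "1 1282270 T C 0.96 rs307356"

chrom, pos, rsid, ref, alt, qual, flt, info = dbsnp_line.split()
gmaf = float(parse_info(info)["GMAF"])

a_chrom, a_pos, a_ref, a_alt, a_freq, a_rsid = annovar_line.split()
annovar_alt_freq = float(a_freq)  # assumed to be the ALT allele frequency

# If the alt allele is above 0.5 in the 1000G-derived file,
# the minor allele must be the reference allele.
minor_is_ref = annovar_alt_freq > 0.5
alt_freq = alt_freq_from_gmaf(gmaf, minor_is_ref)

print(ref, alt, round(alt_freq, 4), annovar_alt_freq)
```

Under those assumptions, 1 - GMAF comes out at about 0.959, which agrees with the 0.96 in the ANNOVAR line. Note that the dbSNP record alone does not say which allele GMAF refers to; the sketch borrows that information from the 1000G-derived line.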

I used to think the "alternative allele" was always the "minor allele", which now seems to be wrong. Regardless of minor/major, can I infer the alternative allele frequency from the VCF records above? Thanks.

maf dbsnp
9.0 years ago
tiagoantao ▴ 670

I am guessing here.

But it seems to me that a SNP in dbSNP is not tied to a specific population study, so the concept of minor allele frequency does not really make sense there: we do not know which allele is minor or major worldwide. OTOH, within a specific study (e.g. 1000 Genomes) it is clear which is which. The most one could expect from dbSNP, in my view, is to be told which allele was the minor one in the specific study that found the SNP. I would not expect dbSNP to generalize about which allele is the minor allele.

Absolutely. It all depends on what you define as a population: one population's minor allele is another population's major allele.

Thanks. Then, regardless of minor/major, can I infer the alternative allele frequency from the VCF records above?


