Proper way of calculating minor allele frequencies (via vcftools)
5.4 years ago
Tim • 0

Hi Biostars,

I would like to learn the proper way of calculating minor allele frequencies (MAFs) in population genetics. I have a large sample (more than 100) of whole-genome sequences in VCF format, and I use vcftools. I use the optional arguments --maf and --max-maf to keep only SNPs within a particular MAF range. For example, I do not want MAFs less than 0.05, so I use this command:

vcftools --vcf xxx.vcf --out SNP --remove-indels --maf 0.05 --012

However, this produces a genotype file (012 format) in which the frequencies range from 0.05 to 1, because at some sites (genome positions) most samples carry the alternative allele rather than the reference allele. I thought that MAFs should range between 0 and 0.5. For example, vcftools computes it like this:

0 0 0 0 0 0 1 0 0 0 maf = 1/20 = 0.05
2 1 2 2 2 2 2 2 2 2 maf = 19/20 = 0.95

Shouldn't the second row be considered to have a MAF of 0.05? If so, should I simply pass both --maf 0.05 and --max-maf 0.95 to vcftools? Or is there a better way to do this?
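To make the arithmetic in the two example rows concrete, here is a minimal Python sketch. It assumes each row is one biallelic site with 012-coded diploid genotypes (counts of non-reference alleles), and that -1 marks a missing genotype. The function names are made up for illustration and are not part of vcftools.

```python
def alt_allele_frequency(genotypes):
    """Frequency of the non-reference allele from 012-coded diploid genotypes.

    Each value is the number of non-reference alleles (0, 1 or 2).
    Assumption: negative values (e.g. -1) mark missing genotypes and are skipped.
    """
    called = [g for g in genotypes if g >= 0]
    if not called:
        raise ValueError("no called genotypes at this site")
    # Two alleles per called diploid individual.
    return sum(called) / (2 * len(called))


def fold_to_maf(freq):
    """Fold a biallelic allele frequency into the range 0 to 0.5."""
    return min(freq, 1.0 - freq)


# The two rows from the question, treated as two sites with 10 individuals each.
site_a = [0, 0, 0, 0, 0, 0, 1, 0, 0, 0]
site_b = [2, 1, 2, 2, 2, 2, 2, 2, 2, 2]

for name, site in (("site_a", site_a), ("site_b", site_b)):
    af = alt_allele_frequency(site)
    print(name, "alt AF =", round(af, 4), "MAF =", round(fold_to_maf(af), 4))
```

Under these assumptions both sites fold to the same MAF, because the frequency of the rarer allele is the same whichever allele the reference genome happens to carry.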



sequencing SNP vcftools • 6.5k views
5.4 years ago
trausch ★ 1.6k

vcftools includes a script called fill-an-ac.

Afterwards, you can calculate the variant allele frequency (AF) as AC/AN (both are INFO fields added by fill-an-ac).

Minor allele frequencies do indeed range from 0 to 0.5, and you can then identify the minor allele from the AF value.
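As a rough illustration of the AC/AN calculation, the Python sketch below reads AC and AN from a VCF INFO string and folds the result into a MAF. It assumes a biallelic site (a single AC value) and an INFO column that already contains both fields. The helper names `parse_info` and `maf_from_info` are hypothetical.

```python
def parse_info(info):
    """Turn a VCF INFO string such as 'AC=19;AN=20' into a dict of strings."""
    fields = {}
    for item in info.split(";"):
        # Flag-style entries without '=' get an empty string as their value.
        key, _, value = item.partition("=")
        fields[key] = value
    return fields


def maf_from_info(info):
    """Return (AF, MAF) for a biallelic site, using AF = AC / AN."""
    fields = parse_info(info)
    ac_values = fields["AC"].split(",")
    if len(ac_values) != 1:
        # Assumption: multi-allelic sites are out of scope for this sketch.
        raise ValueError("multi-allelic site; only one ALT allele is handled")
    ac = int(ac_values[0])
    an = int(fields["AN"])
    if an == 0:
        raise ValueError("AN is 0; no called alleles at this site")
    af = ac / an
    return af, min(af, 1.0 - af)


af, maf = maf_from_info("AC=19;AN=20")
print("AF =", round(af, 4), "MAF =", round(maf, 4))
```

When AF is above 0.5, the reference allele is the minor allele; otherwise the alternative allele is.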


Thanks for the info. Just to check my understanding: if a position has only two possible alleles, would this be identical to calling vcftools with --maf 0.05 and --max-maf 0.95? Also, in that case, could we simply rescale the frequencies output by vcftools that fall between 0.5 and 1 into the range 0 to 0.5?
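For a biallelic site, the equivalence asked about here is plain arithmetic: keeping an allele frequency p inside the window [t, 1 - t] is the same as requiring min(p, 1 - p) >= t. The Python sketch below checks this for every possible allele count in a sample of 20 alleles, using exact fractions to avoid rounding at the boundaries. It only illustrates the arithmetic and makes no claim about how vcftools itself interprets --maf and --max-maf; the function names are invented for the example.

```python
from fractions import Fraction


def passes_af_window(p, low, high):
    """Keep a site if its allele frequency lies inside [low, high]."""
    return low <= p <= high


def passes_maf(p, threshold):
    """Keep a biallelic site if its folded frequency min(p, 1 - p) reaches the threshold."""
    return min(p, 1 - p) >= threshold


threshold = Fraction(1, 20)  # 0.05
n_alleles = 20               # e.g. 10 diploid individuals

# Collect every allele count where the two filters disagree.
disagreements = [
    k
    for k in range(n_alleles + 1)
    if passes_af_window(Fraction(k, n_alleles), threshold, 1 - threshold)
    != passes_maf(Fraction(k, n_alleles), threshold)
]
print(disagreements)
```

The list of disagreements comes out empty, so for sites with exactly two alleles the two filters keep the same sites, and folding with min(p, 1 - p) maps frequencies between 0.5 and 1 into the range 0 to 0.5.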

