Question: Proper way of calculating minor allele frequencies (via vcftools)
3.8 years ago by
Tim0 wrote:

Hi Biostars,

I would like to learn the proper way of calculating minor allele frequencies (MAFs) in population genetics. I have a large sample (more than 100) of whole-genome sequences in VCF format, and I use vcftools. I use the optional arguments --maf and --max-maf to keep only SNPs within a particular range of MAFs. For example, I do not want MAFs less than 0.05, so I use this command:

vcftools --vcf xxx.vcf --out SNP --remove-indels --maf 0.05 --012

However, this results in a genotype file (012 format) in which the frequencies range from 0.05 to 1, since at some sites (genome positions) most samples carry the alternative allele rather than the reference genome's allele. I thought that MAFs should range between 0 and 0.5. For example, vcftools computes them like this:

0 0 0 0 0 0 1 0 0 0 maf = 1/20 = 0.05
2 1 2 2 2 2 2 2 2 2 maf = 19/20 = 0.95

Shouldn't the second row be considered to have a MAF of 0.05? If so, should I simply pass both --maf 0.05 and --max-maf 0.95 to vcftools? Or is there a better way to do this?
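To make the arithmetic in the two example rows concrete, here is a minimal Python sketch. It assumes diploid, biallelic sites, one list of 0/1/2 genotype codes per site (as in the example above), and -1 for a missing genotype. The function name alt_and_minor_freq is made up for illustration and is not part of vcftools.

```python
def alt_and_minor_freq(row, missing=-1):
    """Return (ALT allele frequency, folded minor allele frequency) for one site."""
    # Drop missing genotypes (assumed to be coded as -1).
    calls = [g for g in row if g != missing]
    if not calls:
        raise ValueError("no called genotypes at this site")
    alt_count = sum(calls)   # each code is the number of ALT copies (0, 1 or 2)
    total = 2 * len(calls)   # diploid assumption: two alleles per called sample
    alt_freq = alt_count / total
    # Folding: the minor allele is whichever of REF/ALT is rarer.
    return alt_freq, min(alt_freq, 1 - alt_freq)


site_a = [0, 0, 0, 0, 0, 0, 1, 0, 0, 0]
site_b = [2, 1, 2, 2, 2, 2, 2, 2, 2, 2]
for name, row in (("site_a", site_a), ("site_b", site_b)):
    alt, maf = alt_and_minor_freq(row)
    print(f"{name}: alt={alt:.2f} maf={maf:.2f}")
```

Under these assumptions both rows fold to the same minor allele frequency, even though their ALT frequencies differ.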



sequencing snp vcftools • 4.9k views
modified 3.8 years ago by trausch1.4k • written 3.8 years ago by Tim0
3.8 years ago by
trausch1.4k wrote:

Part of vcftools is fill-an-ac, which fills in (or recalculates) the AN and AC INFO fields.

You can then calculate the variant allele frequency (AF) as AC/AN (both are INFO fields written by fill-an-ac).

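Minor allele frequencies do indeed range from 0 to 0.5, and you can then derive the minor allele from the AF value: for a biallelic site, if AF is above 0.5 the reference allele is the minor one and the MAF is 1 - AF.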
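As a rough illustration of the AC/AN route, the sketch below parses an INFO string, computes AF = AC/AN, and folds it to a MAF. It assumes a biallelic site (a single AC value) and an INFO string that already contains AC and AN. The helper name maf_from_info and the example string are hypothetical.

```python
def maf_from_info(info):
    """Parse AC and AN from a VCF INFO string; return (AF, MAF, minor allele)."""
    # Split "KEY=VALUE;KEY=VALUE" pairs; flag-only entries without "=" are skipped.
    fields = dict(kv.split("=", 1) for kv in info.split(";") if "=" in kv)
    ac = int(fields["AC"])  # assumption: one ALT allele, so AC is a single integer
    an = int(fields["AN"])
    if an == 0:
        raise ValueError("AN is 0: no called alleles at this site")
    af = ac / an
    if af <= 0.5:
        # ALT is the rarer allele (a tie at exactly 0.5 is labelled ALT arbitrarily).
        return af, af, "ALT"
    # REF is the rarer allele, so the minor allele frequency is 1 - AF.
    return af, 1 - af, "REF"


af, maf, minor = maf_from_info("AC=19;AN=20")
print(f"AF={af:.2f} MAF={maf:.2f} minor={minor}")
```

Multi-allelic sites carry comma-separated AC values and would need each allele handled separately, which this sketch does not attempt.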


written 3.8 years ago by trausch1.4k

Thanks for the info. Just for my understanding: if there are only two possible alleles at a position, would this be identical to using --maf 0.05 and --max-maf 0.95 with vcftools? Also, in this case, could we simply rescale the frequencies output by vcftools that fall between 0.5 and 1 into the range 0 to 0.5?

written 3.8 years ago by Tim0
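The equivalence asked about in this comment can be checked arithmetically for biallelic sites. The sketch below compares a band filter on the ALT frequency (0.05 to 0.95) with a threshold on the folded frequency (at least 0.05), using exact fractions to avoid floating-point edge cases. It only checks the arithmetic and says nothing about how vcftools applies --maf and --max-maf internally. The function names are illustrative.

```python
from fractions import Fraction


def passes_band(ac, an, lo=Fraction(1, 20)):
    """Keep a site if the ALT frequency lies in [lo, 1 - lo]."""
    af = Fraction(ac, an)
    return lo <= af <= 1 - lo


def passes_folded(ac, an, lo=Fraction(1, 20)):
    """Keep a site if the folded (minor) allele frequency is at least lo."""
    af = Fraction(ac, an)
    return min(af, 1 - af) >= lo


an = 20  # e.g. 10 diploid samples with no missing genotypes
agree = all(passes_band(ac, an) == passes_folded(ac, an) for ac in range(an + 1))
kept = [ac for ac in range(an + 1) if passes_band(ac, an)]
print(agree, kept)
```

For a biallelic site the two filters select the same ALT counts, and any frequency above 0.5 folds into the 0 to 0.5 range as 1 - AF.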