Proper way of calculating minor allele frequencies (via vcftools)
5.4 years ago
Tim • 0

Hi Biostars,

I would like to learn the proper way to calculate minor allele frequencies (MAFs) for population genetics. I have a large sample (more than 100) of whole-genome sequences in VCF format, and I use vcftools with the optional arguments --maf and --max-maf to keep only SNPs within a particular MAF range. For example, I do not want MAFs below 0.05, so I use this command:

vcftools --vcf xxx.vcf --out SNP --remove-indels --maf 0.05 --012

However, this results in a genotype file (012 format) whose MAFs range from 0.05 to 1, since at some sites (genome positions) most samples carry the alternate allele rather than the reference allele. I thought that MAFs should range between 0 and 0.5. For example, vcftools computes like this:

0 0 0 0 0 0 1 0 0 0 maf = 1/20 = 0.05
2 1 2 2 2 2 2 2 2 2 maf = 19/20 = 0.95
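
The arithmetic behind these two rows can be reproduced with a short Python sketch. It assumes diploid samples, reads each row as one site's genotypes across ten individuals (as laid out above), and skips the -1 code that the 012 format uses for missing genotypes. The function name `alt_allele_frequency` is made up for illustration and is not part of vcftools.

```python
def alt_allele_frequency(genotypes):
    """Non-reference allele frequency for one site of 012-coded diploid genotypes."""
    # The 012 format marks missing genotypes with -1; leave them out of the count.
    called = [g for g in genotypes if g != -1]
    if not called:
        raise ValueError("no called genotypes at this site")
    # Each genotype value is the number of non-reference alleles (0, 1 or 2),
    # and each diploid individual contributes two alleles to the total.
    return sum(called) / (2 * len(called))


site_a = [0, 0, 0, 0, 0, 0, 1, 0, 0, 0]
site_b = [2, 1, 2, 2, 2, 2, 2, 2, 2, 2]

print(alt_allele_frequency(site_a))  # 1 non-reference allele out of 20
print(alt_allele_frequency(site_b))  # 19 non-reference alleles out of 20
```

Note that this is the frequency of the non-reference allele, not of whichever allele is rarer at the site.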

Shouldn't the second row be considered to have a MAF of 0.05? If so, should I simply pass --maf 0.05 and --max-maf 0.95 to vcftools? Or is there a better way to do this?

Thanks!

 

sequencing SNP vcftools • 6.5k views
5.4 years ago
trausch ★ 1.6k

Part of vcftools is the fill-an-ac script:

http://vcftools.sourceforge.net/perl_module.html#fill-an-ac

Afterwards you can calculate the variant allele frequency (AF) as AC/AN; both are new INFO fields added by fill-an-ac.

Minor allele frequencies do indeed range from 0 to 0.5, and you can derive the minor allele from the AF value: at a biallelic site, if AF exceeds 0.5 the reference allele is the minor allele and the MAF is 1 - AF.
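
As a rough illustration of the AC/AN approach, here is a minimal Python sketch that reads AC and AN from a VCF INFO string and folds the result into the 0 to 0.5 range. It assumes a biallelic site (a single integer in AC); the helper names `parse_info` and `minor_allele_frequency` and the example INFO strings are invented for this sketch.

```python
def parse_info(info):
    """Split a VCF INFO string such as 'AC=19;AN=20;DB' into a dict."""
    fields = {}
    for item in info.split(";"):
        key, sep, value = item.partition("=")
        # Flag entries like 'DB' carry no value; store them as True.
        fields[key] = value if sep else True
    return fields


def minor_allele_frequency(info):
    """MAF from AC and AN, assuming a biallelic site (one integer in AC)."""
    fields = parse_info(info)
    ac = int(fields["AC"])  # alternate allele count
    an = int(fields["AN"])  # total number of called alleles
    if an == 0:
        raise ValueError("AN is zero: no called alleles at this site")
    # The minor allele is whichever of ALT (ac) or REF (an - ac) is rarer.
    return min(ac, an - ac) / an


print(minor_allele_frequency("AC=1;AN=20"))      # ALT is the minor allele
print(minor_allele_frequency("AC=19;AN=20;DB"))  # REF is the minor allele
```

Working with the counts rather than computing 1 - AF in floating point keeps both example sites at exactly 1/20. Multi-allelic sites, where AC holds several comma-separated counts, would need separate handling.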


Thanks for the info. Just for my understanding: if there are only two possible alleles at a position, would this be identical to calling vcftools with --maf 0.05 and --max-maf 0.95? Also, in that case, could we simply rescale the MAFs output by vcftools that fall between 0.5 and 1 so that they lie between 0 and 0.5?
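
The arithmetic side of this question can be checked with a small Python sketch. It treats the reported value as the alternate allele frequency p at a biallelic site and compares two filters: "folded MAF at least 0.05" versus "p between 0.05 and 0.95". It says nothing about how vcftools itself implements --maf and --max-maf; the function names are hypothetical, and exact fractions are used to avoid floating-point edge cases at the thresholds.

```python
from fractions import Fraction


def fold(p):
    """Fold an allele frequency p into the minor allele frequency (0 to 0.5)."""
    return min(p, 1 - p)


def passes_maf(p, threshold):
    """Filter on the folded minor allele frequency."""
    return fold(p) >= threshold


def passes_range(p, threshold):
    """Filter on the unfolded frequency with symmetric lower and upper bounds."""
    return threshold <= p <= 1 - threshold


threshold = Fraction(5, 100)
grid = [Fraction(k, 1000) for k in range(1001)]  # p = 0.000, 0.001, ..., 1.000

# For a biallelic site the two filters agree at every frequency on the grid.
same = all(passes_maf(p, threshold) == passes_range(p, threshold) for p in grid)
print(same)

# Folding maps a frequency of 19/20 back to 1/20.
print(fold(Fraction(19, 20)))
```

Under the biallelic assumption, then, the symmetric range filter and the folded-MAF filter select the same sites, and folding is just taking the smaller of p and 1 - p.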
