Question: vcf calculation of allele counts and allele number
12 weeks ago
susan.klein0 wrote:


This is a very basic question, but I can't find a proper explanation anywhere. Yes, I have read the VCF specification documents.

How are 'AC' and 'AN' calculated? I have human VCF files where there are homozygous variants with AC=2;AN=2 and heterozygous variants with AC=1;AN=2. If I had multiple samples with multiple variants, how would AC and AN be calculated? Does AC ignore the reference allele? The VCF spec is not very helpful here: "allele count in genotypes, for each ALT allele, in the same order as listed".



variant calling • 245 views
12 weeks ago
finswimmer11k wrote:

Hello susan.klein,

the meaning of AC and AN should be described in the header of your VCF file. freebayes wrote this to my VCF headers:

##INFO=<ID=AC,Number=A,Type=Integer,Description="Allele count in genotypes for each ALT allele, in the same order as listed">
##INFO=<ID=AN,Number=1,Type=Integer,Description="Total number of alleles in called genotypes">

fin swimmer

Yes, that is exactly from the vcf specs. It does not say how they are calculated for multiple samples, e.g. if I have three samples with genotypes 0/1, 1/1, 0/0 what is AC and AN?

In my understanding AC would be 3 (you have just one ALT allele, and it is found 3 times: once in 0/1 and twice in 1/1) and AN would be 6 (you have 3 samples with 2 alleles each). The reference allele counts towards AN but not towards AC.

It gets more interesting if you have more than one ALT allele. Let's assume the genotypes 0/1, 1/1 and 1/2. AC is now 4,1 (ALT allele 1 occurs 4 times and ALT allele 2 occurs once) and AN is still 6.
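
If it helps, here is a minimal Python sketch of that counting, checked against the two examples above. It assumes GT strings with alleles separated by / or | and "." for a missing allele. The function name allele_counts is made up for illustration and does not come from any VCF library. Missing alleles are skipped because the AN header speaks of "called genotypes".

```python
import re


def allele_counts(genotypes, n_alt):
    """Return (AC, AN) for one site.

    genotypes: GT strings such as "0/1", "1|1" or "./."
    n_alt:     number of ALT alleles listed for the site
    """
    ac = [0] * n_alt  # one counter per ALT allele, in ALT order
    an = 0            # every called allele, REF included
    for gt in genotypes:
        for allele in re.split(r"[/|]", gt):
            if allele == ".":
                continue  # missing call: adds to neither AC nor AN
            an += 1
            index = int(allele)
            if index > 0:  # 0 is REF, so it only adds to AN
                ac[index - 1] += 1
    return ac, an


print(allele_counts(["0/1", "1/1", "0/0"], n_alt=1))  # ([3], 6)
print(allele_counts(["0/1", "1/1", "1/2"], n_alt=2))  # ([4, 1], 6)
```

With a single sample this gives AC=2, AN=2 for 1/1 and AC=1, AN=2 for 0/1, which matches the values in the original question.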

