I have 100 VCF files (one per sample, 100 different samples). I would like to calculate the allele frequency at specific sites.
At one specific locus I have three genotypes (called with the GATK best practices workflow):
rs-xxxxx:
A/A occurring in 30 samples (ref hom)
A/G occurring in 21 samples (het)
G/G occurring in 49 samples (alt hom)

Frequency of genotype would be:
A/A = 0.3
A/G = 0.21
G/G = 0.49
But how do I calculate the allele frequency of A and G at this site?
dbSNP defines it like this:
(sum of chromosome counts over all member) / (total chromosome counts over all member)
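To make the question concrete, here is a minimal sketch of how I understand that definition would apply to the counts above. It assumes every sample is diploid with no missing genotype, so each sample contributes two chromosomes and a heterozygote contributes one copy of each allele. The function name `allele_frequencies` is just illustrative, not part of GATK or any other tool.

```python
def allele_frequencies(n_hom_ref, n_het, n_hom_alt):
    """Return (ref_freq, alt_freq) from genotype counts at a biallelic site.

    Assumes diploid samples and no missing calls.
    """
    n_samples = n_hom_ref + n_het + n_hom_alt
    total_chromosomes = 2 * n_samples  # two chromosomes per diploid sample

    # Homozygotes carry two copies of their allele, heterozygotes carry one of each.
    ref_count = 2 * n_hom_ref + n_het
    alt_count = 2 * n_hom_alt + n_het

    return ref_count / total_chromosomes, alt_count / total_chromosomes


# Counts from the locus above: 30 A/A, 21 A/G, 49 G/G
freq_a, freq_g = allele_frequencies(30, 21, 49)
print(f"A: {freq_a:.3f}  G: {freq_g:.3f}")  # A: 0.405  G: 0.595
```

With these counts, A is seen on 2*30 + 21 = 81 of 200 chromosomes and G on 2*49 + 21 = 119 of 200, so the two allele frequencies sum to 1.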
Thank you for any instructive example.