Extrapolating allele frequencies from gnomAD
2.2 years ago
oakhamwolf ▴ 20

Hi all,

I am playing around with gnomAD data and wondering whether there is an accepted way to extrapolate from gnomAD allele frequencies to an approximate number of people affected in a given population.

My efforts so far are based on the Hardy-Weinberg formula, but I don't think I am using it correctly.

For example, in gnomAD the variant ABCA4:c.5461-10T>C, which is linked to autosomal recessive Stargardt disease, has a minor allele frequency of 0.0002272 in the non-Finnish European population. A quick Google search gave me a population of over 700M for Europe and 6M for Finland, leaving me with 694M individuals. Applying the Hardy-Weinberg equation to these numbers I get the following:

F(q) = 0.0002272; F(p) = 0.9997728

F(pp) = 0.99955; F(2pq) = 0.00045430; F(qq) = 0.0000000516

So taking the predicted hom_alt frequency (autosomal recessive) and applying it to the 694M individuals, I get 36 individuals. According to https://nei.nih.gov/health/stargardt/star_facts, 1 in 8-10 thousand people have Stargardt disease, so I would expect something closer to 69,000-87,000.
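
For reference, the arithmetic above can be reproduced with a minimal Python sketch. It assumes Hardy-Weinberg proportions (p^2, 2pq, q^2) and uses the allele frequency and the 694M population figure quoted in this post. The function and variable names are only illustrative, and the result covers this single variant only.

```python
def hardy_weinberg(q):
    """Return (hom_ref, het, hom_alt) genotype frequencies for allele frequency q.

    Assumes Hardy-Weinberg proportions: p^2, 2pq, q^2 with p = 1 - q.
    """
    p = 1.0 - q
    return p * p, 2.0 * p * q, q * q


# Values quoted in the post (illustrative inputs, not fetched from gnomAD).
q = 0.0002272             # allele frequency, non-Finnish European
population = 694_000_000  # rough population estimate used in the post

hom_ref, het, hom_alt = hardy_weinberg(q)

expected_hom_alt = hom_alt * population   # expected homozygotes for this variant
expected_carriers = het * population      # expected heterozygous carriers

print(f"F(pp)  = {hom_ref:.8f}")
print(f"F(2pq) = {het:.8f}")
print(f"F(qq)  = {hom_alt:.3e}")
print(f"Expected homozygotes: {expected_hom_alt:.0f}")
print(f"Expected carriers:    {expected_carriers:.0f}")
```

The homozygote count comes out at about 36, matching the figure above, so the gap to the reported prevalence does not come from the multiplication itself.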

Would someone be able to point me in the right direction or maybe tell me whether or not this is feasible? Many thanks for any help in advance.


gnomad allele frequency • 921 views
As a first quick comment: Hardy-Weinberg equilibrium only holds for alleles that are not under natural selection, so you probably shouldn't apply it to disease-causing variants.


