Question: Why is gnomAD AF vs gnomAD exome AF so different?
10 weeks ago by
nub620
nub620 wrote:

Hi,

I am trying to filter my exome data to find variants with an allele frequency < 0.05 in gnomAD, but I notice that the gnomAD AF and gnomAD exome AF annotations give opposite results!

E.g. for rs73976541, allele C has a frequency of 0.005 in gnomAD exome AF but >0.9 in gnomAD AF.

Why is this? My WES data are aligned to GRCh38.

Thanks!

gnomad next-gen wes • 246 views
modified 10 weeks ago by Jorge Amigo12k • written 10 weeks ago by nub620

Perhaps you're comparing the genome AF against the exome AF!

modified 10 weeks ago • written 10 weeks ago by brunobsouzaa490

But shouldn't they be the same?

written 10 weeks ago by nub620

Note that although the numbers of genomes and exomes in gnomAD are not the same, and therefore the allele frequencies may not be identical, they should at least be similar. I agree that 0.005 and 0.02 are not that similar when dealing with really rare variants, but if you see something like what you saw (0.005 vs 0.9), you can be fairly sure that the two frequencies do not refer to the same allele. And if they do, you should report it to the gnomAD team so they can correct it.

written 10 weeks ago by Jorge Amigo12k

Actually, no. Take a look at the gnomAD FAQ: the number of samples in the exome and genome datasets is very different!

written 10 weeks ago by brunobsouzaa490

Oh thanks. That is useful to know :)

written 10 weeks ago by nub620

The sample count really varies at intergenic regions too! You might get only one or two samples from the WES dataset with a call at a site, versus all of the WGS samples, leaving very unbalanced frequencies.

written 10 weeks ago by karl.stamm3.9k
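
To make the point about unbalanced sample counts concrete, here is a minimal sketch assuming the usual definition AF = AC / AN (alternate allele count over the number of called alleles). The counts below are hypothetical, chosen only to illustrate the effect; they are not real gnomAD values.

```python
def allele_frequency(ac: int, an: int) -> float:
    """AF = AC / AN, where AN is the number of called alleles (2 per diploid sample)."""
    if an == 0:
        raise ValueError("no called alleles at this site")
    return ac / an

# Hypothetical counts at one intergenic site (illustration only).
exome_ac, exome_an = 1, 4            # only two exome samples have a call here
genome_ac, genome_an = 300, 30000    # 15,000 genome samples have a call here

exome_af = allele_frequency(exome_ac, exome_an)
genome_af = allele_frequency(genome_ac, genome_an)
# Pooling the counts (not averaging the two AFs) weights each dataset by its AN.
pooled_af = allele_frequency(exome_ac + genome_ac, exome_an + genome_an)

print(f"exome AF  = {exome_af:.4f} (AN={exome_an})")
print(f"genome AF = {genome_af:.4f} (AN={genome_an})")
print(f"pooled AF = {pooled_af:.4f}")
```

With so few called alleles on the exome side, a single observation moves the exome AF a lot, so it can be worth checking AN before trusting an AF filter.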
10 weeks ago by
Jorge Amigo12k
Santiago de Compostela, Spain
Jorge Amigo12k wrote:

Quick answer: you're dealing with different reference genome builds, so the reference alleles can differ, and therefore the frequencies can differ too.

If you check rs73976541 on gnomAD v2.1.1 you're using GRCh37, where the change from the reference C to the alternative T occurs at a global frequency of ~0.005 in the exomes and ~0.02 in the genomes analyzed.

If you check rs73976541 on gnomAD v3.1 you're using GRCh38, where the change from the reference T to the alternative C occurs at a global frequency of ~0.9795 in the genomes analyzed.

So moving from GRCh37 to GRCh38 changed the reference allele for rs73976541! This is unexpected for the untrained eye, but it can happen, as you can see for yourself on dbSNP. The take-home message is that a frequency refers to a particular allele, not to the variant as a whole.

modified 10 weeks ago • written 10 weeks ago by Jorge Amigo12k
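
The take-home message above can be sketched in a few lines: compare frequencies per allele, not per rsID. This is only an illustration, assuming a biallelic site (so the reference allele frequency is 1 minus the alternative allele frequency). The function name `frequency_of` and the dictionary layout are hypothetical, and the values are the approximate genome frequencies quoted in the answer.

```python
from typing import Optional

def frequency_of(allele: str, ref: str, alt: str, alt_af: float) -> Optional[float]:
    """Frequency of `allele` at a biallelic site, given the reported ALT frequency."""
    if allele == alt:
        return alt_af
    if allele == ref:
        return 1.0 - alt_af  # assumption: biallelic site, so REF AF = 1 - ALT AF
    return None  # allele is neither REF nor ALT in this record

# Approximate genome values for rs73976541 as quoted in the answer.
grch37 = {"ref": "C", "alt": "T", "alt_af": 0.02}     # gnomAD v2.1.1 genomes
grch38 = {"ref": "T", "alt": "C", "alt_af": 0.9795}   # gnomAD v3.1 genomes

t_in_v2 = frequency_of("T", **grch37)
t_in_v3 = frequency_of("T", **grch38)

print(f"T allele, GRCh37 record: {t_in_v2:.4f}")
print(f"T allele, GRCh38 record: {t_in_v3:.4f}")
```

Once both records are expressed for the same allele (T here), the two builds agree closely, even though the reported ALT frequencies look opposite.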