11 months ago
Eugene A ▴ 120

Hi, I accidentally noticed that an SNP with an allele frequency close to 1 in gnomAD exomes does not show up in gnomAD genomes, and this turns out to have consequences for automatic ACMG annotation.

The SNP is https://www.ncbi.nlm.nih.gov/snp/rs142200057. The reference allele has almost zero frequency in every database, but OK, I can live with that. BUT the thing is that the reference allele was never observed in gnomAD v3. When I annotate with an annotator that uses gnomAD v3 (and not v2), it tells me that this SNP is not found in gnomAD v3, so it appears to be extremely rare and therefore might be pathogenic (which is absolutely not the case).

My question is: why does rs142200057 not show up in gnomAD v3? Could it be that, because Ref was never seen at that position, the site was filtered out by some pipeline bug? When Ref has at least some frequency (gnomAD v2 exomes), everything is represented correctly.


gnomAD v3 is Whole Genome Sequencing only - it does not contain exomes. The WGS data used in v2.1.1 is realigned to GRCh38 in v3. A variant not seen in v2 might not appear in v3.

gnomAD exomes is the dataset obtained from whole exome sequencing. I think there's no overlap between gnomAD exomes and gnomAD genomes. There are no v3 exomes yet.


Hi, thanks for the comment. I do understand that gnomAD v3 does not contain exomes. BUT it must contain a variant whose alt allele has a frequency of AF=9.99979e-01 in gnomAD v2 (at least I do not see any reason why not).


True - that is indeed very odd. I apologize for the delay in my response. You should email the MacArthur lab (or the gnomAD Contact address if there's one) with your question.


Hi, yes, I did drop an email to the gnomAD contact address a couple of days ago, but I'm still waiting for a response.

10 months ago
Eugene A ▴ 120

So I eventually figured out the answer, with help from GitHub (https://github.com/broadinstitute/gnomad-browser/issues/602), and will briefly explain it here:

There is a sequence change between GRCh37 and GRCh38 in the position of interest:

AGGCTT - GRCh37
AGGGCTT - GRCh38


Therefore what was an alt in 37 (with a frequency of 9.99979e-01) became the Ref in 38, with the same frequency close to 1. These frequencies are correctly represented in gnomAD exomes v2 AND gnomAD genomes v3 (the alt variant is simply so rare that it is absent there).
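
A minimal Python sketch of this relabeling for a biallelic site. The helper name `swap_ref_alt` is made up for illustration and is not part of any gnomAD tooling, and the VCF-style alleles `A` / `AG` are an assumed representation of the extra G. It only shows that when the two alleles trade roles, the alt frequency becomes one minus the old value.

```python
def swap_ref_alt(ref, alt, alt_af):
    """Relabel a biallelic site after the reference allele changed.

    Hypothetical helper: the old alt becomes the new ref, the old ref
    becomes the new alt, and the alt frequency is complemented.
    """
    if not 0.0 <= alt_af <= 1.0:
        raise ValueError("allele frequency must be between 0 and 1")
    return alt, ref, 1.0 - alt_af


# Assumed GRCh37 representation: ref "A", alt "AG", AF taken from the post.
new_ref, new_alt, new_af = swap_ref_alt("A", "AG", 9.99979e-01)
print(new_ref, new_alt, f"{new_af:.1e}")
```

On GRCh38 the former alt is the reference, so the frequency that matters for rarity filtering is the small complement, not 9.99979e-01.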

BUT the problem begins with the liftover of the .vcf file from build 37 to build 38. What was GGG (a completely normal genotype in 37) becomes GGGG in 38 (which was actually never seen, as far as I understand). Therefore the gnomAD exomes v2 liftover contains wrong information, saying that this position harbours the GGGG genotype with a frequency of 9.99979e-01.
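
The failure mode can be reproduced on the two short sequences quoted above. This is only a sketch: the function names are hypothetical, the left-aligned `A` -> `AG` representation of the extra G is an assumption, and real liftover tools work on whole genomes rather than toy strings. It checks whether the alt haplotype on the old build is identical to the new reference, and shows what a coordinate-only liftover would claim instead.

```python
def apply_variant(context, offset, ref, alt):
    """Return the haplotype obtained by replacing ref with alt at a 0-based offset."""
    if context[offset:offset + len(ref)] != ref:
        raise ValueError("REF does not match the reference context")
    return context[:offset] + alt + context[offset + len(ref):]


def alt_is_new_reference(old_context, new_context, offset, ref, alt):
    """True if the old-build alt haplotype equals the new-build reference."""
    return apply_variant(old_context, offset, ref, alt) == new_context


GRCH37 = "AGGCTT"   # sequence as quoted in the post
GRCH38 = "AGGGCTT"  # sequence as quoted in the post

# Assumed VCF-style representation of the extra G on GRCh37.
OFFSET, REF, ALT = 0, "A", "AG"

# The old alt haplotype is exactly the GRCh38 reference.
swapped = alt_is_new_reference(GRCH37, GRCH38, OFFSET, REF, ALT)

# A coordinate-only liftover keeps REF/ALT and applies them to GRCh38,
# which describes a haplotype with one G too many.
naive_lift = apply_variant(GRCH38, OFFSET, REF, ALT)

print(swapped)
print(naive_lift)
```

A site where `swapped` is true should not keep its old alt frequency after liftover; the record needs its alleles and frequency relabeled, or it should be dropped.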

And I believe that this is the reason why the current ClinVar record (https://www.ncbi.nlm.nih.gov/snp/rs142200057) reports a frequency of GGG close to 0, when it actually has to be close to 1.

Hope I did not mess it up :) Lesson: the gnomAD v2 liftover is a bad thing; lifted-over frequencies might be misleading.