I am using Ensembl VEP (command line) to annotate a VCF. I am specifically looking for gnomAD allele frequencies, which is fairly straightforward to do, technically speaking. However, the data looks off in some cases.
For example, when I pass in:
10 69408929 COSM3751912 A T . . GENE=TACR2;STRAND=-;CDS=c.734T>A;AA=p.M245K
I get the VEP output:
COSM3751912 10:69408929 T ENSG00000075073 ENST00000373306 Transcript missense_variant 1278 734 245 M/K aTg/aAg rs55953810 MODERATE - -1 - TACR2 HGNC HGNC:11527 YES ENSP00000362403 P21452 - UPI0000061EE3 - 3/5 - Gene3D:1.20.1070.10,Pfam_domain:PF00001,PROSITE_profiles:PS50262,hmmpanther:PTHR43919,hmmpanther:PTHR43919:SF4,SMART_domains:SM01381,Superfamily_domains:SSF81321,Conserved_Domains:cd16004 1 1 1 1 1 0.9999 0.9999 0.9999 1 0.9999 1 0.9999 1 1 1 gnomAD_ASJ,gnomAD_FIN,gnomAD_OTH,gnomAD_SAS,AFR,AMR,EAS,EUR,SAS - - - - - - -
Sifting through that, you can see that the allele frequency reported for gnomAD_AF is 0.9999. This seems odd to me. How could this variant be a COSMIC (cancer database) missense variant with MODERATE impact and still have a frequency of 99.99%? I'm lost on how to interpret this.
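Since picking the right column out of that row by eye is error-prone, here is a minimal Python sketch that pairs each value with its column name. It assumes the output was written with VEP's `--tab` option and still contains its `#Uploaded_variation` header line, that missing values are written as `-`, and that each frequency column holds a single number. The function names and the sample text are hypothetical; the sample is a trimmed stand-in that keeps only a few columns, not the full output above.

```python
import io


def read_vep_tab(handle):
    """Yield one dict per annotation row, keyed by the header's column names."""
    header = None
    for line in handle:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and "##" metadata lines
        if line.startswith("#"):
            # the single "#" line carries the column names
            header = line.lstrip("#").split("\t")
            continue
        if header is None:
            raise ValueError("no header line found before the first data row")
        yield dict(zip(header, line.split("\t")))


def allele_frequencies(row):
    """Return {column: float} for frequency columns that actually hold a value."""
    freqs = {}
    for name, value in row.items():
        # assumption: frequency columns are named "AF" or end in "_AF"
        if (name == "AF" or name.endswith("_AF")) and value != "-":
            freqs[name] = float(value)
    return freqs


# Trimmed, illustrative sample (hypothetical column subset, values from the post).
sample = (
    "## ENSEMBL VARIANT EFFECT PREDICTOR\n"
    "#Uploaded_variation\tLocation\tAllele\tConsequence\tgnomAD_AF\n"
    "COSM3751912\t10:69408929\tT\tmissense_variant\t0.9999\n"
)

for row in read_vep_tab(io.StringIO(sample)):
    print(row["Uploaded_variation"], row["Allele"], allele_frequencies(row))
# prints: COSM3751912 T {'gnomAD_AF': 0.9999}
```

Printing the Allele column next to the frequencies makes it explicit which allele each number refers to (here T, the ALT allele in my VCF line).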
Maybe I am misunderstanding how gnomAD reports allele frequencies, hence this question.
Does anyone know how gnomAD allele frequencies (as output by Ensembl's VEP) should be interpreted?