Hi - when using VEP to get allele frequency information for a list of SNPs, I ask for the gnomad_AF column. I have noticed, however, that around a third of my list returns an NA value. How should I interpret this? Does it mean that, across all the frequency consortia combined over multiple continents, this SNP has never been seen (i.e. it is ultra rare)? Some of the SNPs do appear in dbSNP, however, so I guess the exomes/genomes used for gnomAD might not coincide with those reported to dbSNP? This seems about right to me - any lab can report to dbSNP, but only a subset of the population ends up in consortia such as 100k Genomes and ExAC.
Could anyone provide a definitive answer on this? Can I count a SNP as "ultra rare" if it does not appear in gnomAD?
I think it does count as ultra rare if gnomAD doesn't have it. As long as the annotation tool handles multi-allelic sites correctly when comparing alleles (as in, you're sure the variant really is not in gnomAD), it can be labelled ultra-rare.
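To illustrate why multi-allelic sites matter here, below is a minimal Python sketch of splitting a multi-allelic record into one key per ALT allele and trimming shared bases before looking it up. The function name `split_and_trim` and the tuple key format are hypothetical, made up for this example; this is not how VEP does its matching internally, and it does not left-align indels against a reference the way a full normalisation tool would.

```python
def split_and_trim(chrom, pos, ref, alts):
    """Return one trimmed (chrom, pos, ref, alt) key per ALT allele.

    `alts` is a comma-separated ALT string as found in a VCF record.
    Hypothetical helper for illustration only: it trims shared bases
    but does not left-align against a reference genome.
    """
    keys = []
    for alt in alts.split(","):
        r, a, p = ref, alt, pos
        # Trim shared trailing bases, keeping at least one base per allele.
        while len(r) > 1 and len(a) > 1 and r[-1] == a[-1]:
            r, a = r[:-1], a[:-1]
        # Trim shared leading bases, advancing the position each time.
        while len(r) > 1 and len(a) > 1 and r[0] == a[0]:
            r, a, p = r[1:], a[1:], p + 1
        keys.append((chrom, p, r, a))
    return keys


# A site with two ALT alleles: one is really a SNV two bases downstream,
# the other is a two-base deletion.
keys = split_and_trim("1", 100, "CAG", "CAT,C")
```

Without this kind of step, a SNV written as `CAG>CAT` at position 100 would not match a database entry stored as `G>T` at position 102, and the variant would look absent when it is not.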
I'm not 100% sure, but this question doesn't have an objective answer IMO. "Ultra-rare" is not a globally accepted label, so this is the best we can do to exclude all known variants.
If you're satisfied with my answer, please go ahead and accept it so it's helpful for others with the same question. Thank you!
It may depend on the ethnicity of the sample(s). An Australian Indigenous person will show many apparently novel variants, although most will be ethnicity-specific, because this population is not represented in gnomAD. In this case, the variants are absent from gnomAD but not necessarily 'rare' in the Australian Indigenous population.
Have you checked whether the variants with NA values are also absent from the gnomAD browser? In other words, could there be a problem with VEP's handling of gnomAD data in some instances?
If the variants were called from massively parallel sequencing data, have you looked at some of the variants with NA values in IGV or a similar tool to see whether the alignments look correct? Some variants may turn out to be sequencing or alignment errors, and these would likely be absent from gnomAD. This also relates to how you filtered your variants to exclude false positives.
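For the manual checks suggested above (gnomAD browser, IGV), it helps to first pull out the list of variants that have no gnomAD frequency. Here is a rough Python sketch, assuming VEP tab-delimited output where metadata lines start with `##` and the header line starts with `#Uploaded_variation`. The function name `missing_af_variants` is hypothetical, the column name is a parameter because it differs between VEP versions and options, and the set of strings treated as "missing" is a guess that you should adjust to your own file.

```python
import csv
import io

# Strings treated as "no frequency reported". This set is an assumption;
# check what your own output file actually contains.
MISSING = {"", "-", "NA", "."}


def missing_af_variants(tab_text, af_column="gnomAD_AF"):
    """Return the unique variant IDs whose `af_column` value is missing.

    `tab_text` is the content of a tab-delimited annotation file.
    A variant is only reported if every row for it lacks a frequency.
    """
    lines = [ln for ln in tab_text.splitlines() if ln and not ln.startswith("##")]
    if not lines:
        return []
    lines[0] = lines[0].lstrip("#")  # header line begins with a single '#'
    reader = csv.DictReader(io.StringIO("\n".join(lines)), delimiter="\t")
    if af_column not in (reader.fieldnames or []):
        raise ValueError(f"column {af_column!r} not found in header")

    has_af = {}  # variant ID -> True if any row carries a frequency
    for row in reader:
        vid = row["Uploaded_variation"]
        value = (row.get(af_column) or "").strip()
        has_af[vid] = has_af.get(vid, False) or value not in MISSING
    # dicts preserve insertion order, so IDs come back in file order
    return [vid for vid, found in has_af.items() if not found]
```

The resulting IDs can then be spot-checked by hand. A variant that is missing from the VEP output but present in the gnomAD browser points to an annotation problem, not to a rare variant.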
Thanks very much Ram - similar thinking.