How to interpret variant allele frequency?
23 months ago
I am a bioinformatics student, completely new to NGS analysis (and also to biological reasoning). I would like to understand what variant allele frequency (VAF) is all about.

Why is it important to know whether the VAF is high enough? What VAF can be considered high enough?

Is there a specific cutoff for VAF that can be used to determine whether a variant is clonal or subclonal, assuming a 90% tumor purity?

I tried googling it but could not find a satisfying answer. I appreciate your effort.

VAF MAF
Variant allele frequency (VAF) = number of variant-allele reads / (number of variant-allele reads + number of reference-allele reads). For example, with 4 variant reads and 2 reference reads: 4 / (4 + 2) ≈ 0.67.
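To make the arithmetic concrete, here is a minimal Python sketch of that ratio. The function name `variant_allele_fraction` and its arguments are made up for illustration; in practice the read counts would come from your variant caller's output.

```python
def variant_allele_fraction(alt_reads: int, ref_reads: int) -> float:
    """Fraction of reads at a site that carry the variant allele."""
    depth = alt_reads + ref_reads
    if depth == 0:
        # No coverage: the VAF is undefined rather than zero.
        raise ValueError("no reads cover this site")
    return alt_reads / depth


# The 4 variant reads / 2 reference reads example from above.
print(round(variant_allele_fraction(4, 2), 3))  # 0.667
```

With only 6 reads the estimate is very noisy, so the same VAF means much more at high depth than at low depth.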

I guess VAF > 0.5 is considered clonal in a binomial distribution of VAFs.

Are your cancer cells haploid? Nice explanation BTW.

Personally, it is easier for me to think of VAF as "Variant Allele Fraction" than "Variant Allele Frequency". "Allele Frequency" is used a lot in the population context (like in ExAC/gnomAD/1000g), so thinking in terms of "What fraction of my cells (well, haplotypes) have this allele?" helps me picture it better.

I guess VAF > 0.4 is considered clonal in a binomial distribution of VAFs.

Do you have a source on this? I'd much appreciate that. We have a lot of high purity tumors (I think), because a lot of the variants we see have TVAF close to 1 - that or they're all germline variants :-)

Here I read

clonal genes usually have mean allele frequency around ~50% assuming pure sample

http://bioconductor.org/packages/devel/bioc/vignettes/maftools/inst/doc/maftools.html

Thank you. maftools is pretty amazing!

23 months ago
d-cameron

Why is it important to know whether the VAF is high enough? What VAF can be considered high enough?

The lower the VAF, the more likely the variant is to be a false positive (since it is supported by fewer reads).

Is there a specific cutoff for VAF that can be used to determine whether a variant is clonal or subclonal, assuming a 90% tumor purity?

The cut-off depends on both the purity and the ploidy of the tumour sample. Aneuploidy is extremely common in adult cancers, so making a diploid assumption is a very bad idea. Tools such as ascatNGS (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6097604/) will jointly fit VAF and copy number to determine the purity and ploidy of the dominant clone. Variants which do not fit any of the expected VAFs (e.g. for CN=3 the expected VAFs are 0, 1/3, 2/3, 1) are likely to be subclonal (or noise).
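As a rough illustration of where those expected VAFs come from, here is a small Python sketch. It is not how ascatNGS works internally; it only applies the standard mixing formula for a clonal somatic variant present on m of the tumour's CN copies, expected VAF = purity * m / (purity * CN + (1 - purity) * normal CN). The function name `expected_clonal_vafs` is hypothetical, and the sketch assumes the contaminating normal cells are diploid, do not carry the variant, and that there is no sampling noise.

```python
def expected_clonal_vafs(purity: float, tumour_cn: int, normal_cn: int = 2) -> list[float]:
    """Expected VAFs of a clonal somatic variant on m = 0..tumour_cn copies.

    Assumes the variant is absent from normal cells and that normal
    cells have copy number `normal_cn` at this locus (diploid by default).
    """
    if not 0 < purity <= 1:
        raise ValueError("purity must be in (0, 1]")
    if tumour_cn < 1:
        raise ValueError("tumour_cn must be at least 1")
    # Average number of allele copies per cell in the mixed sample.
    total_copies = purity * tumour_cn + (1 - purity) * normal_cn
    return [purity * m / total_copies for m in range(tumour_cn + 1)]


# Pure sample, CN=3: the 0, 1/3, 2/3, 1 series mentioned above.
print([round(v, 3) for v in expected_clonal_vafs(1.0, 3)])  # [0.0, 0.333, 0.667, 1.0]

# 90% purity, diploid region: a clonal heterozygous variant sits near 0.45, not 0.5.
print([round(v, 3) for v in expected_clonal_vafs(0.9, 2)])  # [0.0, 0.45, 0.9]
```

Observed VAFs scatter around these values because of read sampling, so a variant is only a subclonal candidate when it sits clearly away from every expected level for its local copy number.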

This area of bioinformatics has been extensively studied and there are many tools in this space.

