Hello everyone,
I am trying to better understand how variant allele frequency (VAF) relates to sequencing depth, and wanted to check with everyone here that my analysis is sound. I have seen that some companies will call somatic variants from the tumor sample alone, without a paired germline sample. My assumption is that they call variants and then filter against COSMIC and 1000G or dbSNP to separate likely somatic variants from known germline polymorphisms. I have also heard of people treating a VAF of <40% as evidence of a somatic mutation.
With this in mind, I took a germline control sample and ran it through MuTect2 with only a panel of normals (PoN) as the "control". I then plotted VAF against read depth for the variants passing the MuTect2 filters (plot attached). It looks to me as though the VAF can vary substantially until the AD is >100. This would call into question any VAF-based analysis that did not have at least 100x depth at a locus.
Does my thinking line up with what others have experienced? Is the deviation from 50% or 100% VAF in germline samples intrinsic to particular loci because of experimental issues, such as the quality of the library capture, or is it more of a sampling issue?
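To put a rough number on the "sampling issue" side of the question, here is a minimal Python sketch. It assumes reads at a heterozygous germline site are drawn independently with a true allele fraction of exactly 0.5 (a plain binomial model). That is my simplification, not something MuTect2 reports, and it ignores capture bias, reference mapping bias, and sequencing error. The function names `vaf_sd` and `prob_vaf_below` are made up for this illustration.

```python
import math


def vaf_sd(depth, true_af=0.5):
    """Standard deviation of the observed VAF under pure binomial sampling."""
    return math.sqrt(true_af * (1.0 - true_af) / depth)


def prob_vaf_below(depth, threshold=0.40, true_af=0.5):
    """Exact binomial probability that the observed VAF falls below threshold.

    Sums P(k alt reads) over every k with k / depth < threshold.
    Assumption: reads are independent draws with probability true_af.
    """
    total = 0.0
    for k in range(depth + 1):
        if k / depth < threshold:
            total += (
                math.comb(depth, k)
                * true_af ** k
                * (1.0 - true_af) ** (depth - k)
            )
    return total


if __name__ == "__main__":
    # Depths chosen only for illustration.
    for depth in (10, 20, 50, 100, 500):
        print(
            f"depth={depth:4d}  "
            f"sd(VAF)={vaf_sd(depth):.3f}  "
            f"P(VAF<0.40 | het germline)={prob_vaf_below(depth):.4f}"
        )
```

Under this model the standard deviation of the VAF at 100x is 0.05, so a true 50% heterozygous site would be expected to land between roughly 40% and 60% most of the time from sampling alone. Scatter in the plot that is wider than this at a given depth would point more toward locus-specific experimental effects than toward sampling.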