4.2 years ago
zizigolu
Hi
I have calculated the variant allele frequency (VAF) separately for the SNVs and INDELs called by Strelka. To get the VAF, I used
VAF = Tumour Variant Allele Count / Tumour Read Count
For some positions in the genome I get VAF > 30. Are such large VAFs normal, or am I doing something wrong?
I assumed VAFs should be in the range 0 < VAF < 1.
Could you help me figure out what is going on?
Thanks
Yes. I don't see how VAF can be >1. There's something wrong in your calculations.
Did you follow the instructions for calculating VAFs suggested in one of your previous posts (i.e. as in the Strelka manual)? I guess not, given this result.
Actually, somebody wrote a script for me, assuming a Strelka .vcf for SNVs
No wonder you're running into problems you can't explain. You can execute each statement line by line and see where the logic goes awry, or you can contact the author and hope they have the time to explain what could be going wrong. I'd recommend the former approach.
Please read the Strelka manual on calculating AFs. The calculation is different for indels and SNPs, so running one script (the one below) on both is not going to work. For indels you need the TIR and TAR values, while for SNPs you have to extract something different; I do not remember what. I figured it out back in the day entirely by reading the manual, so I am sure you can do that as well.
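For orientation, here is a minimal Python sketch of the calculation as I understand it from the somatic section of the Strelka user guide. Treat it as an assumption to check against the manual, not as the official method. For SNVs, the reference and alternate counts are the first (tier 1) value of the per-base fields AU/CU/GU/TU that match REF and ALT. For indels, they are the first value of TAR and TIR. In both cases VAF = alt / (ref + alt), which cannot exceed 1. The function name `strelka_vaf` and the example records are made up for illustration, and the sketch assumes a single ALT allele per record.

```python
def strelka_vaf(ref, alt, format_keys, sample_values):
    """Tier-1 VAF for one sample column of a Strelka somatic VCF record.

    ref, alt       -- REF and ALT columns (single ALT allele assumed)
    format_keys    -- FORMAT column, e.g. "DP:FDP:SDP:SUBDP:AU:CU:GU:TU"
    sample_values  -- the TUMOR column for the same record
    Returns None when there are no supporting tier-1 reads at all.
    """
    fields = dict(zip(format_keys.split(":"), sample_values.split(":")))

    def tier1(key):
        # Count fields are stored as "tier1,tier2"; keep the tier-1 value.
        return int(fields[key].split(",")[0])

    if "TAR" in fields and "TIR" in fields:
        # Indel record: TAR = reads supporting the reference allele,
        # TIR = reads supporting the indel.
        ref_count, alt_count = tier1("TAR"), tier1("TIR")
    else:
        # SNV record: per-base counts live in AU, CU, GU and TU.
        ref_count, alt_count = tier1(ref + "U"), tier1(alt + "U")

    total = ref_count + alt_count
    return alt_count / total if total else None


# Made-up example records, for illustration only.
snv_vaf = strelka_vaf("A", "G",
                      "DP:FDP:SDP:SUBDP:AU:CU:GU:TU",
                      "50:0:0:0:30,31:0,0:20,20:0,0")
indel_vaf = strelka_vaf("AT", "A",
                        "DP:DP2:TAR:TIR:TOR:DP50:FDP50:SUBDP50:BCN50",
                        "40:40:30,31:10,10:0,0:40.0:0.0:0.0:0.0")
print(snv_vaf, indel_vaf)
```

Because the denominator is the sum of the same two counts that feed the numerator, a result above 1 from this formula would point to the wrong fields being read rather than to a real biological signal.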
For INDELs he has written
He believes that
In cases where the frequency is above 100% or 1, this is likely an error where there is more information in support of the variant than there is read depth???. In these cases, you could consider the frequency to be around 1.
Sorry @ATpoint, I googled but failed to find full documentation explaining what each part of a Strelka VCF means, especially the INFO column. Could you please share such documentation if you have found it?