Hello. I'm new to bioinformatics and am working with NGS data. I'm currently trying to detect SNP variants that cause multiple sclerosis.
I selected some genes for targeted resequencing and obtained NGS data. I ran the data through Qiagen's NGS variant calling pipeline and got a VCF file for each sample.
I checked them and realized that these VCF files contain SNPs with a very low variant minor allele frequency (VMF). I want to work with germline variants, so I'm trying to filter out these low-VMF variants.
I'm using vcftools but I can't solve this problem. I know I have to study more, but time is running out, so could any kind person help me, please? Thank you.
If you have per-sample VCFs, the only possible minor AF values for bi-allelic variant loci are 0.5 and 1. What do you mean by "very low minor allele frequency"?
Thank you for your comment! I think what you said is right for germline variants, but here is an example from my VCF file: my variant has a very low VMF/VF (0.0740741). Does this mean it is a somatic variant? In FORMAT, VF is defined as "Variant UMI allele frequency, same as VMF," so I thought I had to eliminate this variant. Am I right?
You did not mention UMIs at all in your initial post. I think UMIs are only involved in single-cell sequencing, and I don't really know single-cell technologies. Maybe experts on the technology can chime in.
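For the filtering step itself, here is a minimal sketch in plain Python, assuming each VCF holds a single sample and that VF appears as a per-sample FORMAT key holding a fraction between 0 and 1, as in the value quoted above. The function names, the example records, and the 0.3 cutoff are all hypothetical and only for illustration; a sensible threshold depends on your depth and panel, so please check it against your own data rather than taking it from here.

```python
def sample_vf(format_col, sample_col, key="VF"):
    """Return the value of `key` for one sample as a float, or None if absent."""
    keys = format_col.split(":")
    values = sample_col.split(":")
    if key not in keys:
        return None
    idx = keys.index(key)
    if idx >= len(values):
        return None
    # Assumption: for multi-allelic records only the first value is used.
    raw = values[idx].split(",")[0]
    try:
        return float(raw)
    except ValueError:
        return None  # e.g. "." for a missing value


def filter_vcf_lines(lines, min_vf=0.3, key="VF"):
    """Keep header lines and records whose first sample has key >= min_vf.

    min_vf=0.3 is an illustrative cutoff, not a recommended value.
    """
    kept = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("#"):
            kept.append(line)  # header lines pass through unchanged
            continue
        fields = line.split("\t")
        if len(fields) < 10:
            continue  # no FORMAT/sample columns, so nothing to test
        vf = sample_vf(fields[8], fields[9], key)
        if vf is not None and vf >= min_vf:
            kept.append(line)
    return kept


# Hypothetical records, made up for this example.
example = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1",
    "chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT:VF\t0/1:0.0740741",
    "chr1\t200\t.\tC\tT\t60\tPASS\t.\tGT:VF\t0/1:0.48",
]
for record in filter_vcf_lines(example):
    print(record)
```

To use it on a real file, you could read the lines with `open(path)` and write the returned list to a new file. Records without a VF value are dropped in this sketch, which may or may not be what you want.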