Question: How do I choose variant quality filters for a VCF file?
James Reeve wrote (2.2 years ago):


I'm a Master's student doing bioinformatics for the first time. I have generated a .vcf file of SNPs from whole-genome sequencing data using the GATK pipeline. However, I'm working with a non-model organism (Aulorhynchus flavidus), so I don't have the set of known, validated SNPs that GATK's VQSR needs. My alternative is to hard-filter the variants to remove lower-quality SNPs, but I'm not sure how stringently to set my filters.


  • What is your approach to choosing the thresholds for hard filtering?
  • How many SNPs should I have to minimize noise (i.e. errors) and maximize signal?
  • Can you point out any good resources on filtering variants?

I know this is an open-ended question and that any answer will depend on the circumstances. However, without experience, I'm at a loss as to how to start. Any advice or references would help a lot.

snp variant filter • 2.6k views


Have a look at these two documents. For me they were a good starting point:

fin swimmer

written 2.2 years ago by finswimmer
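
To make the idea of hard filtering a bit more concrete, here is a minimal Python sketch that flags SNP records whose INFO annotations fall outside a set of cutoffs. It is not taken from the documents mentioned above. The cutoffs (QD, FS, SOR, MQ, MQRankSum, ReadPosRankSum) are the generic starting values commonly quoted for GATK SNP hard filtering and are an assumption here; they would normally be adjusted after looking at the distribution of each annotation in your own data. The names `SNP_FILTERS`, `parse_info` and `failed_filters` are hypothetical helpers made up for this illustration.

```python
# Assumed starting cutoffs for SNP hard filtering; tune them for your dataset.
# Each entry maps an INFO annotation to (comparison, cutoff).
# A record fails on an annotation when "value <comparison> cutoff" is true.
SNP_FILTERS = {
    "QD": ("lt", 2.0),
    "FS": ("gt", 60.0),
    "SOR": ("gt", 3.0),
    "MQ": ("lt", 40.0),
    "MQRankSum": ("lt", -12.5),
    "ReadPosRankSum": ("lt", -8.0),
}


def parse_info(info_field):
    """Turn a VCF INFO string such as 'QD=1.5;FS=3.2;DB' into a dict."""
    info = {}
    for item in info_field.split(";"):
        key, sep, value = item.partition("=")
        # Flag-style entries (no '=') are stored as True.
        info[key] = value if sep else True
    return info


def failed_filters(vcf_line, filters=SNP_FILTERS):
    """Return the annotation names on which one VCF data line fails."""
    fields = vcf_line.rstrip("\n").split("\t")
    info = parse_info(fields[7])  # INFO is the 8th column of a VCF data line
    failed = []
    for name, (comparison, cutoff) in filters.items():
        raw = info.get(name)
        if not isinstance(raw, str):
            continue  # annotation missing or a bare flag: do not fail on it
        try:
            value = float(raw)
        except ValueError:
            continue  # non-numeric value (e.g. '.'): skip this annotation
        if comparison == "lt" and value < cutoff:
            failed.append(name)
        elif comparison == "gt" and value > cutoff:
            failed.append(name)
    return failed


if __name__ == "__main__":
    example = "chr1\t100\t.\tA\tG\t50\t.\tQD=1.5;FS=70.0;MQ=60.0"
    print(failed_filters(example))
```

A record with an empty result passes every filter. Annotations that are absent from a record are skipped rather than treated as failures, which is a design choice you may want to revisit for your own data.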