I am analyzing microbial genomes and need to find out which variants are present in my test samples but not in my control. I called variants in both using the bcftools mpileup + call pipeline. Now I need to do the following:
1) Remove variants present in the control from my test samples.
2) Filter variants.
Can someone tell me how to do these? What parameters should I use for filtering? Are there good references out there for this?
Thanks in advance.
Thanks for your answer. I tried searching on Google for 'parameters for filtering microbial variants', hoping to find filtering values based on the INFO field of the VCF file, but was usually directed to papers or tutorials that discuss either sources of error or different filtering software. I figured this is a common enough problem and that maybe I was not searching with the right keywords, so I decided to ask the more experienced users here.
I will definitely look into SnpSift; it seems like a cool tool. Regarding the variant filtering, it turns out that the GATK-prescribed values work well even for microbial genomes (a colleague has used them extensively with good results), so I will be using those.
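For anyone landing here with the same question, the two steps (drop control variants, then filter) can be sketched in a few lines of plain Python. This is only a minimal illustration of the logic, not what was actually run in this thread: it assumes uncompressed VCF text, matches variants on an exact CHROM/POS/REF/ALT key, and uses a made-up QUAL cutoff (`MIN_QUAL = 30`) that is a placeholder rather than a recommended value. The function names and the toy records are hypothetical. Dedicated tools such as bcftools isec handle compressed, indexed files and are probably the better choice for real data.

```python
import io

MIN_QUAL = 30.0  # placeholder threshold for illustration only, not a recommendation


def read_vcf_records(handle):
    """Yield (key, fields) for each data line of an uncompressed VCF."""
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue  # skip header lines and blank lines
        fields = line.rstrip("\n").split("\t")
        chrom, pos, _id, ref, alt = fields[:5]
        # Exact-match key; multi-allelic ALT strings are compared as-is.
        yield (chrom, int(pos), ref, alt), fields


def subtract_control(test_handle, control_handle):
    """Return test records whose key does not occur in the control."""
    control_keys = {key for key, _ in read_vcf_records(control_handle)}
    return [fields for key, fields in read_vcf_records(test_handle)
            if key not in control_keys]


def filter_by_qual(records, min_qual):
    """Keep records with a numeric QUAL (column 6) of at least min_qual."""
    kept = []
    for fields in records:
        qual = fields[5]
        if qual != "." and float(qual) >= min_qual:
            kept.append(fields)
    return kept


# Toy data (hypothetical records, tab-separated as in a real VCF).
HEADER = "##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
CONTROL_VCF = HEADER + "chr1\t100\t.\tA\tG\t50\t.\tDP=20\n"
TEST_VCF = HEADER + (
    "chr1\t100\t.\tA\tG\t60\t.\tDP=25\n"   # also in control, so removed
    "chr1\t250\t.\tC\tT\t45\t.\tDP=18\n"   # test-only, passes QUAL
    "chr1\t400\t.\tG\tA\t12\t.\tDP=5\n"    # test-only, fails QUAL
)

test_only = subtract_control(io.StringIO(TEST_VCF), io.StringIO(CONTROL_VCF))
passing = filter_by_qual(test_only, MIN_QUAL)

for fields in passing:
    print("\t".join(fields))
```

With the toy data, only the chr1:250 record survives both steps. Filtering on INFO fields such as DP would follow the same pattern, parsing column 8 instead of column 6.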