How can I filter a VCF file on minimum genotype depth and genotype quality for each sample?
I am looking for a way to filter variants from a VCF file by checking that all samples at a site pass two criteria:
sample.DP > 10
sample.GQ > 15
I thought I could do this using bcftools
bcftools view -i 'MIN(FMT/DP>10) && MIN(FMT/GQ>15)' my.vcf.gz
Somehow this include expression does not seem to be applied.
My output still contains lots of variants with genotypes like:
GT:AO:DP:GQ:PL:QA:QR:RO
0|0:0:16:3:0,48,502:0:554:16 (=depth 16, quality 3)
1|1:1:1:3:21,3,0:37:0:0 (=depth 1, quality 3)
I tried splitting the include expressions across two bcftools commands:
bcftools view -i 'MIN(FMT/DP>10)' my.vcf.gz | bcftools view -i 'MIN(FMT/GQ>15)'
And I tried without the FMT prefix for the genotype quality
bcftools view -i 'MIN(FMT/DP>10) && MIN(GQ>15)' my.vcf.gz
Still I get variants back with genotypes that don't match the criteria.
Am I misunderstanding the bcftools include expression documentation?
https://samtools.github.io/bcftools/bcftools.html#expressions
Or is there another way that I can achieve this filtering step?
Cool, this seems to do the job. Thanks.
On closer look, it's not exactly what I was looking for. This command sets the genotypes that don't match the criteria to ./. whereas I am looking to remove the whole variant if not all genotypes pass the criteria.
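To make the intended rule concrete (drop the whole site unless every sample has DP > 10 and GQ > 15), here is a minimal sketch in plain Python. It is not a bcftools feature; the function names (sample_fields, passes, keep_site, filter_vcf) are made up for illustration, and treating missing or non-numeric DP/GQ values as failures is my own assumption. In bcftools itself, the comparison probably belongs outside the function, along the lines of 'MIN(FMT/DP)>10 & MIN(FMT/GQ)>15', but I have not tested that here.

```python
def sample_fields(format_col, sample_col):
    """Map FORMAT keys to one sample's values, e.g. {'GT': '0|0', 'DP': '16'}."""
    return dict(zip(format_col.split(":"), sample_col.split(":")))


def passes(fields, min_dp=10, min_gq=15):
    """True if DP > min_dp and GQ > min_gq.

    Assumption: a missing or non-numeric DP/GQ (e.g. '.') counts as a failure.
    """
    try:
        return float(fields["DP"]) > min_dp and float(fields["GQ"]) > min_gq
    except (KeyError, ValueError):
        return False


def keep_site(vcf_line, min_dp=10, min_gq=15):
    """True only if every sample on this VCF data line passes both thresholds."""
    cols = vcf_line.rstrip("\n").split("\t")
    fmt, samples = cols[8], cols[9:]
    return all(passes(sample_fields(fmt, s), min_dp, min_gq) for s in samples)


def filter_vcf(lines, min_dp=10, min_gq=15):
    """Yield header lines unchanged and only those data lines that pass."""
    for line in lines:
        if line.startswith("#") or keep_site(line, min_dp, min_gq):
            yield line
```

The lines could come from something like gzip.open("my.vcf.gz", "rt"). With the two example genotypes from the question (DP 16 / GQ 3 and DP 1 / GQ 3), keep_site returns False, so that site would be removed.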
Does anyone know if there is a way to do this (set genotypes that don't match the criteria to ./.) in bcftools? I'd like to set low-quality genotypes to missing in a multi-sample vcf but keep the sites that contain them, and can't seem to work out how. The vcftools solution works fine, but I'm curious to know if it can be done in bcftools too.
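For the other behaviour (keep the site but set low-quality genotypes to missing), a minimal pure-Python sketch is below. The function name mask_genotypes is hypothetical, "./." assumes diploid calls (phasing is lost), and missing DP/GQ values are treated as failing, which is my assumption. On the bcftools side, the documentation describes a +setGT plugin and a --set-GTs option of bcftools filter that look like they are meant for this, but I have not tested either here.

```python
def mask_genotypes(vcf_line, min_dp=10, min_gq=15):
    """Return the VCF data line with failing genotypes replaced by './.'.

    A sample fails unless DP > min_dp and GQ > min_gq. The site itself is
    always kept; only the GT field of failing samples is changed.
    """
    cols = vcf_line.rstrip("\n").split("\t")
    keys = cols[8].split(":")
    if "GT" not in keys:
        return "\t".join(cols)  # nothing to mask without a GT field
    gt_idx = keys.index("GT")

    for i in range(9, len(cols)):
        values = cols[i].split(":")
        fields = dict(zip(keys, values))
        try:
            ok = float(fields["DP"]) > min_dp and float(fields["GQ"]) > min_gq
        except (KeyError, ValueError):
            ok = False  # assumption: missing or non-numeric values fail
        if not ok:
            values[gt_idx] = "./."  # assumes diploid genotypes
            cols[i] = ":".join(values)
    return "\t".join(cols)
```

Applied to a line containing the sample 0|0:0:16:3:0,48,502:0:554:16 from the question, the genotype becomes ./. (GQ 3 fails) while the remaining fields of that sample and the site are left in place.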
In the above script, --minGQ and --minDP take the values 15 and 10, respectively, for filtering on genotype quality and depth. Are there any criteria for choosing 15 and 10? I'd just like to know whether these are standard values to use for those flags.