Filtering a Multi-Sample VCF for variants where at least one sample meets the given condition(s)
I want to filter a multi-sample VCF for variants where at least one sample has DP > 50, AB < 0.3 and GQ > 30. I am only (somewhat) familiar with vcftools, which I don't believe can do this. Could someone suggest a tool that can?

I have defined AB as: Alt/(Ref+Alt)
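To make the condition concrete, here is a minimal Python sketch of the logic. It assumes the per-sample values have already been parsed into dicts with the hypothetical keys `DP`, `GQ`, `ref` and `alt` (the last two being read counts for the reference and alternate alleles). It only illustrates the "at least one sample" test and is not a replacement for a VCF tool.

```python
def allele_balance(ref_reads, alt_reads):
    """AB = Alt / (Ref + Alt); returns None when there are no reads."""
    total = ref_reads + alt_reads
    if total == 0:
        return None
    return alt_reads / total


def sample_passes(sample):
    """True if one sample has DP > 50, GQ > 30 and AB < 0.3."""
    ab = allele_balance(sample["ref"], sample["alt"])
    return (
        ab is not None
        and sample["DP"] > 50
        and sample["GQ"] > 30
        and ab < 0.3
    )


def keep_variant(samples):
    """Keep the variant if at least one sample meets all three conditions."""
    return any(sample_passes(s) for s in samples)


# Hypothetical values for one variant across two samples.
samples = [
    {"DP": 20, "GQ": 99, "ref": 10, "alt": 10},  # fails the DP threshold
    {"DP": 80, "GQ": 45, "ref": 64, "alt": 16},  # AB = 16/80, meets all three
]
print(keep_variant(samples))
```

Note that all three thresholds must hold for the same sample; a variant where one sample has high DP and a different sample has low AB is not kept.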

5.9 years ago

Using vcffilterjs:

 java -jar jvarkit-git/dist/vcffilterjs.jar -e 'function accept(v) { var i=0; for(i=0; i<v.getNSamples(); ++i) { var g=v.getGenotype(i); if(g.hasDP() && g.hasGQ() && g.getDP()>50 && g.getGQ()>30) return true; } return false; } accept(variant);' input.vcf

Hi Pierre,

I have edited my question due to a significant typo.

I forgot to mention that I also need to filter for variants where at least one sample has an AB < 0.3, in addition to the previously stated DP > 50 and GQ > 30. Can your tool do this? Looking at your documentation, a minimum allele balance filter can be applied but not a maximum.

Thanks again

.. && g.getAttributeAsDouble("AB",10.0) < 0.3 ...

5.8 years ago
Len Trigg ★ 1.6k

Similarly using RTG Tools:

rtg vcffilter --input in.vcf --output output.vcf --keep-expr 'SAMPLES.some(function(s) {return has(s.DP) && has(s.GQ) && has(s.AB) && s.DP>50 && s.GQ>30 && s.AB<0.3})'


If AB isn't a pre-existing field, but something you want to compute on the fly, you can do that too (allelic fraction is an example that is given in the user manual).
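As a rough illustration of what computing AB on the fly involves, the Python sketch below derives it from the per-sample `AD` field of a single VCF data line. It assumes a biallelic site whose FORMAT column includes `DP`, `GQ` and `AD` (as `ref,alt` read counts), which depends on your variant caller. The function name `line_passes` and the example record are made up, and this is not RTG's expression syntax.

```python
def line_passes(vcf_line, min_dp=50, min_gq=30, max_ab=0.3):
    """Return True if any sample has DP > min_dp, GQ > min_gq and AB < max_ab.

    Assumes a biallelic record with DP, GQ and AD (ref,alt counts) in FORMAT.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    keys = fields[8].split(":")           # FORMAT column
    for sample in fields[9:]:             # one column per sample
        values = dict(zip(keys, sample.split(":")))
        try:
            dp = int(values["DP"])
            gq = float(values["GQ"])
            ref, alt = (int(x) for x in values["AD"].split(",")[:2])
        except (KeyError, ValueError):
            continue                      # missing field or "." value
        if ref + alt == 0:
            continue                      # AB is undefined without reads
        ab = alt / (ref + alt)
        if dp > min_dp and gq > min_gq and ab < max_ab:
            return True
    return False


# Made-up record with two samples; the second has DP=80, GQ=45, AD=64,16.
record = (
    "chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT:DP:GQ:AD"
    "\t0/1:20:99:10,10\t0/1:80:45:64,16"
)
print(line_passes(record))
```

Samples with missing values are skipped rather than treated as failures of the whole variant, which mirrors the `has(...)` checks in the expression above.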