Double OR operator (||) in bcftools
11 weeks ago

In the man page for bcftools expressions, the difference between | and || is described as follows:

QUAL>10 | FMT/GQ>10 .. true for sites with QUAL>10 or a sample with GQ>10, but selects only samples with GQ>10

QUAL>10 || FMT/GQ>10 .. true for sites with QUAL>10 or a sample with GQ>10, plus selects all samples at such sites
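
To make the quoted distinction concrete, here is a toy model of the two operators, written only from the two manual lines above. It is a sketch, not how bcftools is implemented: the table, the helper names (toy_data, select_samples, passing_sites), and all values are made up for illustration.

```shell
#!/bin/sh
# Toy model of "QUAL>10 | FMT/GQ>10" versus "QUAL>10 || FMT/GQ>10".
# Hypothetical data, one line per genotype: site, QUAL, sample, GQ.
toy_data() {
cat <<'EOF'
site1 50 A 5
site1 50 B 30
site2 5 A 5
site2 5 B 30
site3 5 A 5
site3 5 B 5
EOF
}

# select_samples OP: print the "site sample" pairs selected under OP,
# where OP is "|" or "||".
select_samples() {
  toy_data | awk -v op="$1" '
    {
      site[NR] = $1; sample[NR] = $3; gq[NR] = $4
      # Site-level result: true if QUAL>10 or any sample has GQ>10.
      # This part is identical for both operators.
      if ($2 + 0 > 10 || $4 + 0 > 10) site_ok[$1] = 1
    }
    END {
      for (i = 1; i <= NR; i++) {
        if (!(site[i] in site_ok)) continue
        # "||" selects every sample at a passing site;
        # "|" selects only the samples that satisfy GQ>10.
        if (op == "||" || gq[i] + 0 > 10) print site[i], sample[i]
      }
    }'
}

# passing_sites OP: the distinct sites that are true under OP.
passing_sites() {
  select_samples "$1" | cut -d' ' -f1 | sort -u
}

select_samples '|'
select_samples '||'
```

In this model both operators report the same sites (site1 and site2); only the set of selected samples differs (sample B alone under |, samples A and B under ||). If bcftools behaves as the manual describes, a count of sites or of header samples would not be expected to distinguish the two operators. As far as I know, bcftools query -l lists the samples in the header, which site filtering does not change.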

However, I don't see any difference in the output of these two commands, which filter out sites with more than 2 alleles or with genotype depth <= 20:

bcftools filter unfiltered.bcf.gz -e 'N_ALT >= 2 | FMT/DP<=20' | bcftools query -l | wc -l  #672
bcftools filter unfiltered.bcf.gz -e 'N_ALT >= 2 || FMT/DP<=20' | bcftools query -l | wc -l  #672

As I understand the manual, the first command should filter out samples with depth <= 20. I can separately confirm that there are indeed samples with read depth <= 20 at many markers:

bcftools view -H unfiltered.vcf.gz | wc -l #9989 
bcftools filter unfiltered.bcf.gz -e 'FMT/DP<=20' | bcftools view -H | wc -l #97

Any advice on how I am misinterpreting | and || would be appreciated. Thanks!

bcftools SNP filtering • 152 views

This question is related to a broader question regarding filtering by sample attributes, instead of site attributes, as described here: https://github.com/samtools/bcftools/issues/1391. Any suggestions on alternative methods for filtering by samples would be appreciated.
