Question: Use bcftools to filter a VCF with multiple filter flags
6 weeks ago by
RamRS30k
Baylor College of Medicine, Houston, TX
RamRS30k wrote:

Good afternoon,

I'm trying to filter a VCF file that has the following dummy flag values:

  • PASS: All filters passed
  • Fa: Failed filter a
  • Fb: Failed filter b
  • Fc: Failed filter c
  • Fd: Failed filter d

Variants can fail one or more filters. Variants that fail multiple filters are annotated with the corresponding flags separated by semicolons. Thus, the FILTER column can hold either one of the 5 values above or any number of F* values separated by ;.
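Before filtering, it can help to see which FILTER combinations actually occur in the file. The following is a minimal sketch, assuming an uncompressed VCF on standard input with FILTER in column 7; the function name count_filter_values is hypothetical and not part of bcftools.

```shell
# Tabulate every distinct FILTER value (column 7) of a VCF read from stdin.
# count_filter_values is a hypothetical helper name used for illustration.
count_filter_values() {
    awk -F'\t' '
        !/^#/ { n[$7]++ }                                    # skip header lines, count the rest
        END   { for (f in n) printf "%d\t%s\n", n[f], f }    # count, then FILTER value
    ' | LC_ALL=C sort -k2,2                                  # stable, locale-independent order
}
```

Usage would look like `gzip -dc vcf_file.vcf.gz | count_filter_values`. Combined values such as Fa;Fb are reported as their own rows, which makes it easy to see what an exact-match filter has to exclude.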

I'd like to select all variants that either PASSed or failed only filter a. How can I do this in bcftools? The -f option skips sites whose FILTER column does not contain any of the listed values, so it keeps every site that contains at least one of them. When I use

bcftools view -f PASS,Fa ...

I get rows that failed filter a along with other filters also. That is, the above expression matches both Fa and Fa;Fb. I tried excluding the delimiter, but that didn't work:

bcftools view -f 'PASS,Fa,;' ... #didn't work

Does anyone know how to exclude or include exactly a list of filters? Nothing in the -i or -e EXPRESSIONS is useful either.

This is what I'm using right now, which is awk mocking bcftools:

zcat vcf_file.vcf.gz | awk -F"\t" -vOFS="\t" '$0 ~ /^#/ {print} $7=="PASS" || $7=="Fa" {print}'
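The same awk approach can be generalized so that the list of allowed FILTER values is passed as an argument. This is only a sketch under the same assumptions as the one-liner above (tab-separated VCF, FILTER in column 7); keep_exact_filters is a hypothetical name, and it reads an uncompressed VCF from standard input.

```shell
# keep_exact_filters ALLOWED < in.vcf > out.vcf
# ALLOWED is a comma-separated list of FILTER values to keep, matched exactly.
# keep_exact_filters is a hypothetical helper name used for illustration.
keep_exact_filters() {
    awk -F'\t' -v allowed="$1" '
        BEGIN {
            # Build a lookup table from the comma-separated list.
            n = split(allowed, a, ",")
            for (i = 1; i <= n; i++) keep[a[i]] = 1
        }
        /^#/         { print; next }   # header lines pass through unchanged
        ($7 in keep) { print }         # whole-string comparison on FILTER
    '
}
```

For example, `gzip -dc vcf_file.vcf.gz | keep_exact_filters PASS,Fa` keeps PASS and Fa rows but drops Fa;Fb. Because the comparison is on the whole string, a combined value has to be listed in the same order as it appears in the file (Fa;Fb is not the same key as Fb;Fa).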
modified 4 weeks ago • written 6 weeks ago by RamRS30k

Update: I tried this, but it picked up an entry that it was not supposed to pick up:

bcftools view -i 'FILTER=="PASS" | FILTER=="Fa"' vcf_file.gz
bcftools view -i 'FILTER=="PASS" || FILTER=="Fa"' vcf_file.gz

It picked up an entry where the FILTER value was Fa;Fb.

written 6 weeks ago by RamRS30k

Update #2: I've opened an issue on bcftools github: https://github.com/samtools/bcftools/issues/1285

written 6 weeks ago by RamRS30k
4 weeks ago by
RamRS30k
Baylor College of Medicine, Houston, TX
RamRS30k wrote:

This feature was lacking in bcftools, and the developer has now fixed that with this commit: https://github.com/samtools/bcftools/commit/fea8773196878481399183fd9f711685d41e6cf9

Starting bcftools v1.10.3, it should be possible to do this sort of exact filtering.
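With a bcftools build that includes that change, the original request could presumably be written as a plain -i expression. The sketch below assumes the behavior described in the filtering-expressions section of later bcftools documentation, where FILTER="A" is an exact match (so Fa;Fb does not pass) and FILTER~"A" is a subset match; the wrapper name view_pass_or_fa_only is hypothetical.

```shell
# view_pass_or_fa_only IN.vcf.gz
# Keep sites whose FILTER is exactly PASS or exactly Fa.
# Assumes a bcftools version in which FILTER="..." is an exact match;
# view_pass_or_fa_only is a hypothetical wrapper name.
view_pass_or_fa_only() {
    bcftools view -i 'FILTER="PASS" || FILTER="Fa"' "$1"
}
```

Called as `view_pass_or_fa_only vcf_file.vcf.gz`, this should no longer return rows whose FILTER is Fa;Fb. On older versions the same expression may still behave like the earlier attempts in this thread, so it is worth checking the output against the awk filter once.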

written 4 weeks ago by RamRS30k