Testing for absence of tag from info field with bcftools
6.5 years ago
tzhughes • 0

Hi,

My VCF file is annotated with g1k frequency WHEN such a frequency is available.

I want to select all rare variants (with a frequency below a certain cutoff, say 1%). So I want all lines where this condition is satisfied, BUT I also need all the lines where there is no frequency annotation, as these are also potential rare variants.

Does anyone know how to do this with bcftools expressions? I have tried but it would seem like bcftools does not provide a way of testing whether a tag is present in the INFO field.

Tim

I tried this and it does not work.

I think this syntax

MYFLAG=0

can only be used for tags that are true flags, i.e. tags that do not have a value. Such tags are defined in the header with Number=0 (number of values) to indicate that they do not have a value. In my case, when the frequency is available it is a tag with a value, and when it is not available the tag does not appear at all.

I suppose in a way the frequency tag is being used partly as a true flag and partly not. Not my design, but the way it is implemented by several pieces of software.

Tim

I see. In the end, I would use my JavaScript-based tool: https://github.com/lindenb/jvarkit/wiki/VCFFilterJS (see below).

6.5 years ago

Did you try a filter expression? http://samtools.github.io/bcftools/bcftools.html#expressions

1 (or 0) to test the presence (or absence) of a flag

FlagA=1 && FlagB=0

combined with an OR expression:

 logical operators

&& (same as &), ||,  |

and with a comparison operator:

== (same as =), >, >=, <=, <, !=

So it should be something like this (not tested):

MYFLAG=0 ||  MYFLAG<0.1

EDIT: with vcffilterjs

 cat your.vcf | java -jar dist/vcffilterjs.jar -e '(!variant.hasAttribute("ATT") || variant.getAttributeAsDouble("ATT",999) < 0.2 )'

6.5 years ago
tzhughes • 0

I got the answer from Petr on the samtools list:

The -i and -e options are complementary and choosing one or the other
allows you to include / exclude records without the tag. In your case,
this should work:

bcftools view -e'AF>0.99'

I found out that you cannot use -i and -e in the same call to bcftools, but you can use one of them in the first call and pipe the result to a second call to bcftools where you use the second one.
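To make the effect of the exclude form concrete, here is a minimal sketch in plain shell/awk that mimics what `-e 'TAG>cutoff'` does to a record: it is dropped only when the tag is present and above the cutoff, so records without the tag pass through. The function name `keep_rare_or_unannotated` is hypothetical (it is not part of bcftools), and the sketch assumes a single-valued INFO tag. With the 1% cutoff from the question, the bcftools call following the quoted pattern would presumably be `bcftools view -e 'AF>0.01'`, where the tag name `AF` is an assumption about how the file is annotated.

```shell
# Hypothetical helper, NOT part of bcftools. It emulates the effect of
#   bcftools view -e 'TAG>CUTOFF'
# for a single-valued INFO tag (assumption: one value per record).
# Reads VCF on stdin, writes VCF on stdout.
# Usage: keep_rare_or_unannotated TAG CUTOFF < in.vcf
keep_rare_or_unannotated() {
  awk -F '\t' -v tag="$1" -v cutoff="$2" '
    /^#/ { print; next }              # header lines pass through unchanged
    {
      found = 0; value = ""
      n = split($8, fields, ";")      # column 8 is INFO: KEY=VALUE entries separated by ";"
      for (i = 1; i <= n; i++) {
        if (index(fields[i], tag "=") == 1) {   # entry starts with "TAG="
          found = 1
          value = substr(fields[i], length(tag) + 2)
        }
      }
      # keep the record when the tag is absent, or present and not above the cutoff
      if (!found || value + 0 <= cutoff + 0) print
    }
  '
}
```

For example, `keep_rare_or_unannotated AF 0.01 < your.vcf` keeps records with AF at or below 0.01 as well as records that carry no AF entry at all, which is handy for checking on a small test file that a bcftools call returns the records you expect.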

17 months ago
Sander G • 0
bcftools view -i "FREQUENCYTAG<='cutoffvalue' || FREQUENCYTAG='' "


leaving this here for future readers