bcftools plugin fill-tags with functions
Hello, biostars community!

I have been struggling lately with bcftools to add a custom INFO tag to my VCF so I can then filter on it. I've read the bcftools manual many times and found this information:

TAG=func(TAG) Number:1 Type:Integer .. Experimental support for user-defined expressions such as "DP=sum(DP)". This is currently very basic, to be extended

Therefore, I'm trying to add the average variant-level DP, calculated from the per-sample DP, using the example provided in the manual. So far I've tried

bcftools +fill-tags in.vcf.gz -Oz -o out.AVGDP.vcf.gz -- -t 'VD=AVG(DP)'

where VD = variant depth. I'm getting this error:

Error: the expression not recognised: VD=AVG(DP)

I have tried replacing VD=AVG(DP) with VD=MEAN(DP), VD=avg(DP), and VD=AVG(FMT/DP), but none of these work. VD=sum(DP) does run, but so far sum is the only function that works; min, max, etc. fail with the same error.

I would appreciate some thoughts or recommendations on this issue. Thank you

dc

It sounds like it's an experimental option - perhaps your version doesn't support it? What version of bcftools are you using?

6 weeks ago

'sum' is the only currently supported function: https://github.com/samtools/bcftools/blob/develop/plugins/fill-tags.c#L349

Using vcffilterjdk: http://lindenb.github.io/jvarkit/VcfFilterJdk.html

awk '/^#CHROM/ {printf("##INFO=<ID=VD,Number=1,Type=Float,Description=\"todo\">\n");} {print}' in.vcf | \
java -jar ${JVARKIT_DIST}/vcffilterjdk.jar -e 'return new VariantContextBuilder(variant).attribute("VD",variant.getGenotypes().stream().filter(G->G.hasDP()).mapToInt(G->G.getDP()).average().orElse(-1.0)).make();'

6 weeks ago
sbstevenlee ▴ 140

If you are a Python user, take a look at the pyvcf submodule I wrote, specifically pyvcf.VcfFrame.add_dp and pyvcf.VcfFrame.add_flag. If you can't find what you need, let me know and I'd be more than happy to implement a specific method that does the job.
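
For comparison, the same per-variant mean can be sketched with the Python standard library alone. This is only an illustration, not the pyvcf API: the function name add_mean_dp and the tag name VD are made up for this example, and it assumes an uncompressed, tab-delimited VCF with a FORMAT/DP field. Samples with a missing DP (".") are skipped rather than counted as zero.

```python
def add_mean_dp(lines, tag="VD"):
    """Yield VCF lines with INFO/<tag> set to the mean of per-sample FORMAT/DP.

    Hypothetical helper for illustration; 'lines' is any iterable of VCF text lines.
    """
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#CHROM"):
            # Declare the new INFO tag just before the column header line.
            yield (f'##INFO=<ID={tag},Number=1,Type=Float,'
                   f'Description="Mean of per-sample FORMAT/DP">')
            yield line
            continue
        if line.startswith("#") or not line:
            # Pass other header lines (and empty lines) through unchanged.
            yield line
            continue
        fields = line.split("\t")
        # Columns: 0-7 fixed fields, 8 = FORMAT, 9+ = samples.
        if len(fields) > 9 and "DP" in fields[8].split(":"):
            idx = fields[8].split(":").index("DP")
            depths = []
            for sample in fields[9:]:
                parts = sample.split(":")
                # Trailing fields may be dropped; "." means missing.
                if idx < len(parts) and parts[idx].isdigit():
                    depths.append(int(parts[idx]))
            if depths:
                entry = f"{tag}={sum(depths) / len(depths):.2f}"
                # Replace an empty INFO column, otherwise append to it.
                if fields[7] in (".", ""):
                    fields[7] = entry
                else:
                    fields[7] = f"{fields[7]};{entry}"
        yield "\t".join(fields)
```

Passing an open file handle to add_mean_dp and printing each yielded line gives an uncompressed VCF, which could then be filtered with an expression such as bcftools view -i 'INFO/VD>10'.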

