Question: Setting positions in samples as missing using bcftools
gravatar for ucbtsm8
6 months ago by
ucbtsm80 wrote:

I'm working with imputed data in multisample bcf format which contains a FORMAT/GP field for each sample.

I would like to set to missing (./.) a genotype in a sample where the max(FORMAT/GP) is less than a given threshold (say 0.99). For instance, 0.1,0.8,0.1 would be set as missing, whereas 1,0,0 wouldn't.

I would like to do this in bcftools, using the +setGT plugin. I've tried doing this using several commands such as:

bcftools +setGT input.bcf -- -t q -n . -i'max(FORMAT/GP)>0.99'

However, this seems to work by checking if any sample at a position has a max value of less than 0.99, and then sets all genotypes at that position as missing. To be clear, want this to run on a per-sample basis only set GTs as missing in individuals that have a max value of less than 0.99, rather than all of them.

Can anyone help me with the command I need?

bcftools software error • 229 views
ADD COMMENTlink modified 6 months ago by cedcoul0 • written 6 months ago by ucbtsm80
gravatar for Pierre Lindenbaum
6 months ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum129k wrote:

using vcffilterjdk.

and the following code (hard to test without seeing the vcf)

final String SAMPLE="B00GWGD";
final String TAG="GP";

List<Genotype> genotypes = new ArrayList<>();
for(Genotype gt: variant.getGenotypes()) {
    if(!gt.getSampleName().equals(SAMPLE) ||
        gt.isNoCall() || 
        !gt.hasAnyAttribute(TAG) ||
            map(O->new Double(O.toString())).
            anyMatch(V->V.doubleValue()>=0.99)) {

return new VariantContextBuilder(variant).genotypes(genotypes).make();


java -jar dist/vcffilterjdk.jar --recalc -f biostar.code the.vcf
ADD COMMENTlink written 6 months ago by Pierre Lindenbaum129k
gravatar for cedcoul
6 months ago by
cedcoul0 wrote:

The issue comes from the setGT plugin. I think it does not understand 'max' in expression.

But you can use another plugin called 'tag2tag', in your case (Max(GP)>0.99):

bcftools +tag2tag input.bcf -- -t 0.01 --gp-to-gt
ADD COMMENTlink modified 6 months ago • written 6 months ago by cedcoul0
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1326 users visited in the last hour