Setting positions in samples as missing using bcftools
2
0
Entering edit mode
4.3 years ago
ucbtsm8 ▴ 20

I'm working with imputed data in multisample bcf format which contains a FORMAT/GP field for each sample.

I would like to set to missing (./.) a genotype in a sample where the max(FORMAT/GP) is less than a given threshold (say 0.99). For instance, 0.1,0.8,0.1 would be set as missing, whereas 1,0,0 wouldn't.

I would like to do this in bcftools, using the +setGT plugin. I've tried doing this using several commands such as:

bcftools +setGT input.bcf -- -t q -n . -i'max(FORMAT/GP)>0.99'

However, this seems to work by checking if any sample at a position has a max value of less than 0.99, and then sets all genotypes at that position as missing. To be clear, want this to run on a per-sample basis only set GTs as missing in individuals that have a max value of less than 0.99, rather than all of them.

Can anyone help me with the command I need?

bcftools software error • 2.1k views
ADD COMMENT
0
Entering edit mode
4.3 years ago

using vcffilterjdk. http://lindenb.github.io/jvarkit/VcfFilterJdk.html

and the following code (hard to test without seeing the vcf)

final String SAMPLE="B00GWGD";
final String TAG="GP";

List<Genotype> genotypes = new ArrayList<>();
for(Genotype gt: variant.getGenotypes()) {
    if(!gt.getSampleName().equals(SAMPLE) ||
        gt.isNoCall() || 
        !gt.hasAnyAttribute(TAG) ||
        ((List<?>)gt.getAnyAttribute(TAG)).stream().
            map(O->new Double(O.toString())).
            anyMatch(V->V.doubleValue()>=0.99)) {
        genotypes.add(gt);
        }
    else
        {
        genotypes.add(GenotypeBuilder.createMissing(gt.getSampleName(),gt.getPloidy()));
        }
    }

return new VariantContextBuilder(variant).genotypes(genotypes).make();

usage:

java -jar dist/vcffilterjdk.jar --recalc -f biostar.code the.vcf
ADD COMMENT
0
Entering edit mode
4.3 years ago
cedcoul • 0

The issue comes from the setGT plugin. I think it does not understand 'max' in expression.

But you can use another plugin called 'tag2tag', in your case (Max(GP)>0.99):

bcftools +tag2tag input.bcf -- -t 0.01 --gp-to-gt
ADD COMMENT

Login before adding your answer.

Traffic: 2618 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6