Question: Filter a multisample VCF based on alternate-allele AD values
Asked 10 months ago by cocchi.e89:

I have a multisample VCF; here is an example line:

1   14464   .   A   T   .   .   ECNT=1;PON;DP=67;MBQ=0,36;MFRL=0,278;MMQ=60,28;MPOS=23;POPAF=0.69;TLOD=29.47    GT:AD:AF:DP:F1R2:F2R1:SB    0/1:0,17:0.947:17:0,9:0,8:0,0,14,3  ./.:.:.:.:.:.:. ./.:.:.:.:.:.:. ./.:.:.:.:.:.:. 0/1:1,25:0.929:26:1,14:0,10:1,0,17,8    ./.:.:.:.:.:.:. ./.:.:.:.:.:.:. ./.:.:.:.:.:.:. ./.:.:.:.:.:.:. ./.:.:.:.:.:.:. ./.:.:.:.:.:.:. ./.:.:.:.:.:.:. ./.:.:.:.:.:.:. ./.:.:.:.:.:.:. ./.:.:.:.:.:.:. ./.:.:.:.:.:.:. ./.:.:.:.:.:.:. 0/1:1,12:0.866:13:0,5:1,6:1,0,5,7   ./.:.:.:.:.:.:. ./.:.:.:.:.:.:. 0/1:0,9:0.912:9:0,4:0,5:0,0,7,2 ./.:.:.:.:.:.:.

I need to filter samples on their alternate-allele depth (the second value of AD), dropping samples with alt AD < 10. In the example above, this means removing the 4th called sample (alt AD = 9) and keeping the first 3, giving something like this:

1   14464   .   A   T   .   .   ECNT=1;PON;DP=67;MBQ=0,36;MFRL=0,278;MMQ=60,28;MPOS=23;POPAF=0.69;TLOD=29.47    GT:AD:AF:DP:F1R2:F2R1:SB    0/1:0,17:0.947:17:0,9:0,8:0,0,14,3  ./.:.:.:.:.:.:. ./.:.:.:.:.:.:. ./.:.:.:.:.:.:. 0/1:1,25:0.929:26:1,14:0,10:1,0,17,8    ./.:.:.:.:.:.:. ./.:.:.:.:.:.:. ./.:.:.:.:.:.:. ./.:.:.:.:.:.:. ./.:.:.:.:.:.:. ./.:.:.:.:.:.:. ./.:.:.:.:.:.:. ./.:.:.:.:.:.:. ./.:.:.:.:.:.:. ./.:.:.:.:.:.:. ./.:.:.:.:.:.:. ./.:.:.:.:.:.:. 0/1:1,12:0.866:13:0,5:1,6:1,0,5,7   ./.:.:.:.:.:.:. ./.:.:.:.:.:.:. ./.:.:.:.:.:.:. ./.:.:.:.:.:.:.

Is there an existing tool for this? I saw vcffilterjs mentioned in this post, but it works differently: it removes the whole line if no sample passes the filter and keeps it if at least one passes.

Thanks a lot in advance for any help!


How could you remove one or more genotypes while keeping the structure of the VCF?

(written 10 months ago by Pierre Lindenbaum)

Yes, that is what I mean: is there no way to remove genotype entries while keeping the structure of the VCF (and dropping the line if no entries remain)?

(written 10 months ago by cocchi.e89)

Well, you can reset the genotype to './.', but you cannot remove a genotype: the VCF header line with the sample names would become meaningless and the file would be broken.

(written 10 months ago by Pierre Lindenbaum)

Of course, not removing, sorry. I meant setting it to ./. Is there any tool for that?

(written 10 months ago by cocchi.e89)
Answer, written 10 months ago by Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087):

Use VcfFilterJdk: http://lindenb.github.io/jvarkit/VcfFilterJdk.html

java -jar dist/vcffilterjdk.jar --recalc -f biostar.code input.vcf.gz

where biostar.code contains:

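// Return a copy of the variant in which genotypes with low ALT depth are reset to missing.
return new VariantContextBuilder(variant)
    .genotypes(variant.getGenotypes().stream().map(G -> {
        // Leave no-calls and genotypes without an AD field untouched.
        if (!G.isCalled()) return G;
        if (!G.hasAD()) return G;
        final int[] ad = G.getAD();
        // Keep the genotype if AD is not a REF,ALT pair or if the ALT depth is at least 10.
        if (ad == null || ad.length != 2 || ad[1] >= 10) return G;
        // Otherwise replace it with a missing genotype (./.) for the same sample.
        return GenotypeBuilder.createMissing(G.getSampleName(), G.getPloidy());
    }).collect(Collectors.toList()))
    .make();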
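
To make the masking rule itself easier to follow outside of jvarkit, here is a minimal plain-Java sketch that applies the same idea to a single VCF data line. It is not part of VcfFilterJdk or htsjdk: the class name AltAdMask and the method maskLowAltAd are hypothetical, and the sketch assumes tab-separated columns, GT as the first FORMAT key, diploid samples, and biallelic sites (AD written as REF,ALT). Masked samples are written as ./. followed by one "." per remaining FORMAT key, which matches the missing entries shown in the question.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative only: masks low-ALT-depth samples on one tab-separated VCF data line. */
final class AltAdMask {
    private AltAdMask() {}

    /**
     * Returns the line with every called sample whose ALT depth (second AD value)
     * is below minAltAd replaced by a missing genotype. Other columns are unchanged.
     */
    static String maskLowAltAd(String vcfLine, int minAltAd) {
        String[] cols = vcfLine.split("\t", -1);
        if (cols.length < 10) return vcfLine;              // no FORMAT or sample columns
        List<String> format = Arrays.asList(cols[8].split(":"));
        int adIdx = format.indexOf("AD");
        // Assumption: GT is the first FORMAT key; otherwise leave the line alone.
        if (adIdx < 0 || !format.get(0).equals("GT")) return vcfLine;

        // Missing diploid genotype with one "." per remaining FORMAT key.
        StringBuilder missing = new StringBuilder("./.");
        for (int i = 1; i < format.size(); i++) missing.append(":.");

        for (int c = 9; c < cols.length; c++) {
            String[] fields = cols[c].split(":", -1);
            // Skip no-calls such as "./." or ".|.".
            if (fields[0].replaceAll("[/|.]", "").isEmpty()) continue;
            if (fields.length <= adIdx) continue;          // AD not present for this sample
            String[] ad = fields[adIdx].split(",");
            // Only handle a REF,ALT pair with a numeric ALT depth.
            if (ad.length != 2 || ad[1].equals(".")) continue;
            if (Integer.parseInt(ad[1]) < minAltAd) cols[c] = missing.toString();
        }
        return String.join("\t", cols);
    }
}
```

Calling AltAdMask.maskLowAltAd(line, 10) on each non-header line would reproduce the threshold used in the answer. For real data, the htsjdk-based snippet above is the safer route, since it parses the VCF properly instead of splitting strings.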