Question: How to get sequencing depths from VCF with Rsamtools
0
gravatar for MMa
13 months ago by
MMa240
MMa240 wrote:

I got from our collaborator some pindel output and the BAM files that is used as pindel input. Unfortunately, they used an older of version of pindel (before approximately 0.2.4u) that doesn't record total depth at breakpoint (i.e. the DP field in VCF). While our collaborator's standard practice is to obtain depths through Rsamtools, I haven't found a good way to do this. Specifically, I know how to read sequencing depths to an Rle objects, and on the other end convert pindel outputs to VCF and then to GenomicRanges, but how do I integrate the two?

pindel rsamtools R • 568 views
ADD COMMENTlink modified 12 months ago by Pierre Lindenbaum111k • written 13 months ago by MMa240

"By design, the call annotations in the VCF reflect the data that the caller/genotyper actually used in their calculation. " how could you known the number of reads used by pindel ?

ADD REPLYlink written 13 months ago by Pierre Lindenbaum111k

The docs for the current versions of pindel states depth is the same I'd get from samtools depth.

ADD REPLYlink written 13 months ago by MMa240
0
gravatar for Pierre Lindenbaum
12 months ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum111k wrote:

I've written a tools to update the DP fields from a set of BAM files: http://lindenb.github.io/jvarkit/FixVcfMissingGenotypes.html

$ find ~/src/gatk-ui/testdata/ -name "*.bam" > input.list

$ tail -2 input.vcf
rotavirus   1064    .   G   A   21.5606 .   DP=250;VDB=2.70971e-16;SGB=8.40135;RPB=0.935144;MQB=1;BQB=0.683886;MQ0F=0;AF1=0.25;G3=0.75,2.37734e-17,0.25;HWE=0.033921;AC1=2;DP4=0,219,0,31;MQ=60;FQ=22.8019;PV4=1,1.22605e-06,1,1    GT:PL   0/0:0,244,70    0/0:0,199,65    0/0:0,217,68    1/1:69,84,0
rotavirus   1064    .   G   A   21.5606 .   DP=250;VDB=2.70971e-16;SGB=8.40135;RPB=0.935144;MQB=1;BQB=0.683886;MQ0F=0;AF1=0.25;G3=0.75,2.37734e-17,0.25;HWE=0.033921;AC1=2;DP4=0,219,0,31;MQ=60;FQ=22.8019;PV4=1,1.22605e-06,1,1    GT:PL   ./. ./. ./. ./.

$ java -jar dist/fixvcfmissinggenotypes.jar -d 50 --fixDP --filtered zz -B input.list input.vcf | tail -2
rotavirus   1064    .   G   A   21.56   .   AC1=2;AF1=0.25;BQB=0.683886;DP=188;DP4=0,219,0,31;FQ=22.8019;G3=0.75,2.37734e-17,0.25;HWE=0.033921;MQ=60;MQ0F=0;MQB=1;PV4=1,1.22605e-06,1,1;RPB=0.935144;SGB=8.40135;VDB=2.70971e-16    GT:DP:PL    0/0:48:0,244,70 0/0:63:0,199,65 0/0:53:0,217,68 1/1:24:69,84,0
rotavirus   1064    .   G   A   21.56   .   AC1=2;AF1=0.25;BQB=0.683886;DP=72;DP4=0,219,0,31;FQ=22.8019;G3=0.75,2.37734e-17,0.25;HWE=0.033921;MQ=60;MQ0F=0;MQB=1;PV4=1,1.22605e-06,1,1;RPB=0.935144;SGB=8.40135;VDB=2.70971e-16 GT:DP:FT:FXG    ./.:48:PASS 0/0:63:zz:1 0/0:53:zz:1 ./.:24:PASS
ADD COMMENTlink written 12 months ago by Pierre Lindenbaum111k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1082 users visited in the last hour