Calculate Depth with and without Soft-Clipping
1
0
Entering edit mode
6.0 years ago
ccnn • 0

Hi guys,

I'm trying to use GATK3 (since v4 doesn't have the DepthOfCoverage tool yet) to calculate the depth of coverage across some WGS BAM files, with and with out soft clipping.

Currently, I'm using the following the following

#calculate depth, exclude soft-clipped bases
java -Xmx${mem}g -jar $jar -mbq 0 -mmq 0 -L ${bed} -o $out/${bamName}.doc -I $bam  -T DepthOfCoverage -R $ref

#calculate depth, include soft-clipped bases
java  -Xmx${mem}g -jar $jar -T ClipReads -l INFO -I $bam -o $bam.noclip.bam -R $ref -CR REVERT_SOFTCLIPPED_BASES -os clip.stats 2>> err.sc >> out.sc
java  -Xmx${mem}g -jar $jar -mbq 0 -mmq 0 -L $bed -o $out/${bamName}.noclip.bam.doc -I $bam.noclip.bam  -T DepthOfCoverage -R $ref

Basically, what I'm asking GATK to do is: 1) traverse the original BAM file and calculate the depth; 2) convert this original BAM file into a new BAM file with no soft clipping; 3) traverse this new BAM file and calculate the depth.

This is a computationally intensive process; it takes both large amounts of time and memory. And I'm wondering if there's a more efficient way to do it? If there's another tool, like samtools depth, that could accomplish this more efficiently I'm happy to switch to that or anything else.

Thanks!

bam gatk sequencing alignment • 1.4k views
ADD COMMENT
0
Entering edit mode
6.0 years ago

I would say that whatever you're using, it will give no difference because most (all ?) tools don't use the clipped bases.

ADD COMMENT
0
Entering edit mode

I understand that I may have to use GATK -T ClipReads -CR REVERT_SOFTCLIPPED_BASES to create a BAM with no soft-clipping from my original BAM, but won't some ways of calculating depth across this original and modified BAM be more efficient than others?

Alternatively, is there no tool that can, processing a BAM just once, tell me how what the depth w and w/o soft-clipping is at each spot? Or just the depth and the amount of soft-clipping and I'll add these numbers together? Thanks for your help.

ADD REPLY
0
Entering edit mode

but won't some ways of calculating depth across this original and modified BAM be more efficient than others?

it sounds like microoptimization and you'll consume some time and storage to run GATK

Alternatively, is there no tool that can, processing a BAM just once, tell me how what the depth w and w/o soft-clipping is at each spot?

what would be the logic in this ? if a part of the read is mapped on another region of the genome, why would you need to consider this ?

ADD REPLY

Login before adding your answer.

Traffic: 2640 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6