Question: Bwa Indels
gravatar for j_waterfall
6.7 years ago by
j_waterfall20 wrote:

Comparing bwa 0.6.2 to bwa 0.5.9 I'm seeing very different behavior around indels. I'm using default options in both cases and as far as I can tell they haven't changed. I'm aligning Illumina paired end 2x80bp reads. For dbSNP indels of just a few nucleotides bwa 0.5.9 aligns them great - perfect matches for entire read length (plus the indel, except when the indel is near the end of the read) but 0.6.2 soft clips a large fraction of the read and then has a very high number of mismatches plus multiple indels in the remaining portion. Any suggestions?

indel bwa • 3.0k views
ADD COMMENTlink modified 6.7 years ago by Erik Garrison2.2k • written 6.7 years ago by j_waterfall20
gravatar for Erik Garrison
6.7 years ago by
Erik Garrison2.2k
Somerville, MA
Erik Garrison2.2k wrote:

I noticed the same issue. It seems that BWA is more aggressively soft-clipping reads around indels in recent releases. This is fine, provided you can realign the problematic reads.

To resolve issues with indel alignment in a relatively uniform and aligner-agnostic way, I developed a repeat and entropy-aware local realignment system, ogap. In our detection pipeline we use it like this:

bamtools merge -in file1.bam -in file2.bam -in file3.bam \
    | ogap -z -R 25 -C 20 -Q 20 -S 0 -f $reference \
    | bamleftalign -f $reference \
    | samtools calmd -EAru - $reference 2>/dev/null \
    | freebayes -f $reference >results.vcf

In this setting, it picks up reads which have soft clipped or mismatched bases summing to more than Q20, and realigns using an extended Smith-Waterman-Gotoh pairwise alignment algorithm that adjusts gap opening and extension penalties on the basis of local sequence context. To do this, it uses Shannon entropy and perfect repeats up to 12bp. This may be too sensitive for most use, so adjusting -C and -Q higher may improve performance.

ADD COMMENTlink written 6.7 years ago by Erik Garrison2.2k

Thanks, I'll check out ogap. I'm pretty sure the GATK realigner ignores soft-clipped bases so it's not very useful at these positions.

ADD REPLYlink written 6.7 years ago by j_waterfall20

Eric, is it possible to force ogap to realign only soft-clipped reads (all of them if possible) but nothing else.

ADD REPLYlink modified 6.4 years ago • written 6.4 years ago by oadaily20
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 886 users visited in the last hour