Is there a way to tell bwa aln not to soft-clip reads and to try to align them in full instead?
We see that many soft-clipped reads result in false-positive variant calls. In the attached IGV snapshot (top track) you can see an example of a (PCR-duplicated) read that is mapped with 4 substitutions, 1 deletion and 2 insertions. We think the resulting variants are not real, but arise because the read should not have been mapped there (it happens in all 20 samples!).
When we look in the BAM file we see that the read was heavily soft-clipped (CIGAR 1S18M1D3M1I1M2I11M114S, meaning 115 out of 151 bases were soft-clipped). When I remove all reads whose CIGAR string contains S (soft clipping) from the BAM file, all the wrongly mapped reads disappear (see IGV snapshot, bottom track).
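For reference, here is a minimal Python sketch of how the 115-out-of-151 figure can be derived from the CIGAR string. The function name `soft_clip_summary` is only illustrative and is not part of bwa or samtools. It assumes the standard SAM convention that the M, I, S, = and X operations consume read bases.

```python
import re

# One CIGAR operation: a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

# Operations that consume bases of the read (per the SAM convention).
QUERY_CONSUMING = set("MIS=X")


def soft_clip_summary(cigar):
    """Return (soft_clipped_bases, read_length) for a CIGAR string."""
    ops = [(int(length), op) for length, op in CIGAR_OP.findall(cigar)]
    clipped = sum(length for length, op in ops if op == "S")
    read_length = sum(length for length, op in ops if op in QUERY_CONSUMING)
    return clipped, read_length


clipped, read_length = soft_clip_summary("1S18M1D3M1I1M2I11M114S")
print(clipped, read_length)  # 115 151
```

The same function could be used to drop only reads above some soft-clipped fraction, rather than every read with an S in its CIGAR string.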
We believe that, at least for our specific study design and questions, we do not want bwa to soft-clip reads; it should try to align them in full instead. Is there a way to do this? Or is there a way to remove soft-clipped reads from a BAM file, including updating related information such as the SAM flags of the paired read to the correct values?