Question: Disable Soft-Clipping Bwa Aln
Irsan (Amsterdam) wrote, 6.4 years ago:

Is there a way to tell bwa aln not to soft-clip reads in an attempt to align them?

We see that many soft-clipped reads result in false positive variant calls. In the attached IGV snapshot (top track) you see an example of a (PCR duplicated) read that is mapped with 4 substitutions, 1 deletion and 2 insertions. We think the resulting variants are not real but arise because the read should not have been mapped there (it happens in all 20 samples!).

When we look in the BAM file we see that the read was soft-clipped dramatically (CIGAR 1S18M1D3M1I1M2I11M114S, meaning 115 out of 151 bases were soft-clipped). When I remove all reads whose CIGAR string contains S (soft clipping) from the BAM file, all the wrongly mapped reads disappear (see IGV snapshot, bottom track).
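The 115-out-of-151 figure can be checked by hand from the CIGAR string. Below is a minimal sketch in plain Python; the function name `cigar_summary` is hypothetical, and the only assumption is the standard SAM convention that M, I, S, = and X consume read bases while D, N, H and P do not.

```python
import re

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def cigar_summary(cigar):
    """Return (soft_clipped_bases, read_length) for a CIGAR string."""
    soft = 0
    length = 0
    for count, op in CIGAR_OP.findall(cigar):
        n = int(count)
        if op == "S":
            soft += n
        # M, I, S, = and X consume read bases; D, N, H and P do not.
        if op in "MIS=X":
            length += n
    return soft, length

print(cigar_summary("1S18M1D3M1I1M2I11M114S"))  # (115, 151)
```

The soft clips are 1 + 114 = 115 bases, and the read length is 115 clipped + 33 matched + 3 inserted = 151 bases.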

We believe that, at least for our specific study design and questions, we do not want bwa to soft-clip reads in order to align them. Is there a way to prevent this? Or is there a way to remove soft-clipped reads from a BAM file (including updating related information, such as the SAM flags of the paired read, to the right values)?

toni (Lyon) wrote, 6.4 years ago:

Yes, there is a -s option that tells BWA not to attempt Smith-Waterman realignment of the unmapped mate.

This option must be supplied to bwa sampe, not to bwa aln.


Exactly what I was looking for, thanks.

Reply written 6.4 years ago by Irsan

Disabling soft clipping is fine for genomic alignment when you have more coverage than you need, but normally you should not disable it. One thing you can do is write a script that filters out reads whose ratio of aligned bases to total read length falls below a threshold. You can set the threshold to 0.75 and see if it works for you. Also, I don't think the soft-clipped region is used for variant calling at all. I assume whatever you are doing is for visualization purposes.

Reply written 6.4 years ago by Ashutosh Pandey
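The ratio filter suggested in the reply above could look roughly like the following. This is only a sketch under stated assumptions, not a tested pipeline: the names `aligned_fraction` and `filter_sam` are hypothetical, input is taken to be SAM text lines (for example from `samtools view -h`), "aligned" is taken to mean M, = and X operations, and the read length includes inserted and clipped bases. Reads without a CIGAR (`*`) are dropped. It does not update the flags of the mate when one read of a pair is removed, so that part of the original question stays open.

```python
import re

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def aligned_fraction(cigar):
    """Fraction of the read's bases that are aligned (M, = or X)."""
    aligned = 0
    total = 0
    for count, op in CIGAR_OP.findall(cigar):
        n = int(count)
        if op in "M=X":
            aligned += n
        # Assumption: read length counts aligned, inserted and clipped bases.
        if op in "MISH=X":
            total += n
    return aligned / total if total else 0.0

def filter_sam(lines, min_fraction=0.75):
    """Yield header lines and reads whose aligned fraction is high enough."""
    for line in lines:
        if line.startswith("@"):
            yield line  # header lines pass through unchanged
            continue
        fields = line.rstrip("\n").split("\t")
        # Field 6 (index 5) of a SAM record is the CIGAR string.
        if len(fields) > 5 and aligned_fraction(fields[5]) >= min_fraction:
            yield line
```

For the read in the question, the aligned fraction is 33/151, roughly 0.22, so a 0.75 threshold would discard it while keeping reads that align over most of their length.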