I have some reads aligned with bwa 0.7.15, and I've found some regions where reads have soft-clipped bases at both ends (example CIGAR string: 14S58M28S). I know that soft-clipped reads can have a biological meaning, but in this case it looks more like a mapping error to me, since only the middle part of the read has been aligned while both ends have not. However, it is a primary alignment and the mapping quality seems decent (MAPQ=32), even if it's not exceptionally high.
I have two questions here:
-Am I right to assume that these are mapping errors? Is there any other explanation (biological or computational) for these events?
-Is there any tool available for getting rid of double-soft-clipped reads? I could probably grep those lines out of the SAM file, but that would take too long to apply to many files.
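For reference, the grep-style filter I have in mind could be sketched in Python roughly as follows. This is only a sketch under my own assumptions: the function names are made up for illustration, the input is uncompressed SAM text, and "double soft clipped" means a CIGAR with an S operation at both ends (optionally wrapped in hard clips, H).

```python
import re
import sys

# CIGAR with soft clipping at both ends, optionally wrapped in hard clips,
# e.g. 14S58M28S or 5H14S58M28S3H. (Pattern is my own assumption of what
# "double soft clipped" should mean.)
DOUBLE_CLIP = re.compile(r"^(?:\d+H)?\d+S.*\d+S(?:\d+H)?$")


def is_double_soft_clipped(cigar):
    """Return True if the CIGAR string is soft clipped at both ends."""
    return DOUBLE_CLIP.match(cigar) is not None


def filter_sam(lines):
    """Yield SAM lines, dropping alignments that are soft clipped at both ends."""
    for line in lines:
        if line.startswith("@"):
            # Header lines are passed through unchanged.
            yield line
            continue
        fields = line.split("\t")
        # The CIGAR string is the 6th column of a SAM alignment line.
        if len(fields) > 5 and is_double_soft_clipped(fields[5]):
            continue
        yield line


# Only runs when exactly one argument (a SAM file path) is given.
if __name__ == "__main__" and len(sys.argv) == 2:
    with open(sys.argv[1]) as sam:
        sys.stdout.writelines(filter_sam(sam))
```

Saved under a hypothetical name such as filter_double_clip.py, it would be called with a SAM path as its only argument and would write the filtered SAM to standard output. It drops every matching read regardless of MAPQ or clip length, which may be too aggressive if some of these reads are real.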
Thank you all