I am trying to compute mismatch and indel statistics from a SAM/BAM file. The SAM file was generated using BWA. The statistics should include the %mismatch and %indel for each aligned read. Are there any good tools I could use?
You can also try alfred. It needs a sorted & indexed BAM file and the reference genome you used for the alignment.
alfred qc -r <reference.fa> <align.bam>
It computes the error rates you are looking for and some other metrics (insert size, coverage, ...).
BBMap's Reformat tool can produce some of these statistics:
ehist=<file>      Errors-per-read histogram.
qahist=<file>     Quality accuracy histogram of error rates versus quality score.
indelhist=<file>  Indel length histogram.
mhist=<file>      Histogram of match, sub, del, and ins rates by read location.
ihist=<file>      Insert size histograms. Requires paired reads interleaved in sam file.
idhist=<file>     Histogram of read count versus percent identity.
BBMap also prints a summary of match, mismatch, insertion, and deletion rates when it runs. That said, I think you can get most of what you want from Reformat, particularly the mhist output.
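If you need the numbers strictly per read rather than as histograms, here is a minimal sketch that derives them from the CIGAR string and the NM tag of each SAM record. It is not taken from either tool above. The function name read_error_rates is hypothetical, and the sketch assumes that the NM tag is present (BWA writes it) and that it counts mismatches plus inserted and deleted bases, as the SAM optional-fields spec describes. The denominator (aligned bases plus inserted plus deleted bases) is my own choice; other tools may normalize differently.

```python
import re

# One CIGAR element: a length followed by an operation code.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")

def read_error_rates(sam_line):
    """Return (pct_mismatch, pct_indel) for one SAM record.

    Returns None for unmapped reads, records without an NM tag,
    or records with no aligned/indel bases.
    """
    fields = sam_line.rstrip("\n").split("\t")
    flag, cigar = int(fields[1]), fields[5]
    if flag & 0x4 or cigar == "*":
        return None  # unmapped read

    # Sum the number of bases per CIGAR operation.
    counts = {}
    for length, op in CIGAR_RE.findall(cigar):
        counts[op] = counts.get(op, 0) + int(length)

    # Look for the NM (edit distance) tag among the optional fields.
    nm = None
    for tag in fields[11:]:
        if tag.startswith("NM:i:"):
            nm = int(tag[5:])
    if nm is None:
        return None

    ins, dele = counts.get("I", 0), counts.get("D", 0)
    aligned = counts.get("M", 0) + counts.get("=", 0) + counts.get("X", 0)
    columns = aligned + ins + dele  # assumed denominator: alignment columns
    if columns == 0:
        return None

    # Assumption: NM = mismatches + inserted bases + deleted bases.
    mismatches = nm - ins - dele
    return 100.0 * mismatches / columns, 100.0 * (ins + dele) / columns


if __name__ == "__main__":
    # Made-up record: 96 aligned bases, 2 inserted, 2 deleted, NM = 7.
    example = "r1\t0\tchr1\t100\t60\t50M2I46M2D\t*\t0\t0\t*\t*\tNM:i:7"
    print(read_error_rates(example))
```

To run it over a whole file, call the function on every line that does not start with "@" (header lines). For a BAM file, convert to SAM text first, for example with samtools view.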