I have coordinate-sorted BAM files, and I have subset the reads so that I am only looking at chromosome X between 153759406 and 153784524 on build 19:
samtools view -b -f 2 -o sample1.sortedByCoord.chrX.G6PD.mated.bam sample1.sortedByCoord.bam X:153759406-153784524
samtools view -b -F 4 -o sample1.sortedByCoord.chrX.G6PD.mated.and.mapped.bam sample1.sortedByCoord.chrX.G6PD.mated.bam
I am now looking at an even smaller subset of reads within this region, X:153760219-153760583, in IGV, and I am noticing a lot of positions within reads that have mismatches.
None of these positions ended up in my VCF file as called variants (which is likely correct). I am curious about where the mismatches are and which strand each read with a mismatch maps to. To determine the orientation of a read I could just use the flags with samtools, but how can I bin the mismatches by position?
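To make the question concrete, here is a minimal sketch of the kind of binning I have in mind. It assumes each read carries an MD tag (if not, I believe `samtools calmd` can add one given the reference FASTA) and that the input is SAM text as produced by `samtools view`. The function names and the two example records are hypothetical, made up purely for illustration. It walks the CIGAR string to find the reference positions the MD tag covers, then counts each mismatch by reference position and strand (flag bit 16).

```python
import re
from collections import Counter

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
# MD tokens: a run of matches, a deletion (^BASES), or one mismatched reference base
MD_RE = re.compile(r"(\d+)|\^([A-Za-z]+)|([A-Za-z])")


def md_reference_positions(pos, cigar):
    """1-based reference positions covered by M/=/X and D operations, in order."""
    positions = []
    for length, op in CIGAR_RE.findall(cigar):
        length = int(length)
        if op in "M=XD":
            positions.extend(range(pos, pos + length))
            pos += length
        elif op == "N":
            # Skipped region (e.g. an intron): advances the reference, not in MD
            pos += length
        # I, S, H, P do not consume reference bases
    return positions


def mismatch_positions(pos, cigar, md):
    """Reference positions at which the read differs from the reference."""
    ref_positions = md_reference_positions(pos, cigar)
    i = 0
    found = []
    for matches, deletion, base in MD_RE.findall(md):
        if matches:
            i += int(matches)
        elif deletion:
            i += len(deletion)
        else:
            found.append(ref_positions[i])
            i += 1
    return found


def bin_mismatches(sam_lines):
    """Count mismatches keyed by (reference position, strand)."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue
        flag, pos, cigar = int(fields[1]), int(fields[3]), fields[5]
        md = next((f[5:] for f in fields[11:] if f.startswith("MD:Z:")), None)
        if md is None or cigar == "*":
            continue
        strand = "-" if flag & 16 else "+"
        for ref_pos in mismatch_positions(pos, cigar, md):
            counts[(ref_pos, strand)] += 1
    return counts


# Hypothetical records, for illustration only (not from my data)
example = [
    "\t".join(["r1", "99", "X", "100", "60", "10M", "=", "150", "60",
               "ACGTACGTAC", "IIIIIIIIII", "MD:Z:3A6"]),
    "\t".join(["r2", "147", "X", "100", "60", "10M", "=", "50", "-60",
               "ACGTACGTAC", "IIIIIIIIII", "MD:Z:3A2C3"]),
]
for (ref_pos, strand), n in sorted(bin_mismatches(example).items()):
    print(f"{ref_pos}\t{strand}\t{n}")
```

For real data, the lines would come from something like `samtools view` on the subset BAM restricted to X:153760219-153760583, passed to `bin_mismatches` as an iterable of lines. This sketch does not look at base qualities, so low-quality mismatches are counted the same as confident ones.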