Using sorted and mapped reads, is there a way to count the base pair mismatches within a region?
18 months ago
ctatters • 0

I have coordinate-sorted BAM files, and I have taken a subset of the reads so that I am only looking at chromosome X between 153759406 and 153784524 on build 19.

samtools view -b -f 2 -o sample1.sortedByCoord.chrX.G6PD.mated.bam sample1.sortedByCoord.bam X:153759406-153784524
samtools view -b -F 4 -o sample1.sortedByCoord.chrX.G6PD.mated.and.mapped.bam sample1.sortedByCoord.chrX.G6PD.mated.bam

I am looking at an even smaller subset of reads within this broader set, X:153760219-153760583, in IGV, and I am noticing many positions within reads that have mismatches.


None of these positions ended up in my VCF file as called variants (which is likely correct). I am curious about where the mismatches are and which strand the reads carrying them are on. To determine the orientation of a read I could just use the flags with samtools, but how can I bin the mismatches by position?
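
One possible way to do the binning, sketched here rather than offered as a definitive answer, is to let samtools mpileup do the per-position work and then count the mismatch characters in its base column. When a reference FASTA is supplied with -f, mpileup writes "." and "," for matches on the forward and reverse strand, and uppercase or lowercase letters for mismatches on the forward and reverse strand, so position and orientation come out together. The function names below and the reference file name hg19.fa are hypothetical, and the reference is assumed to name the chromosome "X" like the BAM does. mpileup applies its own default filters (for example on base quality), so the counts may not match IGV exactly.

```python
import re

# Input is assumed to come from a command along these lines (hg19.fa is a placeholder):
#   samtools mpileup -f hg19.fa -r X:153760219-153760583 sample1.sortedByCoord.chrX.G6PD.mated.and.mapped.bam

def parse_pileup_bases(bases):
    """Count (forward, reverse) mismatches in one mpileup base string."""
    fwd = rev = 0
    i = 0
    while i < len(bases):
        c = bases[i]
        if c == "^":
            # Read-start marker: the next character encodes mapping quality, skip both.
            i += 2
            continue
        if c in "+-":
            # Indel: sign, length in digits, then that many inserted/deleted bases.
            m = re.match(r"\d+", bases[i + 1:])
            if m:
                i += 1 + len(m.group()) + int(m.group())
                continue
        elif c in "ACGTN":
            fwd += 1  # mismatch on a forward-strand read
        elif c in "acgtn":
            rev += 1  # mismatch on a reverse-strand read
        # Everything else (".", ",", "$", "*", ...) is not a mismatch.
        i += 1
    return fwd, rev

def mismatches_by_position(pileup_lines):
    """Map (chrom, pos) -> depth and per-strand mismatch counts, skipping clean positions."""
    table = {}
    for line in pileup_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # not a regular pileup line
        chrom, pos, _ref, depth, bases = fields[:5]
        fwd, rev = parse_pileup_bases(bases)
        if fwd or rev:
            table[(chrom, int(pos))] = {"depth": int(depth), "fwd": fwd, "rev": rev}
    return table
```

Piping the mpileup output into a small script that calls mismatches_by_position(sys.stdin) would give one entry per position that has at least one mismatch, which can then be sorted or thresholded (for example, keeping positions where mismatches appear on only one strand).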

bam-files whole-exome-sequencing igv • 252 views