I am trying to identify reads in a BAM file that deviate from the reference, whether by a SNP or an indel, based on a list of targeted SNPs/indels I have.
Using BEDtools intersect or samtools view -L, I can identify reads that overlap these SNP/indel regions, but I haven't found a way to check whether the base in each read matches the reference or the SNP/indel in my list.
My SNPs and indels are tracks in CLC Bio and in a spreadsheet, so they can readily be converted to whatever format is needed (.bed, .vcf, .gff, etc.).
For example, I have a SNP at position 82432 in my reference genome. The reference has a 'T' there, but I want to know whether any given read shows a 'C' instead. How would I do that?
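
To make the question concrete, here is a minimal sketch of the per-read check I have in mind. It assumes the read's POS, CIGAR and SEQ fields (columns 4, 6 and 10 of a `samtools view` line) are already in hand and that 82432 is a 1-based coordinate. The function name `base_at_ref_pos` is hypothetical, and the sketch covers only the SNP case. Indels would instead need the I/D operations in the CIGAR string inspected.

```python
import re

# One CIGAR operation: a length followed by an operation letter (SAM spec).
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")

def base_at_ref_pos(pos, cigar, seq, target):
    """Return the read base aligned to 1-based reference position `target`.

    Returns None if the read does not cover `target` or if the position
    falls inside a deletion or skipped region (D/N).
    Hypothetical helper for illustration, not part of samtools or BEDtools.
    """
    ref = pos  # 1-based reference coordinate of the next aligned base
    qi = 0     # 0-based index into seq
    for length, op in CIGAR_RE.findall(cigar):
        n = int(length)
        if op in "M=X":            # consumes both reference and read
            if ref <= target < ref + n:
                return seq[qi + (target - ref)]
            ref += n
            qi += n
        elif op in "IS":           # consumes read only
            qi += n
        elif op in "DN":           # consumes reference only
            if ref <= target < ref + n:
                return None
            ref += n
        # H and P consume neither the reference nor the read
    return None

# Made-up reads for illustration; SEQ is always given on the forward strand.
print(base_at_ref_pos(82430, "5M", "ACCGT", 82432))      # read carries the 'C'
print(base_at_ref_pos(82430, "2M1D2M", "ACGT", 82432))   # position is deleted
```

Comparing the returned base against the reference base ('T') and the listed alternate ('C') would then sort each overlapping read into reference, variant, or other.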