I have both DNA-seq and bisulfite-seq (BS-seq) data for several inbred genomes of a crop plant. I aligned the DNA to the reference genome and called high quality SNPs. I then took those SNPs and used them to modify the reference, I did this recursively a few times. I ended up with customized references for each inbred genome that were position identical but had divergent sequences. This was done in the hope of increasing alignment rates of BS-seq data and reducing false methylation calls.
The challenge comes with the hybrid of my inbreds. I have two genomes for each and had to align the same BS-seq reads to both. I now need to figure out which aligned reads likely come from each genome. But I am a little stumped as how to compare and then filter two bam files. I want to identify those reads that mapping better in one of the genomes and then remove it from the other's alignment. Are there any tools out there that my googling may have missed?
Does bamUtil diff do this? I can't tell from reading it.