I am aligning nanopore reads to the C. elegans genome to measure coverage across the genome.
One region of the C. elegans genome has a very high number of matching reads (an order of magnitude more than other regions). I think this is because it's a repeat region containing many homopolymers, so reads from this region have a lot of errors and their alignment here is ambiguous. As a result, a single read aligned to this region with BLAST ends up with multiple hits, because BLAST can't determine the best alignment.
Can you suggest any strategies to work around this? My current thought is to prevent BLAST from reporting multiple hits for a single read in the same region. Is this a good strategy, and what is the best way to implement it?
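To make the idea concrete, here is a minimal sketch of the kind of filtering I have in mind. It assumes the BLAST results are in the default tabular format (-outfmt 6), where column 1 is the read ID and column 12 is the bit score. The function name and the example rows are made up for illustration, and it simply keeps the single highest-scoring hit per read, which may be too crude.

```python
import csv

def best_hit_per_read(lines):
    """Keep only the highest-bitscore hit for each read.

    Assumes default BLAST tabular columns (-outfmt 6):
    qseqid sseqid pident length mismatch gapopen
    qstart qend sstart send evalue bitscore
    """
    best = {}
    for row in csv.reader(lines, delimiter="\t"):
        # Skip blank lines and comment lines (e.g. from -outfmt 7).
        if not row or row[0].startswith("#"):
            continue
        read_id = row[0]
        bitscore = float(row[11])
        # Replace the stored hit only if this one scores strictly higher.
        if read_id not in best or bitscore > float(best[read_id][11]):
            best[read_id] = row
    return list(best.values())

# Made-up example rows: read1 has two overlapping hits in the same region.
example = "\n".join([
    "read1\tchrI\t85.0\t1000\t100\t50\t1\t1000\t5000\t6000\t0.0\t900",
    "read1\tchrI\t84.0\t950\t110\t42\t1\t950\t5200\t6150\t0.0\t850",
    "read2\tchrII\t90.0\t1200\t80\t40\t1\t1200\t100\t1300\t0.0\t1500",
])

hits = best_hit_per_read(example.splitlines())
for hit in hits:
    print(hit[0], hit[1], hit[8], hit[9], hit[11])
```

With this filter each read contributes at most one alignment to the coverage count, but I am not sure whether discarding the lower-scoring hits is the right thing to do in a repeat region.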
Thanks for your time.