I have WGS data for a variety of cultivars at decent coverage (20X). I am interested in a specific region of the genome that I suspect has undergone a rearrangement in resistant cultivars. I have alignments to a reference that does not carry the rearrangement. I would like to reconstruct that region and compare it with non-resistant cultivars. How would you do that?
I could perform a de novo assembly of the whole dataset for every cultivar and then look for the region of interest (via BLAST, I guess) to see whether any contigs match it. This seems like overkill.
I would like to select only the reads from the region of interest, but I am not sure how to, since the reference does not have this rearrangement. Have the reads describing it been thrown away during mapping? I am not sure. Could I use the conserved flanking region as a core and then extend it by capturing overlapping reads? I am not sure how to do it. Any suggestions?
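To make the "extend from the flanking region" idea concrete, here is a toy Python sketch of what I have in mind. It is only an illustration under my own assumptions: the function names (`revcomp`, `kmers`, `recruit_reads`) and the parameters `k` and `max_rounds` are made up for this post, reads are plain strings held in memory, and matching is by exact shared k-mers. It starts from the conserved flank as a seed, pulls in every read sharing a k-mer with it on either strand, adds the k-mers of those reads to the bait, and repeats until nothing new is recruited.

```python
def revcomp(seq):
    """Return the reverse complement of a DNA string."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq))


def kmers(seq, k):
    """Return the set of all k-mers in seq (empty if seq is shorter than k)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def recruit_reads(seed, reads, k=8, max_rounds=10):
    """Iteratively collect reads that share at least one exact k-mer,
    on either strand, with the seed or with already recruited reads."""
    bait = kmers(seed, k) | kmers(revcomp(seed), k)
    recruited = set()
    for _ in range(max_rounds):
        # Reads not yet recruited that overlap the current bait set.
        new = [i for i, read in enumerate(reads)
               if i not in recruited and kmers(read, k) & bait]
        if not new:
            break  # no further extension possible
        for i in new:
            recruited.add(i)
            # Newly recruited reads become bait for the next round.
            bait |= kmers(reads[i], k) | kmers(revcomp(reads[i]), k)
    # Return recruited reads in their original input order.
    return [reads[i] for i in sorted(recruited)]


# Toy example: a made-up 48 bp "locus" with the seed taken from its left flank.
locus = "ATGGCGTACCTTGAGCATCGGATTCAAGTCCGTAGGCTTAACGGATCC"
seed = locus[0:16]
reads = [
    locus[8:28],            # overlaps the seed
    "T" * 20,               # unrelated read
    locus[20:40],           # overlaps only the first read
    revcomp(locus[32:48]),  # opposite strand, overlaps the previous read
]
captured = recruit_reads(seed, reads, k=8)
print(len(captured), "of", len(reads), "reads captured")
```

The captured reads would then go to a small local assembly instead of assembling the whole genome. I realise exact k-mer matching like this would also drag in repeats and would be far too slow on real 20X WGS data, so what I am really asking is whether an existing tool does this properly.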