Let's assume the following reference sequence:
The individual has a deletion of one T in that stretch of 20 T's. It could be the left-most T, one in the middle, or the right-most. We can't know, because every one of those deletions produces the same sequence.
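To make the "we can't know" part concrete, here is a minimal Python sketch. The flanking bases are made up purely for illustration (they are not the reference from this question); only the run of 20 T's comes from the scenario above. It deletes each T in turn and checks how many distinct sequences result.

```python
# Hypothetical reference: invented flanks around a run of 20 T's.
LEFT_FLANK = "ACGGA"    # assumption, for illustration only
RIGHT_FLANK = "CAGGC"   # assumption, for illustration only
RUN_LENGTH = 20
reference = LEFT_FLANK + "T" * RUN_LENGTH + RIGHT_FLANK


def delete_base(seq, index):
    """Return seq with the single base at 0-based index removed."""
    return seq[:index] + seq[index + 1:]


run_start = len(LEFT_FLANK)

# Delete the 1st T, the 2nd T, ..., the 20th T and collect the results in a set.
haplotypes = {delete_base(reference, run_start + i) for i in range(RUN_LENGTH)}

print(len(haplotypes))  # 1: all 20 deletions give the same sequence
```

So a read from this individual carries no information about which T is missing; the position is purely a choice made when the alignment is written down.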
Let's assume I have sequenced 100 reads which cover that region. The mapper has to make a choice for every single one of the 100 reads: which T is deleted? Does that mean that, with an unfortunate distribution of those choices, I might never see the deleted T?
Let's assume the mapper chooses the first T to be deleted for reads 1 to 5. For the other 95 reads it chooses some other position. That means that later, during variant calling, the caller will assume "Nothing special here. I see 95 reads which say there's a T and I see 5 reads which say there's nothing. Probably just a T like the reference".
Next position, the second T: let's assume that for reads 6 to 10, the mapper has decided that the deleted T is the second T. For the other 95 reads it chooses some other position. Again, the caller will say "Alright, I got 95 times a T and 5 times nothing. Probably just a T".
In reads 11 to 15, the 3rd T is deleted. In reads 16 to 20 the 4th T is deleted ... [...] ... and in reads 96 to 100 the 20th T is deleted. So if the mapper chooses the deleted T separately for every read and the distribution is unfortunate, I will never call that deleted T, because no single position ever has more than 5% of the reads supporting a deletion. If the mapper had instead chosen the first T to be the deleted T for all of the 100 reads, I would have seen the deletion. How is this problem avoided?
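The arithmetic of this worst case can be written out as a small Python sketch. It only models the thought experiment above (100 reads, 20 T positions, 5 reads per position); it is not a model of how any real mapper or variant caller behaves.

```python
from collections import Counter

N_READS = 100
RUN_LENGTH = 20


def deletion_support(placements):
    """Count, per T position (1-based), how many reads show a gap there."""
    counts = Counter(placements)
    return [counts.get(pos, 0) for pos in range(1, RUN_LENGTH + 1)]


# Worst case from the text: reads 1-5 -> 1st T, reads 6-10 -> 2nd T, ...,
# reads 96-100 -> 20th T.
scattered = [(read_idx // 5) + 1 for read_idx in range(N_READS)]

# Alternative from the text: every read reports the 1st T as deleted.
consistent = [1] * N_READS

scattered_support = deletion_support(scattered)
consistent_support = deletion_support(consistent)

# Highest fraction of reads supporting a deletion at any single position.
print(max(scattered_support) / N_READS)   # 0.05
print(max(consistent_support) / N_READS)  # 1.0
```

In both cases all 100 reads carry a one-base deletion somewhere in the run; the only difference is whether that evidence is piled up at one position or spread thinly over twenty.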
I know that a mapper is just an algorithm, so it will probably never end up in the situation described above (unless the programmer decided "multiple options for the position of the T deletion? hey, just pick one randomly" -> but for the reason described above that would be a mistake, right?). Or is the opposite actually true: is it 100% certain that a mapper will choose the same T to be the deleted T for ALL (or just some? 90%? 80%? what's the case?) of the 100 reads? That would mean I would never miss the deletion, and later I can left-align, right-align or do whatever with the variant.
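For reference, this is what I mean by left-aligning, as a minimal Python sketch. The reference flanks and the `left_align_deletion` helper are hypothetical and written only for illustration; I am not claiming that any particular mapper or normalization tool is implemented this way. The rule used is the usual one: a deletion can be moved one base to the left as long as the base just before it equals the last deleted base.

```python
# Hypothetical reference: invented flanks around a run of 20 T's.
LEFT_FLANK = "ACGGA"    # assumption, for illustration only
RIGHT_FLANK = "CAGGC"   # assumption, for illustration only
RUN_LENGTH = 20
reference = LEFT_FLANK + "T" * RUN_LENGTH + RIGHT_FLANK


def left_align_deletion(ref, pos, length=1):
    """Shift a deletion of ref[pos:pos + length] as far left as possible.

    Moving the deleted window one base to the left leaves the resulting
    sequence unchanged exactly when the base entering the window on the left
    equals the base leaving it on the right.
    """
    while pos > 0 and ref[pos - 1] == ref[pos + length - 1]:
        pos -= 1
    return pos


run_start = len(LEFT_FLANK)

# Start from each of the 20 possible placements and left-align it.
aligned = {left_align_deletion(reference, run_start + i) for i in range(RUN_LENGTH)}

print(aligned == {run_start})  # True: every placement ends up on the 1st T
```

If every placement is normalized like this before the evidence is counted, the scattered case and the consistent case become identical, which is why the choice of convention matters less than applying one convention to all reads.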