Suppose I have two reads, and I align them as part of a joining program or an assembly:
Then I output one combined read:
How do I combine the associated (phred scaled) quality scores in the overlapped region?
If the two bases match, and say both are q20, then the likelihood of correctness is probably higher than q20.
It's clear that if the two bases don't match, and say both are q20, then the likelihood that either one is correct is about 50% (a phred quality score of 3).
If the two bases don't match, and one is q40 and the other q3, and I choose the q40 base, then the correctness likelihood is probably still close to q40. That q3 base shouldn't pull down the q40 that much.
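To make these intuitions concrete, here is a minimal sketch of one possible model, not necessarily the "correct" one. It uses the standard phred definition Q = -10 * log10(P_error). The function names and the `combine` model are assumptions for illustration: errors in the two reads are taken to be independent, the prior over A/C/G/T is uniform, and a miscall is equally likely to be any of the three other bases.

```python
import math

def phred_to_error(q):
    """Phred score -> probability that the base call is wrong."""
    return 10 ** (-q / 10)

def error_to_phred(p):
    """Error probability -> phred score."""
    return -10 * math.log10(p)

def combine(q1, q2, match):
    """Posterior phred score for the base called in read 1, given read 2's call.

    Hypothetical model: independent errors, uniform prior over the 4 bases,
    and a miscall equally likely to be any of the 3 other bases.
    """
    e1, e2 = phred_to_error(q1), phred_to_error(q2)
    if match:
        right = (1 - e1) * (1 - e2)      # both reads called the true base
        wrong = 3 * (e1 / 3) * (e2 / 3)  # both miscalled to the same base
    else:
        right = (1 - e1) * (e2 / 3)      # read 1 correct, read 2 miscalled
        # read 2 correct, or the true base is one of the 2 remaining bases
        wrong = (e1 / 3) * (1 - e2) + 2 * (e1 / 3) * (e2 / 3)
    return error_to_phred(wrong / (right + wrong))

print(combine(20, 20, match=True))   # matching q20/q20
print(combine(20, 20, match=False))  # mismatching q20/q20
print(combine(40, 3, match=False))   # keep the q40 base over a q3 base
```

Under these assumptions the three cases come out at roughly q45, q3, and q38, which agrees with the expectations above: agreement raises the score well above q20, an even-quality disagreement is a coin flip, and a q3 base barely lowers a q40 call.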
Is there a "correct" formula for:
- combining quality scores when bases match in an overlapped region
- combining quality scores when bases don't match in an overlapped region