How does BWA determine whether a paired-end read is properly or improperly paired?
2.5 years ago
godth13teen ▴ 70

Hi, I recently stumbled upon an issue related to improperly paired reads. From my understanding, properly paired reads are marked with SAM flag 0x2, and they need to satisfy these criteria:

• The two reads of a pair need to be on the same chromosome.
• The two reads don't map 'too far' from each other.

I can understand the first criterion, but what about the second one? When are two reads 'close enough' to be considered 'properly' paired?

I have searched for an explanation of this, but I haven't found one that is specific to BWA yet. I also checked the BWA man page, and it doesn't mention the metrics either.

Thank you very much

It uses outliers from the fragment-length distribution estimated for a set of reads. BWA-MEM prints that estimate while it runs:

[M::mem_pestat] analyzing insert size distribution for orientation FR...
[M::mem_pestat] (25, 50, 75) percentile: (100, 120, 148)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (4, 244)
[M::mem_pestat] mean and std.dev: (125.80, 35.21)

So if the insert size of a pair falls outside [mean - std.dev, mean + std.dev], then it's considered an improper pair, am I right?
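Not quite, as far as I can tell. The window looks wider than one standard deviation around the mean. Below is a minimal Python sketch of how the boundaries appear to be derived, based on a reading of mem_pestat in bwamem_pair.c. The constants (2.0, 3.0, 4.0), the rounding, and the function names are assumptions, so check them against the source of the BWA version you are running. The proper-pair flag may also depend on things this sketch ignores, such as orientation.

```python
# Assumed constants, as read from bwamem_pair.c; verify against your BWA version.
OUTLIER_BOUND = 2.0   # IQR multiplier used before computing mean and std.dev
MAPPING_BOUND = 3.0   # IQR multiplier used for the proper-pair window
MAX_STDDEV = 4.0      # the window is widened to at least mean +/- 4 std.dev


def outlier_bounds(p25, p75):
    """Range of insert sizes kept when computing mean and std.dev."""
    iqr = p75 - p25
    low = int(p25 - OUTLIER_BOUND * iqr + 0.499)
    high = int(p75 + OUTLIER_BOUND * iqr + 0.499)
    return max(low, 1), high


def proper_pair_bounds(p25, p75, mean, std):
    """Range of insert sizes that would count as a proper pair."""
    iqr = p75 - p25
    low = int(p25 - MAPPING_BOUND * iqr + 0.499)
    high = int(p75 + MAPPING_BOUND * iqr + 0.499)
    # Widen the window if mean +/- MAX_STDDEV * std reaches further out.
    if low > mean - MAX_STDDEV * std:
        low = int(mean - MAX_STDDEV * std + 0.499)
    if high < mean + MAX_STDDEV * std:
        high = int(mean + MAX_STDDEV * std + 0.499)
    return max(low, 1), high


# Values taken from the log lines quoted above.
print(outlier_bounds(100, 148))                      # (4, 244)
print(proper_pair_bounds(100, 148, 125.80, 35.21))   # (1, 292)
```

The first call reproduces the (4, 244) boundaries from the quoted log, which suggests the constants are at least consistent with that run. If the sketch is right, a pair in this batch would be treated as proper for insert sizes between 1 and 292. BWA should also print a "low and high boundaries for proper pairs" line right after the lines quoted above, so you can compare against your own log.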