That is not always the whole story. "Proper pair" can also mean that the reads are correctly oriented with respect to one another, i.e. that one mate maps to the forward strand and the other maps to the reverse strand. If the mates don't map in a proper pair, that may mean that both reads map to the same strand (both forward or both reverse).
I know this holds in MAQ and Stampy, but it doesn't seem to hold for BWA. After a quick peek at some of my BWA data, the reads that aren't in a proper pair have mates that map to different chromosomes, similar to what brentp suggested.
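If you want to check for whatever aligner you're using, you can parse the flags of some paired reads that do not have the 0x0002 (proper pair) flag set. If the 0x0010 (read reverse strand) and 0x0020 (mate reverse strand) bits are either both set or both unset, the two mates map to the same strand, and that would be why they aren't in a "proper pair".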
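To make that check concrete, here is a rough Python sketch that decodes the FLAG column of a SAM record. The bit values come from the SAM format; the function name `describe_pair` and the category labels are just my own illustration, not part of any aligner or library.

```python
# SAM FLAG bits, as defined in the SAM format.
PAIRED = 0x0001         # read is paired in sequencing
PROPER_PAIR = 0x0002    # aligner considers the pair "proper"
UNMAPPED = 0x0004       # this read is unmapped
MATE_UNMAPPED = 0x0008  # the mate is unmapped
REVERSE = 0x0010        # this read maps to the reverse strand
MATE_REVERSE = 0x0020   # the mate maps to the reverse strand


def describe_pair(flag):
    """Classify a SAM FLAG value (hypothetical helper for illustration)."""
    if not flag & PAIRED:
        return "unpaired"
    if flag & PROPER_PAIR:
        return "proper pair"
    if flag & (UNMAPPED | MATE_UNMAPPED):
        return "improper: read or mate unmapped"
    # Both bits set or both bits unset means both mates are on the same strand.
    read_rev = bool(flag & REVERSE)
    mate_rev = bool(flag & MATE_REVERSE)
    if read_rev == mate_rev:
        return "improper: same strand"
    # Orientation looks fine, so the cause is something else.
    return "improper: opposite strands"


if __name__ == "__main__":
    for flag in (99, 65, 113, 97, 73):
        print(flag, describe_pair(flag))
```

If a read comes back as "improper: opposite strands", the orientation is not the problem, so look at the RNEXT column (column 7) next: anything other than "=" means the mate mapped to a different reference sequence, which is what I see in my BWA data. An unusual insert size is another possible cause.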
Bottom line: if something is mapping in an improper pair, I'd be suspicious about it having mapped correctly and probably exclude it.
There are some slides here from a MAQ presentation that describe proper pairs, at least for that aligner.