Distinguishing records for reads in a pair in a SAM file
7 weeks ago
appropiate ▴ 60

In an aligned SAM file, if the two reads of a pair align to multiple positions in the reference genome, selecting that pair by name in the terminal returns multiple records. How can I tell, just by looking at the fields of each record but ignoring the flags, which records belong to the same read?

SAM sequencing • 335 views
6 weeks ago
d-cameron ★ 2.7k

The SAM file format specification states:

Multiple mapping: The correct placement of a read may be ambiguous, e.g., due to repeats. In this case,
there may be multiple read alignments for the same read. One of these alignments is considered
primary. All the other alignments have the secondary alignment flag set in the SAM records that
represent them. All the SAM records have the same QNAME and the same values for 0x40 and 0x80
flags. Typically the alignment designated primary is the best alignment, but the decision may be
arbitrary.


That is, reads from the same template (i.e. both R1 and R2) should have the same QNAME (column 1).

TL;DR: column 1 (read name) will be the same.
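
As a rough illustration of that passage, here is a minimal sketch in plain Python (no SAM library) that groups records by QNAME together with the 0x40/0x80 bits of the FLAG column. The helper name `group_by_read` and the example records are made up for illustration; real records have at least 11 columns.

```python
from collections import defaultdict

FIRST_IN_PAIR = 0x40   # decimal 64
SECOND_IN_PAIR = 0x80  # decimal 128

def group_by_read(sam_lines):
    """Group alignment records by (QNAME, 0x40/0x80 bits). Hypothetical helper."""
    groups = defaultdict(list)
    for line in sam_lines:
        if line.startswith("@"):  # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        qname, flag = fields[0], int(fields[1])
        # Keep only the two mate bits; all other flag bits are ignored here
        mate_bits = flag & (FIRST_IN_PAIR | SECOND_IN_PAIR)
        groups[(qname, mate_bits)].append(line)
    return groups

# Made-up records, truncated to the first four columns for brevity
records = [
    "read1\t99\tchr1\t100",
    "read1\t147\tchr1\t300",
    "read1\t355\tchr2\t500",
    "read1\t403\tchr2\t700",
]
groups = group_by_read(records)
```

With these made-up records, the flags 99 and 355 both carry bit 64, and 147 and 403 both carry bit 128, so each mate ends up with two alignments under the same read name.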


Thanks @d-cameron. Yes, multiple alignments for R1 and R2 of the same template have the same name, and the only way I know to tell whether an alignment belongs to R1 or R2 is through its flag, which I have to check individually at https://broadinstitute.github.io/picard/explain-flags.html. I just wondered whether the records contain anything other than the flag that would let me distinguish them faster.

6 weeks ago

If I understand the question correctly, you want to know which of the multiple alignments are from the first-in-pair and which are from the second-in-pair. For that, you do need to look at the flag column (column 2) and check whether bit 64 (0x40, first in pair) or bit 128 (0x80, second in pair) is set. Read name alone is not enough.
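
To avoid decoding flags by hand, something like the following sketch could prefix each record with a readable label. It is only an illustration: the function names `mate_label` and `annotate` are made up, and it assumes tab-separated SAM records with the flag in column 2.

```python
def mate_label(flag):
    """Return 'R1', 'R2' or '?' from the 0x40/0x80 bits of a SAM FLAG."""
    flag = int(flag)
    first = bool(flag & 0x40)   # 64: first in pair
    second = bool(flag & 0x80)  # 128: second in pair
    if first and not second:
        return "R1"
    if second and not first:
        return "R2"
    return "?"  # unpaired, or both bits set

def annotate(line):
    """Prefix one SAM record with its mate label and alignment type."""
    fields = line.rstrip("\n").split("\t")
    flag = int(fields[1])
    if flag & 0x100:    # 256: secondary alignment
        kind = "secondary"
    elif flag & 0x800:  # 2048: supplementary alignment
        kind = "supplementary"
    else:
        kind = "primary"
    return f"{mate_label(flag)} {kind}\t{line.rstrip()}"

# Made-up record, truncated to four columns for brevity
example = annotate("read1\t355\tchr2\t500")
```

Applied to each line piped from `samtools view`, this would put the R1/R2 label in front of every record, so the mates can be told apart at a glance.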


Thanks a lot @dariober, you did understand correctly (I could have explained myself better...). I just wanted to distinguish first- and second-in-pair alignments at a glance in the terminal, because the only way I know is to check each flag individually at https://broadinstitute.github.io/picard/explain-flags.html, which is rather tedious.