4.8 years ago

Hello

I have a few questions about the metrics file produced by the MarkDuplicates tool.

The file contains a column called SECONDARY_OR_SUPPLEMENTARY_RDS. What does it count?

Thanks in advance for a full explanation.

4.8 years ago
d-cameron ★ 2.8k

Section 1.2, "Terminologies and Concepts", of the SAM file format specification explains this:

Chimeric alignment

An alignment of a read that cannot be represented as a linear alignment. A chimeric alignment is represented as a set of linear alignments that do not have large overlaps. Typically, one of the linear alignments in a chimeric alignment is considered the “representative” alignment, and the others are called “supplementary” and are distinguished by the supplementary alignment flag. All the SAM records in a chimeric alignment have the same QNAME and the same values for 0x40 and 0x80 flags (see Section 1.4). The decision regarding which linear alignment is representative is arbitrary.
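To make the flag part of that definition concrete, here is a minimal sketch that decodes the relevant bits. The bit values (0x100 for secondary, 0x800 for supplementary, 0x40 and 0x80 for first and last segment) come from Section 1.4 of the SAM specification. The function names and the example flag values are made up for illustration, and the assumption that SECONDARY_OR_SUPPLEMENTARY_RDS counts records with either bit set is my reading of the metric name, so check the Picard documentation before relying on it.

```python
# Flag bits as defined in Section 1.4 of the SAM specification.
FLAG_FIRST_SEGMENT = 0x40
FLAG_LAST_SEGMENT = 0x80
FLAG_SECONDARY = 0x100
FLAG_SUPPLEMENTARY = 0x800


def classify_alignment(flag: int) -> str:
    """Label a SAM record from its FLAG field (hypothetical helper)."""
    if flag & FLAG_SUPPLEMENTARY:
        return "supplementary"
    if flag & FLAG_SECONDARY:
        return "secondary"
    return "primary"


def count_secondary_or_supplementary(flags) -> int:
    """Count records with the secondary or supplementary bit set.

    Assumption: this mirrors what the SECONDARY_OR_SUPPLEMENTARY_RDS
    column reports; it is not taken from the Picard source code.
    """
    mask = FLAG_SECONDARY | FLAG_SUPPLEMENTARY
    return sum(1 for flag in flags if flag & mask)


# Invented example flags: two primary records, one supplementary, one secondary.
example_flags = [99, 147, 2147, 355]
labels = [classify_alignment(flag) for flag in example_flags]
n_secondary_or_supplementary = count_secondary_or_supplementary(example_flags)
```

Note that the supplementary record in the example (2147) still carries the 0x40 bit, matching the rule that all records of a chimeric alignment share the same 0x40 and 0x80 values.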

4.8 years ago

With all my paint skills, here is a representation of @d-cameron's answer.

Your read is in green. The red circle marks the "supplementary" alignment, and the arrow fragment is the "representative" alignment.

7 weeks ago
cmdcolin ★ 2.7k

I wrote a blog post discussing some topics related to supplementary reads. Especially with long reads, supplementary alignments are often "split alignments": part of the read maps to one place in the genome and part of it maps to another (say, a long way away, on an entirely different chromosome, or in the opposite orientation in the case of an inversion).

My blogpost is here https://cmdcolin.github.io/posts/2022-02-06-sv-sam
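If your aligner writes the optional SA tag, the other pieces of a split alignment can be read straight from it. The SAM optional-fields specification describes SA as a semicolon-delimited list of `rname,pos,strand,CIGAR,mapQ,NM` entries. Below is a rough sketch of a parser; the `SplitSegment` type, the function name, and the example tag value are all invented for illustration and do not come from a real BAM file.

```python
from typing import List, NamedTuple


class SplitSegment(NamedTuple):
    """One entry of an SA tag (hypothetical container type)."""
    rname: str   # reference sequence name
    pos: int     # 1-based leftmost position
    strand: str  # "+" or "-"
    cigar: str
    mapq: int
    nm: int      # edit distance


def parse_sa_tag(sa_value: str) -> List[SplitSegment]:
    """Parse the value of an SA tag (the text after 'SA:Z:')."""
    segments = []
    for entry in sa_value.split(";"):
        if not entry:
            # The list ends with a trailing ';', which yields an empty item.
            continue
        rname, pos, strand, cigar, mapq, nm = entry.split(",")
        segments.append(
            SplitSegment(rname, int(pos), strand, cigar, int(mapq), int(nm))
        )
    return segments


# Invented example: one segment on chr5, another on chr12 on the opposite strand.
example_sa = "chr5,1000,+,60M40S,60,0;chr12,50000,-,60S40M,60,1;"
segments = parse_sa_tag(example_sa)
```

Segments on different chromosomes or strands, as in the example, are the kind of pattern that hints at a translocation or inversion.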

1
Entering edit mode

One concept that is important here is the "linearity" of the alignment. An alignment is "chimeric" when the entire read cannot be represented by a single linear alignment, so the rest of the read ends up in supplementary alignments. Evidently, small indels within each alignment should be OK.

But then, I think longer indels could interfere with the linearity requirement, and once that length is exceeded the aligner will not generate the supplementary alignments. I don't have a good sense of how long the indel would be for the aligner to not produce the supplementary alignments - and it probably depends on the algorithm and other settings.

1
Entering edit mode

Large insertions or deletions can cause supplementary alignments to be created, even though the alignment is still linear. However, minimap2 will try to align through quite large insertions and deletions, up to 100 kb; see https://arxiv.org/abs/2108.03515

I made a little mspaint figure for fun too :)
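One way to see which of the two outcomes you got for a given read is to look at its CIGAR string: a large event that the aligner kept inside a single linear alignment shows up as a long `I` or `D` operation, whereas a split read has supplementary records instead. Here is a small sketch, assuming a standard CIGAR string; the function name and the example CIGARs are made up.

```python
import re

# One CIGAR operation: a length followed by an operation code from the SAM spec.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def largest_indel(cigar: str) -> int:
    """Return the length of the longest I or D operation, or 0 if there is none."""
    return max(
        (int(length) for length, op in CIGAR_OP.findall(cigar) if op in "ID"),
        default=0,
    )


# Invented examples: a 12 kb deletion spanned by one alignment, and a plain match.
spanned = largest_indel("500M12000D700M3I200M")
plain = largest_indel("100M")
```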