SAM FLAG for primary and secondary alignments, and how they relate to mapping uniqueness
7.7 years ago
epigene ▴ 590

Can someone comment on the correct FLAG values for primary and secondary alignments?

Does "secondary alignment" mean the same thing as "not primary alignment"?

For SE:

Primary alignments include reads with a FLAG of 0 or 16 (forward and reverse strand, respectively).

Secondary alignments include reads with a FLAG of 256 or 272 (forward and reverse strand, respectively).

For PE:

Do primary alignments include reads with FLAGs of 99+147 or 83+163? I'm not really sure about the PE FLAG numbers.

I'm not sure about secondary alignments either.

Does being a primary alignment say anything about mapping uniqueness? I guess not, but is there some relation between being a primary alignment and being uniquely mapped?

7.7 years ago
Jeffin Rockey ★ 1.3k

What the SAM spec gives us:

  • FLAG 256: secondary alignment
  • FLAG 4: unmapped
  • Multiple mapping: one of these alignments is considered primary. All the other alignments have the secondary alignment flag set in the SAM records that represent them.
  • NH:i: number of reported alignments that contain the query in the current record
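To make the FLAG values in this thread easier to read, here is a minimal sketch that splits a FLAG into its individual bits. The bit values come from the SAM spec; the short names and the helper `decode_flag` are my own labels for illustration, not part of any tool.

```python
# Bit values follow the FLAG table in the SAM specification.
# The short names are illustrative labels, not official identifiers.
SAM_FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the names of all bits that are set in a SAM FLAG value."""
    return [name for bit, name in SAM_FLAG_BITS.items() if flag & bit]


# FLAG values mentioned in the question (SE first, then PE).
for flag in (0, 16, 256, 272, 99, 147, 83, 163):
    print(flag, decode_flag(flag))
```

Decoded this way, 99+147 and 83+163 are both properly paired primary alignments (the two possible strand orientations of a pair), and 256 is the same bit for SE and PE. A secondary alignment of such a pair simply has 256 added to each FLAG, for example 355+403 or 339+419.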

With the above in mind, if the aligned BAM file is from tophat or STAR (unmapped reads not included),

samtools view -F 256 should filter out secondary alignments, leaving only primary alignments.

On the other hand, if the BAM is from bowtie2, bwa or similar (unmapped reads included in the same BAM), we need to exclude flag 4 as well (256 + 4 = 260). Hence

samtools view -F 260 would be useful in that case.
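The -F filter drops a record if any bit of the mask is set in its FLAG. A small sketch of that logic, in case it helps to see it spelled out; the function names are hypothetical, and the optional supplementary bit (2048) is my addition, not something the commands above exclude.

```python
SECONDARY = 0x100      # 256
UNMAPPED = 0x4         # 4
SUPPLEMENTARY = 0x800  # 2048


def excluded_by(flag, mask):
    """True if a -F style filter with this mask would drop the record."""
    return (flag & mask) != 0


def is_primary_mapped(flag, drop_supplementary=False):
    """Mimic -F 260: keep records that are neither secondary nor unmapped.

    With drop_supplementary=True the mask becomes 260 + 2048 = 2308.
    """
    mask = SECONDARY | UNMAPPED
    if drop_supplementary:
        mask |= SUPPLEMENTARY
    return not excluded_by(flag, mask)


for flag in (0, 16, 256, 272, 4, 2064):
    print(flag, is_primary_mapped(flag), is_primary_mapped(flag, drop_supplementary=True))
```

Note that a supplementary record (for example FLAG 2064 = 2048 + 16) passes -F 260, so if the aligner emits chimeric/split alignments and you want one record per read, the mask would need the 2048 bit as well.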

Now, as asked: there is indeed a connection between primary and uniquely aligned, because a uniquely aligned read has exactly one primary alignment and no secondary alignments. But I doubt there is any FLAG per se that could fetch the uniquely aligned reads directly.

Instead we have to rely on mapping quality and the NH tag, but there is a problem here.

Though the specification defines a MAPQ field, it does not specify any particular value for uniquely mapped reads. In other words, the MAPQ value for uniquely mapped reads depends on the aligner used. For example, STAR assigns 255 to uniquely mapped reads. Another option is the NH tag from the SAM spec mentioned at the top. Accordingly, NH:i:1 should indicate a uniquely mapped read.

But for cases where the MAPQ value used for unique alignments is not clearly specified and the NH tag is not emitted either, a flawless indicator of unique mapping is something I have been searching for for months and have yet to find.
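The two indicators described above (NH tag first, aligner-specific MAPQ as a fallback) can be sketched roughly like this. It works on plain SAM text lines; `looks_unique` and its `unique_mapq` parameter are hypothetical names, the example records are made up, and returning None for the undecidable case is just my way of marking the gap mentioned above.

```python
def parse_sam_line(line):
    """Return (flag, mapq, tags) from one tab-separated SAM alignment line."""
    fields = line.rstrip("\n").split("\t")
    flag = int(fields[1])
    mapq = int(fields[4])
    tags = {}
    for item in fields[11:]:  # optional fields look like TAG:TYPE:VALUE
        tag, typ, value = item.split(":", 2)
        tags[tag] = int(value) if typ == "i" else value
    return flag, mapq, tags


def looks_unique(line, unique_mapq=None):
    """True/False if uniqueness can be decided, None if it cannot.

    unique_mapq is the MAPQ the aligner is assumed to use for unique
    hits (e.g. 255 for STAR by default); it must be supplied by the user.
    """
    flag, mapq, tags = parse_sam_line(line)
    if flag & 0x4 or flag & 0x100:  # unmapped or secondary
        return False
    if "NH" in tags:
        return tags["NH"] == 1
    if unique_mapq is not None:
        return mapq == unique_mapq
    return None  # no NH tag and no known MAPQ convention


# Made-up records for illustration only.
r1 = "r1\t0\tchr1\t100\t255\t4M\t*\t0\t0\tACGT\tIIII\tNH:i:1"
r2 = "r2\t0\tchr1\t200\t3\t4M\t*\t0\t0\tACGT\tIIII\tNH:i:2"
r3 = "r3\t16\tchr2\t50\t42\t4M\t*\t0\t0\tACGT\tIIII"

for rec in (r1, r2, r3):
    print(rec.split("\t")[0], looks_unique(rec))
```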

I look forward to better answers (and corrections, if any), touching on the supplementary alignment flag as well if relevant.


Jf


Thanks for a good answer. I just have a few questions. Does 256 for secondary alignment apply to both SE and PE reads? Also, do you know how to find out whether a particular aligner outputs unmapped reads in the BAM, besides counting the FLAG values in the BAM file?


I have been working with PE data only, and the 256 flag worked fine (the flag is very useful for finding the correct alignment percentages when multi-mapping is allowed). I cannot think of any reason why there should be any trouble with SE.

Regarding the second part of the question, a quick read through the manual should do. And if it is my first time with a new aligner, I would rather check with flag 4 itself, because we sometimes miss subtle but important details when rushing through the manual. What I know is below:

Tophat and STAR -> separate mapped and unmapped.

bowtie2 and bwa -> mapped and unmapped together.
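The flag 4 check mentioned above amounts to counting records with that bit set. A minimal sketch over SAM text lines (for a BAM you would first convert to text, or just use samtools view -c -f 4); `count_unmapped` is a hypothetical helper and the example records are made up.

```python
def count_unmapped(sam_lines):
    """Return (unmapped, total) over alignment records, skipping @ header lines."""
    total = 0
    unmapped = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        flag = int(line.split("\t")[1])
        total += 1
        if flag & 0x4:  # unmapped bit
            unmapped += 1
    return unmapped, total


# Made-up SAM content for illustration only.
example = [
    "@HD\tVN:1.6\tSO:coordinate",
    "r1\t0\tchr1\t100\t42\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
    "r3\t16\tchr1\t300\t42\t4M\t*\t0\t0\tACGT\tIIII",
]
print(count_unmapped(example))
```

A count of zero unmapped records suggests the aligner wrote them elsewhere (or not at all), which matches the tophat/STAR behaviour listed above.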


Jf


I can add one:

hisat2 -> mapped and unmapped together

In my experience, STAR and hisat2 can generate the NH:i: tag. Do you know if tophat and bowtie2 can generate the NH tag? From what I can find, they don't seem to.

In STAR, you can change the default value of 255 for uniquely mapped alignments to a value you specify with --outSAMmapqUnique.


Not sure about bowtie2. But tophat does give NH.


bowtie2 has AS and XS tags that report the score of the best alignment (AS) and of the second-best alignment (XS). If they are similar, there are at least two similar alignments (similar or identical alignment scores at two different locations). You can use them to extract the uniquely aligning reads, where unique means: aligns at only one location with the best score, but could align somewhere else with a worse score.

The XS tag is absent in reads that have a single alignment.
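A rough sketch of that AS/XS rule on plain SAM text lines: a read counts as unique when it is mapped and either has no XS tag or has AS strictly greater than XS. The function name is hypothetical, the records are made up, and requiring a strict AS > XS (rather than some minimum gap) is my assumption; you may want a margin.

```python
def bowtie2_best_is_unique(line):
    """True if the record is mapped and its best score (AS) beats XS, or XS is absent."""
    fields = line.rstrip("\n").split("\t")
    flag = int(fields[1])
    if flag & 0x4:  # unmapped reads are never unique
        return False
    tags = {}
    for item in fields[11:]:  # optional fields look like TAG:TYPE:VALUE
        tag, typ, value = item.split(":", 2)
        if typ == "i":
            tags[tag] = int(value)
    if "AS" not in tags:
        return False  # no alignment score to compare
    return "XS" not in tags or tags["AS"] > tags["XS"]


# Made-up records for illustration only.
single = "r1\t0\tchr1\t100\t42\t4M\t*\t0\t0\tACGT\tIIII\tAS:i:-3\tYT:Z:UU"
tie = "r2\t0\tchr1\t200\t1\t4M\t*\t0\t0\tACGT\tIIII\tAS:i:-3\tXS:i:-3\tYT:Z:UU"
better = "r3\t16\tchr2\t50\t40\t4M\t*\t0\t0\tACGT\tIIII\tAS:i:0\tXS:i:-18\tYT:Z:UU"
unmapped = "r4\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII\tYT:Z:UU"

for rec in (single, tie, better, unmapped):
    print(rec.split("\t")[0], bowtie2_best_is_unique(rec))
```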


Today I came across a blog post by Simon Andrews on MAPQ scores from different aligners, which is quite useful in the context of unique alignment. Link
