What are bitsets in Samtools flagstat output?
8 months ago
kenneditodd ▴ 50

Hello,

I am trying to use Samtools flagstat to analyze my BAM file after aligning nanopore dRNAseq reads to a reference transcriptome using minimap2. The output is shown below (lines with zero values are omitted).

729779 + 0 in total (QC-passed reads + QC-failed reads)
617632 + 0 primary
111418 + 0 secondary
729 + 0 supplementary
199740 + 0 mapped (27.37% : N/A)
87593 + 0 primary mapped (14.18% : N/A)

I am trying to figure out the difference between "617632 + 0 primary" and "87593 + 0 primary mapped (14.18% : N/A)". I do know from other QC tools that my mapping percentage is ~14%.

The samtools flagstat documentation defines the two categories as follows:

primary - neither 0x100 nor 0x800 bit set
primary mapped - 0x4, 0x100 and 0x800 bits not set

Can someone please clarify what a "bit set" is, and also the difference between primary and primary mapped? I guess I don't understand how a read can be counted as primary without being mapped.

samtools nanopore mapping • 1.1k views

Not an answer, but I use this translator almost every time I play around with these flags. It should help you understand your BAM, and I'd recommend going through the BAM/SAM records themselves and not just a summary.

https://broadinstitute.github.io/picard/explain-flags.html


Primary is only defined by the absence of those flags, so if a read is unmapped, then it can't be defined as secondary or supplementary. Then, by default, it is primary.

8 months ago
aw7 ▴ 270

All the SAM flags are stored in a bit field with each flag represented by a single bit. Bits set to 1 are on and bits set to 0 are off.
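
As a minimal sketch of what "bit set" means (plain Python, not part of samtools; the helper name is_set is made up for illustration), the FLAG value is just an integer, and a single flag is tested with a bitwise AND:

```python
# Bit values taken from the SAM flag table.
UNMAP = 0x4
REVERSE = 0x10
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def is_set(flag: int, bit: int) -> bool:
    """Return True if `bit` is switched on in the integer `flag`."""
    return (flag & bit) != 0


flag = 16  # example FLAG value in which only REVERSE (0x10) is on
print(bin(flag))                 # binary view of the bit field
print(is_set(flag, REVERSE))     # is the REVERSE bit set?
print(is_set(flag, UNMAP))       # is the UNMAP bit set?
```

A bit that is "not set" is simply one for which this test returns False.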

Using samtools flags will give you a list of all the flags.

0x1       1  PAIRED         paired-end / multiple-segment sequencing technology
0x2       2  PROPER_PAIR    each segment properly aligned according to aligner
0x4       4  UNMAP          segment unmapped
0x8       8  MUNMAP         next segment in the template unmapped
0x10     16  REVERSE        SEQ is reverse complemented
0x20     32  MREVERSE       SEQ of next segment in template is rev.complemented
0x40     64  READ1          the first segment in the template
0x80    128  READ2          the last segment in the template
0x100   256  SECONDARY      secondary alignment
0x200   512  QCFAIL         not passing quality controls or other filters
0x400  1024  DUP            PCR or optical duplicate
0x800  2048  SUPPLEMENTARY  supplementary alignment

There is no primary flag. Every read is presumed to be primary unless another flag indicates that it is not. The flags that make a read not primary are (from the documentation) 0x100 (SECONDARY) and 0x800 (SUPPLEMENTARY). Likewise, every read is presumed to be mapped unless it is marked with the UNMAP flag (0x4). So an unmapped read that is neither secondary nor supplementary is counted as "primary" but not as "primary mapped".
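
To make this concrete, here is a rough Python sketch of the categories as described in the flagstat documentation quoted in the question. The function names are hypothetical and this is not how samtools is implemented; treating "mapped" as "0x4 not set" is my reading of the flagstat manual. The counts are the ones posted in the question.

```python
UNMAP, SECONDARY, SUPPLEMENTARY = 0x4, 0x100, 0x800


def is_primary(flag: int) -> bool:
    # primary: neither 0x100 nor 0x800 bit set
    return (flag & (SECONDARY | SUPPLEMENTARY)) == 0


def is_mapped(flag: int) -> bool:
    # mapped: 0x4 bit not set (assumption based on the flagstat manual)
    return (flag & UNMAP) == 0


def is_primary_mapped(flag: int) -> bool:
    # primary mapped: 0x4, 0x100 and 0x800 bits all not set
    return is_primary(flag) and is_mapped(flag)


# FLAG 4 (unmapped only) is primary but not primary mapped.
for flag in (0, 16, 4, 256, 2048):
    print(flag, is_primary(flag), is_primary_mapped(flag))

# Counts copied from the flagstat output in the question.
total, primary, secondary, supplementary = 729779, 617632, 111418, 729
mapped, primary_mapped = 199740, 87593

unmapped_primary = primary - primary_mapped
print("primary records that are unmapped:", unmapped_primary)
print("primary mapped %:", round(100 * primary_mapped / primary, 2))
```

With those numbers, the gap between "primary" and "primary mapped" is the set of unmapped reads, and primary mapped + secondary + supplementary adds up to the "mapped" line.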

We should make the flagstat documentation clearer.

You can use samtools flags to tell you what a flag value means. For example, samtools flags 0x800 gives

0x800   2048    SUPPLEMENTARY

For multiple flags, add the values together. For example, samtools flags 0x904 (0x4 + 0x100 + 0x800) gives

0x904   2308    UNMAP,SECONDARY,SUPPLEMENTARY

You can also do the reverse: samtools flags UNMAP,SECONDARY,SUPPLEMENTARY gives

0x904   2308    UNMAP,SECONDARY,SUPPLEMENTARY
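
If samtools is not at hand, the same conversion can be sketched in a few lines of Python. This only mirrors the flag table above; the names decode and encode are made up for illustration and are not a samtools or pysam API.

```python
# (bit value, name) pairs copied from the flag table above.
FLAGS = [
    (0x1, "PAIRED"), (0x2, "PROPER_PAIR"), (0x4, "UNMAP"), (0x8, "MUNMAP"),
    (0x10, "REVERSE"), (0x20, "MREVERSE"), (0x40, "READ1"), (0x80, "READ2"),
    (0x100, "SECONDARY"), (0x200, "QCFAIL"), (0x400, "DUP"),
    (0x800, "SUPPLEMENTARY"),
]


def decode(flag: int) -> list[str]:
    """List the names of every bit that is set in `flag`."""
    return [name for bit, name in FLAGS if flag & bit]


def encode(names: list[str]) -> int:
    """Combine flag names into a single integer with bitwise OR."""
    lookup = {name: bit for bit, name in FLAGS}
    value = 0
    for name in names:
        value |= lookup[name]
    return value


names = decode(0x904)
value = encode(names)
print(hex(value), value, ",".join(names))
```

Because every flag occupies its own bit, OR-ing the values together gives the same result as adding them.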

Thank you! This was super helpful.
