Flag in SAM Format
6
1
12.8 years ago
Liyf ▴ 300

Hi, I am new to sequencing and I am confused about the FLAG field in the SAM format. I know that 0 means the read mapped to the forward strand, 16 means it mapped to the reverse strand, and 4 means it is unmapped. But what do the other flag values mean? I have not come across any of them. I am a computer science student, so I do not know much biology.

sam • 17k views
9
12.8 years ago
Yunfei Li ▴ 310

You can find the explanation in the SAM format manual. To interpret a flag, this website can be helpful: http://picard.sourceforge.net/explain-flags.html

0

This is a very useful website!

6
12.8 years ago
brentp 24k

As you're a CS student, you probably recognize that these are bitwise flags, so they can be combined: you can | (OR) together flags that are powers of 2 to convey multiple pieces of information in a single number.

This Python script:

from functools import reduce

def asbin(n):
    """Convert a number to its binary representation (zero-padded to 17 digits)."""
    return bin(n)[2:].zfill(17)

print("value\thex\tbinary")
for power in range(17):
    val = 2 ** power
    print("%-5d\t%-4x\t%s" % (val, val, asbin(val)))

# set all flags by OR-ing every power of two together
all_ones = reduce(lambda x, y: x | 2 ** y, range(17), 0)
print("\nall flags set:", asbin(all_ones))


Creates this output:

value   hex binary
1       1       00000000000000001
2       2       00000000000000010
4       4       00000000000000100
8       8       00000000000001000
16      10      00000000000010000
32      20      00000000000100000
64      40      00000000001000000
128     80      00000000010000000
256     100     00000000100000000
512     200     00000001000000000
1024    400     00000010000000000
2048    800     00000100000000000
4096    1000    00001000000000000
8192    2000    00010000000000000
16384   4000    00100000000000000
32768   8000    01000000000000000
65536   10000   10000000000000000

all flags set: 11111111111111111
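
Going the other way, a bitwise AND tests whether a particular bit is set in a flag. This is only a minimal sketch: the helper name `has_bit` and the example value 83 are made up for illustration and do not come from any SAM tool.

```python
def has_bit(flag, bit):
    """Return True if the given bit value is set in the flag."""
    return (flag & bit) == bit

flag = 83  # illustrative value: 64 | 16 | 2 | 1

print(has_bit(flag, 16))  # True: the reverse-strand bit (16) is set
print(has_bit(flag, 4))   # False: the unmapped bit (4) is not set
```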

3
12.8 years ago

Here is a table of what these flags mean.

0x1 template having multiple segments in sequencing
0x2 each segment properly aligned according to the aligner
0x4 segment unmapped
0x8 next segment in the template unmapped
0x10 SEQ being reverse complemented
0x20 SEQ of the next segment in the template being reversed
0x40 the first segment in the template
0x80 the last segment in the template
0x100 secondary alignment
0x200 not passing quality controls
0x400 PCR or optical duplicate


The numbers in the second column of a SAM file are these hexadecimal values, or sums of several of them, written in decimal.

For example, decimal 16 is 0x10 in hexadecimal, which means "SEQ being reverse complemented", as you already knew.
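
As a rough sketch of how a decimal flag could be broken down into the entries of the table above, something like the following should work. The names `FLAG_BITS` and `decode_flag` are hypothetical, chosen only for this example, and the descriptions are copied from the table rather than from any library.

```python
# Bit values and descriptions copied from the table above.
FLAG_BITS = [
    (0x1, "template having multiple segments in sequencing"),
    (0x2, "each segment properly aligned according to the aligner"),
    (0x4, "segment unmapped"),
    (0x8, "next segment in the template unmapped"),
    (0x10, "SEQ being reverse complemented"),
    (0x20, "SEQ of the next segment in the template being reversed"),
    (0x40, "the first segment in the template"),
    (0x80, "the last segment in the template"),
    (0x100, "secondary alignment"),
    (0x200, "not passing quality controls"),
    (0x400, "PCR or optical duplicate"),
]

def decode_flag(flag):
    """Return the descriptions of every bit that is set in the flag."""
    return [desc for bit, desc in FLAG_BITS if flag & bit]

for value in (0, 4, 16, 83):
    print(value, decode_flag(value))
```

With this sketch, a flag of 0 decodes to an empty list: no bit is set, so the read is not marked as unmapped and not marked as reverse complemented.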

Cheers,

0

Thank you! I have been sick these past few days, so my reply is late. Haha.

0

But what does 0 mean? I think it means the read is mapped.

2
12.8 years ago

Have a look at the SAM specification.

1
8.8 years ago
-_- ★ 1.1k

For quick and handy interpretation of a flag, try http://www.samformat.info/#/flag

0
12.8 years ago
dli ▴ 250

This link explains the SAM format in detail: http://genome.sph.umich.edu/wiki/SAM