Question: Flag In Sam Format
1
8.6 years ago by
Liyf290
Liyf290 wrote:

Hi, I am new in sequencing. I am confused about Flag in Sam format. I know 0 stands for mapping to forward strand and 16 stands for mapping to reverse strand. And 4 stands for unmapping. But what are other flag means? I really do not see any of them. I am a computer student, so I do not know much biology.

sam • 11k views
modified 4.7 years ago by -_-850 • written 8.6 years ago by Liyf290
9
8.6 years ago by
Yunfei Li310
ThermoFisher Scientific
Yunfei Li310 wrote:

You can find the explanation in sam format manual. To interpret it, there is a website can be helpful http://picard.sourceforge.net/explain-flags.html

This is a very useful website!

6
8.6 years ago by
brentp23k
Salt Lake City, UT
brentp23k wrote:

As you're a CS student, you understand they are bitwise flags? So they can be combined. So you can | (or) the flags that are powers of 2 to convey multiple pieces of information in the single number.

This python script:

``````def asbin(n):
"""converted a number to its binary rep (padded with 0's)"""
return str(bin(n))[2:].zfill(17)

print "value\thex\tbinary"
for pow in range(17):
val = 2 ** pow
print "%-5d\t%-4x\t%s" % (val, val, asbin(val))

# set all flags
all_ones = reduce(lambda x, y: x | 2**y, range(17), 1)
print "\nall flags set:", asbin(all_ones)
``````

Creates this output:

``````value   hex binary
1       1       00000000000000001
2       2       00000000000000010
4       4       00000000000000100
8       8       00000000000001000
16      10      00000000000010000
32      20      00000000000100000
64      40      00000000001000000
128     80      00000000010000000
256     100     00000000100000000
512     200     00000001000000000
1024    400     00000010000000000
2048    800     00000100000000000
4096    1000    00001000000000000
8192    2000    00010000000000000
16384   4000    00100000000000000
32768   8000    01000000000000000
65536   10000   10000000000000000

all flags set: 11111111111111111
``````
3
8.6 years ago by
Cambridge

Here are the meaning table of this flags.

``````0x1 template having multiple segments in sequencing
0x2 each segment properly aligned according to the aligner
0x4 segment unmapped
0x8 next segment in the template unmapped
0x10 SEQ being reverse complemented
0x20 SEQ of the next segment in the template being reversed
0x40 the ﬁrst segment in the template
0x80 the last segment in the template
0x100 secondary alignment
0x200 not passing quality controls
0x400 PCR or optical duplicate
``````

The numbers in second column of the SAM file hexadecimal numbers transformed to decimal scale.

For example, 16 in hexadecimal is 0x10 which it's means "SEQ being reverse complemented", as you already knew.

Cheers,

Thank you! I got sick these days so I am late. Haha.

But what it means by 0? I think it is mapped.

2
8.6 years ago by
Sean Davis26k
National Institutes of Health, Bethesda, MD
Sean Davis26k wrote:

Have a look at the SAM specification.

1
4.7 years ago by
-_-850
-_-850 wrote:

For fast and handy interpretation of the flag, try http://www.samformat.info/#/flag

0
8.6 years ago by
dli220
WUSTL
dli220 wrote:

this link http://genome.sph.umich.edu/wiki/SAM explains SAM format in detail.