Question: Samtools: Question about filtering BAM file using flag
5 weeks ago by
SDin0
Tsinghua University
SDin0 wrote:

Hi there, I am trying to filter a BAM file by its 'flag' column. I am a little confused about the meaning of, for example,

samtools view -f 4 -F 264 .....

I checked the documentation and noticed that

-f means 'what I want' and   '4'    means 'read unmapped'
-F means  'wipe off'   and  '264'   means 'mate unmapped + not primary alignment'

My question is:

If a sequence's flag is 12, will it be extracted by '-f 4'? If a sequence's flag is 8, will it be wiped off by '-F 264'?

I am confused about the mechanism of these options, which is unfortunately not clear in the documentation.

Many thanks.

sequencing alignment next-gen • 185 views
modified 5 weeks ago by Friederike3.3k • written 5 weeks ago by SDin0

See the SAM Format site for an explanation of the flags.

written 5 weeks ago by genomax64k

Also see Bitwise Flag Explained

written 5 weeks ago by ATpoint14k
5 weeks ago by
i.sudbery4.1k
Sheffield, UK
i.sudbery4.1k wrote:

The flag value is a bit field: each individual flag corresponds to one binary digit, i.e. a power of two. Thus a better way to think about the flags is their binary encoding.

The binary encodings of the values you mentioned are:

  4: 0 0 0 0 0 0 1 0 0
  8: 0 0 0 0 0 1 0 0 0
 12: 0 0 0 0 0 1 1 0 0 
264: 1 0 0 0 0 1 0 0 0

Here we can easily see that 12 is composed of 8 (mate unmapped) and 4 (read unmapped), while 264 is composed of 256 (not primary alignment) and 8 (mate unmapped).
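
The decomposition above can be reproduced with a few lines of Python. This is only an illustrative sketch: `SAM_FLAG_BITS` and `decode_flag` are hypothetical names made up for this example, and the bit names follow the FLAG table in the SAM specification.

```python
# Bit values and names as listed in the FLAG table of the SAM specification.
SAM_FLAG_BITS = {
    1: "read paired",
    2: "read mapped in proper pair",
    4: "read unmapped",
    8: "mate unmapped",
    16: "read reverse strand",
    32: "mate reverse strand",
    64: "first in pair",
    128: "second in pair",
    256: "not primary alignment",
    512: "read fails quality checks",
    1024: "read is PCR or optical duplicate",
    2048: "supplementary alignment",
}


def decode_flag(flag):
    """Return the names of all bits set in a FLAG value (hypothetical helper)."""
    return [name for bit, name in SAM_FLAG_BITS.items() if flag & bit]


# Print each value from the question in decimal, in binary, and decoded.
for value in (4, 8, 12, 264):
    print(f"{value:>3} = {value:012b} -> {decode_flag(value)}")
```

For the values discussed here, 12 decodes to "read unmapped" plus "mate unmapped", and 264 to "mate unmapped" plus "not primary alignment".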

As -f means retain only reads with all of the specified flags set, a read with flag 12 will be retained by -f 4, because 12 has its 4 bit set. As -F means retain only reads with none of the specified flags set, a read with flag 8 will be removed by -F 264, because 8 is one of the flags specified by 264.
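
Expressed as bitwise operations, the two rules look roughly like this. It is a minimal sketch of the logic as described in this answer, not samtools' actual source code; `passes_filter`, `require` and `exclude` are hypothetical names standing in for the -f and -F values.

```python
def passes_filter(flag, require=0, exclude=0):
    """Sketch of the -f / -F logic described above (not samtools code).

    require: value given to -f (all of these bits must be set)
    exclude: value given to -F (none of these bits may be set)
    """
    has_all_required = (flag & require) == require
    has_no_excluded = (flag & exclude) == 0
    return has_all_required and has_no_excluded


print(passes_filter(12, require=4))               # True: bit 4 is set in 12
print(passes_filter(8, exclude=264))              # False: bit 8 is part of 264
print(passes_filter(12, require=4, exclude=264))  # False: bit 8 is set in 12
print(passes_filter(4, require=4, exclude=264))   # True: only bit 4 is set
```

The third call shows what happens when both options are given at once: a read has to satisfy both conditions to be kept.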

written 5 weeks ago by i.sudbery4.1k

Of course, -f 4 -F 264 will exclude a read with flag 12, because 12 means that the 8 flag is set, and the 8 flag is one of the flags that -F 264 instructs samtools to exclude.

modified 5 weeks ago • written 5 weeks ago by i.sudbery4.1k

I got it. Many thanks for your kind help!

written 5 weeks ago by SDin0
5 weeks ago by
Friederike3.3k
United States
Friederike3.3k wrote:

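The flag values are always precise and unique because each one is the numerical representation of the _sum_ of the (numerical) answers to the 12 questions such as "Is the respective read mapped? Paired? ...". A "yes" to each question is worth a different power of two, so no two combinations of answers can add up to the same number.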

See i.sudbery's answer for the technical details and perhaps play around with this page to get a better feeling for what's going on.
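
The uniqueness of the sum can be checked by brute force. The snippet below is only a sketch; `split_flag` is a hypothetical helper name, and the 12 bits are the ones defined in the SAM specification (1 up to 2048).

```python
def split_flag(flag):
    """Break a FLAG value into the powers of two that sum to it (hypothetical helper)."""
    return [1 << i for i in range(12) if flag & (1 << i)]


print(split_flag(12))   # [4, 8]
print(split_flag(264))  # [8, 256]

# Every value from 0 to 4095 is the sum of its own parts ...
all_values = range(2 ** 12)
assert all(sum(split_flag(value)) == value for value in all_values)

# ... and no two values share the same set of parts.
assert len({tuple(split_flag(value)) for value in all_values}) == 2 ** 12
```

Both assertions pass silently, which is the point: each combination of the 12 yes/no answers maps to exactly one number and back again.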

modified 5 weeks ago • written 5 weeks ago by Friederike3.3k