Unmapped paired-end reads with confusing SAM flag values
8.2 years ago

I have two paired-end reads in a BAM file generated by bowtie2.

K00194:35:H7JHYBBXX:1:1101:10653:1349   77      *       0       0       *       *       0       0    ACTGAATT  JJJJJJJJ  YT:Z:UP

K00194:35:H7JHYBBXX:1:1101:10653:1349   141     *       0       0       *       *       0       0       GATTCGCC  JJJJJJJJ YT:Z:UP

I am confused about the flag values. The first flag (77) makes sense: 0x1 = 1 = multiple segments, 0x4 = 4 = unmapped, 0x8 = 8 = next segment unmapped, 0x40 = 64 = first segment in template. The second flag (141) I find confusing: 0x1 = 1 = multiple segments, 0x4 = 4 = unmapped, 0x8 = 8 = next segment unmapped, 0x80 = 128 = last segment in template. What confuses me is the combination of the last two bits (0x8 and 0x80).

How can you have the next segment unmapped when you are the _last_ segment in the template? I would think that the value should be 133 instead?

sam RNA-Seq • 3.9k views
8.2 years ago
John 13k

In the SAM format, "next" always wraps around: for the last read/segment in a template, the next segment is the first one. This holds for flags 0x8 and 0x20, as well as for RNEXT and PNEXT.

PNEXT: Position of the primary alignment of the NEXT read in the template. Set as 0 when the information is unavailable. This field equals POS at the primary line of the next read.
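To make the wrap-around concrete, here is a minimal Python sketch. The function name `next_segment_index` is hypothetical, and it assumes the segments of a template are indexed 0..n-1 in order; it is an illustration of the rule, not code from bowtie2 or any SAM library.

```python
def next_segment_index(i, n_segments):
    """Index of the 'next' segment for segment i.

    Assumption for illustration: segments are numbered 0..n_segments-1,
    and the segment after the last one is the first one (wrap-around).
    """
    return (i + 1) % n_segments


# For a read pair (2 segments): read 1's next is read 2,
# and read 2's next wraps back to read 1.
pair_next = [next_segment_index(i, 2) for i in range(2)]
print(pair_next)  # [1, 0]
```

So for the second read of a pair, flag 0x8 describes the first read.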

8.2 years ago

The flags say that the reads are paired, that both are unmapped, and that one is read 1 and the other is read 2. The "8" bit (0x8) refers to the mate, not the "next" segment.
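If it helps to check the arithmetic, the flags can be decoded with a few lines of Python. This is only a sketch: the names `SAM_FLAG_BITS` and `decode_flag` are made up for illustration, and the bit descriptions are paraphrased from the FLAG table in the SAM specification.

```python
# Bit descriptions paraphrased from the SAM specification's FLAG table.
SAM_FLAG_BITS = {
    0x1: "template has multiple segments (paired)",
    0x2: "each segment properly aligned",
    0x4: "segment unmapped",
    0x8: "next segment in the template unmapped",
    0x10: "SEQ reverse complemented",
    0x20: "SEQ of next segment reverse complemented",
    0x40: "first segment in the template",
    0x80: "last segment in the template",
    0x100: "secondary alignment",
    0x200: "not passing filters",
    0x400: "PCR or optical duplicate",
    0x800: "supplementary alignment",
}


def decode_flag(flag):
    """Return the descriptions of all bits that are set in a SAM FLAG value."""
    return [name for bit, name in SAM_FLAG_BITS.items() if flag & bit]


# 77 and 141 are the flags from the question; 133 is the value the asker expected.
for flag in (77, 141, 133):
    print(flag, decode_flag(flag))
```

Decoded this way, 133 (1 + 4 + 128) leaves 0x8 unset, which would claim that the other read of the pair is mapped. That contradicts flag 77 on the first record, where 0x4 marks that read as unmapped, so 141 is the consistent value.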

https://broadinstitute.github.io/picard/explain-flags.html


That is how Picard describes the flags; it is not strictly the wording of the SAM spec.
