Question: In SAM File Flag Information, How Can an Unmapped Read Have a Direction?
6.1 years ago, Chen wrote:

In the SAM file I get, there are some flag values that do not make sense to me:

101 = 1+4+32+64: this is the first read, unmapped with direction +; the mate read is mapped with direction -
117 = 1+4+16+32+64: this is the first read, unmapped but with direction -; the mate read is mapped with direction -
153 = 1+8+16+128: this is the second read, mapped with direction -; the mate read is unmapped with direction +
185 = 1+8+16+32+128: this is the second read, mapped with direction -; the mate read is unmapped with direction -

My question: if the read or its mate is unmapped, how can the SAM file still report a direction for it? What does the direction of an unmapped read mean, and what is the difference between an unmapped read with direction + and one with direction -?

6.1 years ago, Ashutosh Pandey (Philadelphia) wrote:

Flag 101 means that this read is 1) paired and first in its pair, and 2) unmapped, while its mate is mapped on the reverse strand. Information about direction is only available for the read that is mapped.

Flags 101 and 153 are possible, but 117 and 185 should not be possible. The individual bits exist, but that doesn't mean every combination is applicable or will be produced by aligners. For example, 185 says both that the mate is unmapped (0x8) and that the mate is on the reverse strand (0x20).
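To see which bits each of these flags actually sets, a small decoder helps. This is a minimal sketch: the bit values follow the FLAG table in the SAM specification, while the short labels and the name `decode_flag` are just shorthand chosen for this example.

```python
# Bit values follow the SAM specification's FLAG table.
# The labels are informal shorthand chosen for this example.
SAM_FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the labels of all bits set in a SAM FLAG value."""
    return [name for bit, name in SAM_FLAG_BITS.items() if flag & bit]


for flag in (101, 117, 153, 185):
    print(flag, decode_flag(flag))
```

For 185 this lists both `mate_unmapped` and `mate_reverse`, which is the combination discussed above.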

Now coming to the concept of direction:

We know that DNA is double stranded. Paired-end reads are generated so that the two reads in a pair come from opposite strands. For any given read, the aligner tries to align it to both the forward strand and the reverse-complement strand of the genome, because it doesn't know which strand the read originated from. (Note that the reference genome FASTA file only represents the forward strand.) In case 101, the first read in the pair didn't map, but the second read mapped to the reverse-complement strand of the reference. Hence the first read's flag has 0x20 set, which says that its mate is mapped on the reverse strand.
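The idea of trying both strands can be illustrated with a toy exact-match search. This is only a sketch under strong assumptions: real aligners use indexes and tolerate mismatches and gaps, and the names `reverse_complement` and `find_exact` are made up for this example.

```python
# Translation table for complementing DNA bases (N stays N).
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Complement each base, then reverse the sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_exact(read, reference):
    """Toy exact-match search on both strands.

    Returns (0-based position on the forward reference, strand) or None.
    """
    pos = reference.find(read)
    if pos != -1:
        return pos, "+"
    # The reference only stores the forward strand, so a read from the
    # reverse strand is found by searching for its reverse complement.
    pos = reference.find(reverse_complement(read))
    if pos != -1:
        return pos, "-"
    return None


reference = "ACGTTGCAAGGC"
print(find_exact("TTGC", reference))   # matches the forward strand
print(find_exact("GCCTT", reference))  # matches only as a reverse complement
print(find_exact("GGGG", reference))   # no match on either strand
```

A read that only matches as its reverse complement is the one that would get the reverse-strand bit (0x10) in its own flag and 0x20 in its mate's flag.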


Small correction: the information about direction is only meaningful for a mapped read. The strand bits can still be set for an unmapped read or mate, but they should just be ignored. Always check 0x4 (read unmapped) and 0x8 (mate unmapped) first, and ignore 0x10 or 0x20 accordingly.

Reply written 6.1 years ago by Devon Ryan
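The advice to check 0x4 and 0x8 before trusting the strand bits could be sketched as follows. The function name `strands` and the choice to return `None` for a strand that should be ignored are assumptions made for this example, not part of any library.

```python
def strands(flag):
    """Return (read_strand, mate_strand) for a SAM FLAG value.

    A strand is None when its bit should be ignored: the read is
    unmapped (0x4), the mate is unmapped (0x8), or the read is not
    paired (0x1 unset) so mate bits carry no meaning.
    """
    if flag & 0x4:
        read_strand = None
    else:
        read_strand = "-" if flag & 0x10 else "+"

    if not (flag & 0x1) or flag & 0x8:
        mate_strand = None
    else:
        mate_strand = "-" if flag & 0x20 else "+"

    return read_strand, mate_strand


for flag in (101, 117, 153, 185):
    print(flag, strands(flag))
```

With this check, 101 and 117 give the same answer, as do 153 and 185: the extra strand bit on the unmapped side is simply dropped.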

Thanks, that makes sense.

Reply written 6.1 years ago by Chen