Question: Bowtie2 local alignment SAM file SEQ output - reverse complement or not?
3.5 years ago, Roger wrote:

Hi all, 

According to the Bowtie2 manual, Bowtie2 writes the reverse complement of the original read sequence to the SAM output if the read maps to the reverse strand of the reference (negative strand, FLAG = 16).

Cited from the manual:

10. Read sequence (reverse-complemented if aligned to the reverse strand)

What happens in local alignment mode? I can't find a description of that. Is the output also reverse-complemented? For the whole read sequence, or only for the mapped part?

From my data I would say it is not reverse-complemented.

Thank you for your help/experience. 

modified 3.5 years ago by Istvan Albert • written 3.5 years ago by Roger
3.5 years ago, Devon Ryan (Freiburg, Germany) wrote:

If a read aligns to the "-" strand, its sequence will be reverse complemented in a SAM file. This is regardless of aligner or settings. Surprisingly, this is only implied by the current SAM specification, though I would expect that oversight to change in the future.

Edit: I should expand on that a bit. The entire portion of the read that's described in the alignment is reverse complemented. So if a region is soft-clipped, it is included in the sequence. If a region is hard-clipped, it will be absent (note that bowtie2 doesn't do hard clipping). If a region is aligned as part of a secondary/non-linear/chimeric alignment, then it will also be excluded from the sequence (bowtie2 doesn't produce these either, but other aligners will). You should never have to worry about only a portion of the presented sequence being reverse complemented (that would cause headaches to no end!).
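To make that concrete, here is a minimal Python sketch of how one might recover the read as it appeared in the FASTQ input from a SAM record. The function names are hypothetical, and it assumes the record has no hard clipping (hard-clipped bases are not stored in SEQ and cannot be recovered this way).

```python
# Translation table for complementing DNA bases; N stays N.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Complement every base, then reverse the string."""
    return seq.translate(_COMPLEMENT)[::-1]


def original_read(flag, seq, qual):
    """Return (sequence, qualities) in the orientation of the input read.

    flag, seq and qual are SAM columns 2, 10 and 11. If bit 0x10 (16) is
    set, SEQ was stored reverse complemented and QUAL reversed, so both
    operations are undone here. Soft-clipped bases are part of SEQ and
    are therefore flipped along with the aligned bases.
    """
    if flag & 0x10:
        return reverse_complement(seq), qual[::-1]
    return seq, qual


# Hypothetical record for a read aligned to the reverse strand.
print(original_read(16, "AACCGT", "IIIHHG"))  # ('ACGGTT', 'GHHIII')
```

Comparing the result against the FASTQ entry for the same read name is a quick way to confirm what the aligner did with your data.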


Damn!! You beat me again :-)

written 3.5 years ago by Ashutosh Pandey

I just had a coffee :)

written 3.5 years ago by Devon Ryan
3.5 years ago, Ashutosh Pandey wrote:

All mapped reads in the SAM/BAM format are represented on the forward strand of the reference genome. In other words, if a read has been mapped to the reverse strand, then the reverse complement of that read is used to represent it in the BAM file. The other per-base strings follow the same orientation: the base quality string is reversed, and the CIGAR is written in reference (left-to-right) order. The fact that the read was mapped to the reverse strand can be determined from column 2, the bitwise FLAG column (bit 0x10, i.e. 16).
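As a rough illustration of reading the strand from the FLAG column, here is a small Python sketch. The function name and the example lines are made up for illustration; only the bit values (0x4 for unmapped, 0x10 for reverse strand) come from the SAM format.

```python
def strand_from_sam_line(sam_line):
    """Return '+', '-' or None (unmapped) for one SAM alignment line.

    Column 2 (index 1 after splitting on tabs) is the bitwise FLAG.
    Bit 0x4 marks an unmapped read; bit 0x10 marks a read whose SEQ
    is stored reverse complemented, i.e. it aligned to the "-" strand.
    """
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])
    if flag & 0x4:
        return None
    return "-" if flag & 0x10 else "+"


# Hypothetical alignment lines; only the FLAG column differs.
forward = "read1\t0\tchr1\t100\t42\t6M\t*\t0\t0\tAACCGT\tIIIHHG"
reverse = "read2\t16\tchr1\t100\t42\t6M\t*\t0\t0\tAACCGT\tIIIHHG"
print(strand_from_sam_line(forward), strand_from_sam_line(reverse))  # + -
```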

Now, coming to your question about local alignment. This is interesting. Local alignments can be represented in the BAM file using soft or hard clipping. Soft-clipped bases appear in column 10 (the SEQ column); hard-clipped bases do not. My guess is that if a read aligned to the reverse strand with soft clipping, then all the bases in the SEQ string, including the soft-clipped ones, are reverse complemented.
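To see which part of SEQ was soft-clipped in a local alignment, one can split SEQ according to the CIGAR string. The following is only a sketch: the function name and example values are hypothetical, and it handles just leading and trailing S operations, ignoring hard clips because those bases are not in SEQ.

```python
import re

# One CIGAR operation: a length followed by an operation letter.
_CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that consume bases of SEQ.
_QUERY_CONSUMING = set("MIS=X")


def split_soft_clips(cigar, seq):
    """Split SEQ into (left clip, aligned part, right clip).

    All three pieces are in the orientation stored in the SAM record,
    i.e. already reverse complemented if FLAG bit 0x10 is set.
    """
    ops = [(int(n), op) for n, op in _CIGAR_OP.findall(cigar)]
    ops = [(n, op) for n, op in ops if op != "H"]  # hard clips are not in SEQ
    expected = sum(n for n, op in ops if op in _QUERY_CONSUMING)
    if expected != len(seq):
        raise ValueError("CIGAR does not match SEQ length")
    left = ops[0][0] if ops and ops[0][1] == "S" else 0
    right = ops[-1][0] if len(ops) > 1 and ops[-1][1] == "S" else 0
    end = len(seq) - right
    return seq[:left], seq[left:end], seq[end:]


# Hypothetical local alignment: 2 bases clipped on the left, 3 on the right.
print(split_soft_clips("2S5M3S", "AACCGTACGT"))  # ('AA', 'CCGTA', 'CGT')
```

Note that for a reverse-strand read the left clip in the SAM record corresponds to the 3' end of the original read, since the whole SEQ string has been flipped.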
