Bowtie2 local alignment SAM file SEQ output - reverse complement or not ?
9.2 years ago
Roger • 0

Hi all,

According to the Bowtie2 manual, Bowtie2 writes the reverse complement of the original read into the SEQ field of the SAM output if the read maps to the reverse strand of the reference (negative strand, FLAG bit 0x10 set, e.g. FLAG = 16).

Cited from the manual:

  10. Read sequence (reverse-complemented if aligned to the reverse strand)

What happens in local alignment mode? I can't find a description of that. Is the output also reverse-complemented? The whole read sequence, or only the mapped part?

From my data I would say it is not reverse-complemented.

Thank you for your help/experience.

Strandness Bowtie2 Strand-specificity SEQ SAM • 6.4k views
9.2 years ago

If a read aligns to the "-" strand, its sequence will be reverse complemented in the SAM file. This holds regardless of aligner or settings. Surprisingly, this is only implied by the current SAM specification, though I would expect that oversight to change in the future.

Edit: I should expand on that a bit. The entire portion of the read that's described in the alignment is reverse complemented. So if a region is soft-clipped, it is still present in SEQ and is reverse complemented along with the rest. If a region is hard-clipped then it will be absent (note that bowtie2 doesn't do hard clipping). If a region is aligned as part of a secondary/non-linear/chimeric alignment, then it will also be excluded from the sequence (bowtie2 doesn't produce these either, but others will). You should never have to worry about only a portion of the presented sequence being reverse complemented (that would cause headaches to no end!).
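
To make the clipping point concrete, here is a minimal sketch (not anything bowtie2 ships) that splits SEQ into its soft-clipped and aligned parts using the CIGAR string. The function names `parse_cigar` and `split_soft_clips` are made up for illustration. It relies on the SAM convention that M, I, S, = and X consume bases of SEQ while H does not.

```python
import re

# CIGAR operations that consume bases of SEQ according to the SAM format.
# 'S' (soft clip) is included; 'H' (hard clip), 'D', 'N' and 'P' are not.
SEQ_CONSUMING = set("MIS=X")


def parse_cigar(cigar):
    """Return a list of (length, op) tuples, e.g. '3S5M' -> [(3, 'S'), (5, 'M')]."""
    return [(int(n), op) for n, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar)]


def split_soft_clips(seq, cigar):
    """Split SEQ into (left soft clip, aligned part, right soft clip).

    Hypothetical helper for illustration. SEQ is taken exactly as written
    in the SAM record, i.e. already reverse complemented for '-' strand hits.
    """
    # Hard-clipped bases are not stored in SEQ, so ignore them here.
    ops = [(n, op) for n, op in parse_cigar(cigar) if op != "H"]
    expected = sum(n for n, op in ops if op in SEQ_CONSUMING)
    if expected != len(seq):
        raise ValueError(
            "CIGAR implies %d bases but SEQ has %d" % (expected, len(seq))
        )
    left = ops[0][0] if ops and ops[0][1] == "S" else 0
    right = ops[-1][0] if len(ops) > 1 and ops[-1][1] == "S" else 0
    end = len(seq) - right
    return seq[:left], seq[left:end], seq[end:]
```

For a record with CIGAR `3S5M2S`, all ten bases are in SEQ and the first three and last two are the soft-clipped ends; with `2H3S5M` only eight bases are stored, because the hard-clipped ones are dropped from the record.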

Damn!! You beat me again :-)

I just had a coffee :)

9.2 years ago

All mapped reads in the SAM/BAM format are represented on the forward strand of the reference genome. In other words, if a read has been mapped to the reverse strand, then the reverse complement of that read will be used to represent it in the BAM file. The direction of other strings, such as the CIGAR and the base quality scores, is reversed accordingly. The fact that the read was mapped to the reverse strand can be determined from column 2, the bitwise FLAG column (bit 0x10).
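
If you want to compare what is in the SAM file against your original FASTQ read, a small sketch along these lines may help. It checks FLAG bit 0x10 and, when set, reverse complements SEQ and reverses QUAL to get back the read as sequenced. The helper names `is_reverse` and `original_read` are my own, and the example record below is invented rather than real bowtie2 output.

```python
# Translation table for complementing bases; characters not listed
# (for example '*') are left unchanged by str.translate.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def is_reverse(flag):
    """True if FLAG bit 0x10 (read mapped to the reverse strand) is set."""
    return bool(flag & 0x10)


def original_read(sam_line):
    """Return (seq, qual) in the orientation of the original read.

    Hypothetical helper: takes one tab-separated SAM alignment line and
    undoes the reverse complementing applied to '-' strand alignments.
    """
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])   # column 2: bitwise FLAG
    seq = fields[9]         # column 10: SEQ
    qual = fields[10]       # column 11: QUAL
    if is_reverse(flag):
        seq = seq.translate(_COMPLEMENT)[::-1]
        if qual != "*":
            qual = qual[::-1]
    return seq, qual


# Invented example record, soft-clipped and mapped to the reverse strand.
example = "read1\t16\tchr1\t100\t42\t3S5M\t*\t0\t0\tAAACCCCC\tIIIIIHHH"
```

If the sequence returned by `original_read(example)` matches the FASTQ entry but the raw SEQ column does not, the record was reverse complemented as a whole, soft-clipped bases included.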

Now coming to your question about local alignment. This is interesting. Local alignments can be represented in the BAM file using soft or hard clipping. Soft-clipped bases will appear in column 10, the SEQ column, and hard-clipped bases won't appear. Now, my guess is that if a read was aligned to the reverse strand and soft clipping was used to represent it, then all the bases in the SEQ string should be reverse complemented.
