Positions in SAM file
3 months ago
pooryamb • 0

Dear All,

I have used bowtie2 with the --local option to locally align many sequences against one sequence. I want to find the position where the alignment starts in both the query and the target, but I am having a hard time working this out from the SAM output. Could you please help me find it?

Every line of my SAM file is as follows: "M04141:149:000000000-K3JVJ:1:1101:12608:2943 0 rightSide 319 22 140S46M * 0 0 TTATATTTTTTTTTGACAAGCCTTCCTATTATTCTTTTATATATAAATTGATTAAAACTATTATAAATAAAATAAAATAAAAAATTAATAAAAATATTAAAAAATAAAAATAAATTAATATATAAAAAATAAATTATTTATATTTTGGTTTTATAAAATGTTTTTTCTATGTCTTGTGTGCTTAAG IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII AS:i:92 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:46 YT:Z:UU"

Best wishes,

3 months ago
seidel 9.4k

You should read the SAM specification for an explanation of the fields in the SAM file. In the line above, your target (field 3, the reference sequence name) is "rightSide", and the number before it (field 2) is the bitwise FLAG describing the alignment. A FLAG of 0 tells you that your sequence mapped to the forward strand of your reference. The CIGAR string (field 6, 140S46M) tells you that the first 140 bases of your query are soft-clipped, so the alignment starts at base 141 of the query and covers 46 bases. Field 4 (POS, 319) is the 1-based reference position of the first aligned base, so after clipping your sequence begins alignment at base 319 of your reference.
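To make the arithmetic concrete, here is a minimal Python sketch that reads the start and end of the alignment in both query and target from one SAM line. The function name `alignment_span` is made up for illustration, and the example line is the one from the question with SEQ and QUAL replaced by `*` and the optional tags dropped for brevity. It assumes a tab-delimited SAM record and reports 1-based, inclusive coordinates. For a reverse-strand hit (FLAG bit 16), the query coordinates refer to the read as it is stored in the SAM record, which is reverse-complemented.

```python
import re

# One CIGAR operation: a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def alignment_span(sam_line):
    """Return 1-based inclusive (query_start, query_end, target_start, target_end).

    Returns None for unmapped reads. Illustrative helper, not part of any library.
    """
    fields = sam_line.rstrip("\n").split("\t")
    flag, pos, cigar = int(fields[1]), int(fields[3]), fields[5]
    if flag & 4 or cigar == "*":
        return None  # FLAG bit 4 set, or no CIGAR: read is unmapped

    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]

    # Bases clipped (soft or hard) before the first aligned base.
    leading_clip = 0
    for n, op in ops:
        if op in "SH":
            leading_clip += n
        else:
            break

    # M, I, = and X consume query bases; M, D, N, = and X consume reference bases.
    query_aligned = sum(n for n, op in ops if op in "MI=X")
    target_aligned = sum(n for n, op in ops if op in "MDN=X")

    query_start = leading_clip + 1
    query_end = leading_clip + query_aligned
    target_start = pos  # POS is already 1-based
    target_end = pos + target_aligned - 1
    return query_start, query_end, target_start, target_end


# The record from the question, with SEQ/QUAL abbreviated to "*".
example = "\t".join([
    "M04141:149:000000000-K3JVJ:1:1101:12608:2943",
    "0", "rightSide", "319", "22", "140S46M", "*", "0", "0", "*", "*",
])
print(alignment_span(example))  # (141, 186, 319, 364)
```

So for this read, query bases 141 to 186 align to bases 319 to 364 of "rightSide". Subtract 1 from each start if you need 0-based, half-open coordinates.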

It also helps to try some toy examples with known sequences, so you can see how the values change as you align them. You can take a few bases from "rightSide" and give them to bowtie2 on the command line using the -c parameter (e.g. -c ATTTATATTTTGGTTTTATAAAATGTTTTTTCTATGT). Change a few bases and see how the SAM output changes, take the reverse complement, etc.


Thank you!

3 months ago
pooryamb • 0

In case anyone else has the same question and is looking for an easy way to find the start and end of the alignment: I just found a Python script that converts SAM format to PSL. PSL is the output format of BLAT, and it gives query start, query end, target start, and target end in separate columns. https://github.com/ndaniel/fusioncatcher/blob/master/bin/sam2psl.py
