Hello, I am new to this and I have a couple of questions.
Firstly, is there a way to extract the position of a mutation in reference coordinates (so that position 3 of the mutation means position 3 in the reference, not position 3 in the sequence alignment)? I would like to check whether, for example, one particular mutation is only present when another mutation is also there, so I need to be able to see that both came from the same read.
Secondly, I am not sure about something in the SAM file. Here, for example:
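To make the first question concrete, here is a minimal sketch of the kind of conversion I mean, assuming the standard SAM convention that POS is the 1-based leftmost reference position of the read and that the CIGAR string describes how the read bases are laid out along the reference. The function name `read_to_ref_positions` is made up for illustration and is not part of any library.

```python
import re

# One CIGAR operation: a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def read_to_ref_positions(pos, cigar):
    """Return one entry per base in the read sequence: the 1-based
    reference position it aligns to, or None for inserted or
    soft-clipped bases (hypothetical helper, for illustration only)."""
    ref = pos
    out = []
    for length, op in CIGAR_OP.findall(cigar):
        n = int(length)
        if op in "M=X":        # consumes both read and reference
            out.extend(range(ref, ref + n))
            ref += n
        elif op in "IS":       # consumes read only
            out.extend([None] * n)
        elif op in "DN":       # consumes reference only
            ref += n
        # H and P consume neither read nor reference
    return out

positions = read_to_ref_positions(152, "70M")
print(positions[2])  # third base of the read -> reference position 154
```

With something like this, mutations could be recorded as (read name, reference position) pairs, and two mutations sharing the same read name in the first column would come from the same read or read pair.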
gi|11111111|ref|TT|_152_620_1:0:0_3:0:0_0 163 gi|11111113|ref|TL| 152 42 70M = 551 469
The 152 refers to the leftmost position of the read. The 551 refers to the leftmost position of the paired read, right? And is the 469 the length of the read? But why can I only see 70 bp (70M)? That gives me only 140 bp for the pair. That's strange. Someone else previously created a SAM file from those sequences and they had longer reads, not just 70M.
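For reference, this is a small sketch of how I am reading the columns of that line, assuming the column order from the SAM specification (QNAME, FLAG, RNAME, POS, MAPQ, CIGAR, RNEXT, PNEXT, TLEN). The mate's record is not shown above, so its length of 70 bp is an assumption on my part.

```python
# The line as pasted above; a real SAM file is tab-separated, but the
# pasted text uses spaces, so this splits on any whitespace.
line = ("gi|11111111|ref|TT|_152_620_1:0:0_3:0:0_0 163 "
        "gi|11111113|ref|TL| 152 42 70M = 551 469")

names = ["QNAME", "FLAG", "RNAME", "POS", "MAPQ",
         "CIGAR", "RNEXT", "PNEXT", "TLEN"]
rec = dict(zip(names, line.split()))

pos = int(rec["POS"])      # leftmost position of this read
pnext = int(rec["PNEXT"])  # leftmost position of the mate
tlen = int(rec["TLEN"])    # value in column 9

mate_len = 70              # ASSUMPTION: the mate is also 70M (not shown)
mate_end = pnext + mate_len - 1

# Distance from the first base of this read to the last base of the mate.
span = mate_end - pos + 1
print(span, tlen)
```

Under that assumption the span from the start of this read to the end of its mate equals the value in column 9, which would suggest that 469 describes the whole pair, including the gap between the two reads, rather than the length of a single read.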
Am I forgetting something?
Thank you very much for your help!