Question: bowtie2 - Can anybody explain the mismatches in the alignment if no mismatches are allowed in the seed?
shim10 wrote:

Hi,

I'm doing an alignment to multiple relatively short targets that are quite similar. I'm using bowtie2 for its local-alignment capabilities (and since I'm used to it).

I have read the manual, parts of the paper and its supplement, and I am still not sure I understand how the alignment procedure works.

As I understand it, seeds are extracted from the read and from its reverse complement; in my case (the default) these are 20-nt seeds taken every 10 nt (my reads are paired-end, <=150 nt after adapter trimming). The seeds are then aligned, and the seed hits are prioritized and combined. This last part I don't quite understand, because little information is given about it.
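To make the seed layout concrete, here is a minimal Python sketch of fixed-length seed extraction from a read and its reverse complement. It assumes a constant 10-nt interval, as described in the post; the function names and the example read are made up for illustration, and this is not bowtie2's own code.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T/N only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq))


def extract_seeds(read: str, seed_len: int = 20, interval: int = 10):
    """Yield (strand, offset, seed) for seeds of seed_len taken every interval nt.

    Offsets are 0-based and relative to the strand the seed was taken from.
    A trailing stretch shorter than seed_len produces no seed.
    """
    for strand, seq in (("+", read), ("-", reverse_complement(read))):
        for offset in range(0, len(seq) - seed_len + 1, interval):
            yield strand, offset, seq[offset:offset + seed_len]


read = "ACGT" * 12 + "AC"  # hypothetical 50-nt read, not real data
seeds = list(extract_seeds(read))
for strand, offset, seed in seeds:
    print(strand, offset, seed)
```

For a 50-nt read this gives four seeds per strand, at 0-based offsets 0, 10, 20 and 30.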

I would like to understand the following:

  1. From seed alignment to read alignment: which region of the read is aligned? What happens if a seed lies between seeds that contain a mismatch (MM)?

  2. Explaining the mismatches: by default no MM is allowed in a seed, and the seeds overlap. My current explanation is that at least 2 adjacent seeds fail to align (they have >0 MM) while the seeds surrounding them do align. Example (ignoring the reverse complement): the seeds are seed1 at 1-20, seed2 at 11-30, seed3 at 21-40 and seed4 at 31-50. If seed2 and seed3 do not align, positions 21-30 are not covered by any aligned seed and can contain a MM.

  3. Is there a mode in which I can see the actual seed alignments, or get a more in-depth explanation for each alignment?

  4. Is there a better aligner that I could switch to easily (one that produces SAM output and is not too slow)? At this stage local alignment is not crucial, but an easy transition from bowtie2 is.
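The arithmetic in point 2 can be checked with a few lines of Python. This sketch only enumerates which 20-nt windows (taken every 10 nt, 1-based as in the post) contain a given mismatch position; the helper names are hypothetical and it says nothing about how bowtie2 goes from seed hits to a full alignment.

```python
def seed_windows(read_len: int, seed_len: int = 20, interval: int = 10):
    """Return 1-based inclusive (start, end) windows: (1, 20), (11, 30), ..."""
    return [(start + 1, start + seed_len)
            for start in range(0, read_len - seed_len + 1, interval)]


def seeds_hit_by(mismatch_pos: int, windows):
    """Return the 1-based indices of the seeds whose window contains mismatch_pos."""
    return [index
            for index, (start, end) in enumerate(windows, start=1)
            if start <= mismatch_pos <= end]


windows = seed_windows(50)
print(windows)                    # [(1, 20), (11, 30), (21, 40), (31, 50)]
print(seeds_hit_by(25, windows))  # [2, 3]
print(seeds_hit_by(5, windows))   # [1]
```

So a single mismatch anywhere in 21-30 is enough to break both seed2 and seed3, while seed1 and seed4 can still match exactly.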

Many thanks!

modified 2.7 years ago by Biostar ♦♦ 20 • written 2.8 years ago by shim10

Answer to 1) As far as I know, the seed/read alignment process works like this (I keep it simple, because I only understand it in a simple way!):
a) You first select a portion of the read (the seed). It is typically 10-20 nt long, and its length is usually a parameter. In theory the seed can be any part of the read; I do not know whether in practice it is taken from the middle, the beginning, the end or somewhere else.
b) You try to align the seed to the reference with at most N mismatches. N is a parameter that can usually be specified (in bowtie2 it is set with -N).
c) If the seed aligns, you then EXTEND the alignment until the whole read is aligned with at most P mismatches. Usually P>=N, so you can have 0 mismatches in the seed but more than 0 in the read.
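Steps a) to c) can be sketched as a toy, ungapped seed-and-extend in Python. The function name, parameters and sequences are invented for illustration, and plain string search stands in for the index lookup. According to its paper, bowtie2 itself finds seed hits with an FM index and extends them with gapped dynamic programming, so this shows only the general idea, not its implementation.

```python
def seed_and_extend(read, reference, seed_start=0, seed_len=20,
                    max_read_mismatches=3):
    """Toy ungapped seed-and-extend.

    The seed must match the reference exactly (N = 0). Each seed hit is
    extended to the whole read, and the placement is kept if the read has
    at most max_read_mismatches (P) substitutions. No indels are handled.
    Returns a list of (ref_start, mismatches), with 0-based ref_start.
    """
    seed = read[seed_start:seed_start + seed_len]
    hits = []
    pos = reference.find(seed)
    while pos != -1:
        ref_start = pos - seed_start  # where the first read base would land
        if ref_start >= 0 and ref_start + len(read) <= len(reference):
            window = reference[ref_start:ref_start + len(read)]
            mismatches = sum(1 for a, b in zip(read, window) if a != b)
            if mismatches <= max_read_mismatches:
                hits.append((ref_start, mismatches))
        pos = reference.find(seed, pos + 1)
    return hits


# Hypothetical sequences, not real data.
target = "ACGTTGCATGCCATGGAACTCGATCGTAGCTAGGCTAACG"
reference = "TTGACCA" + target + "GGATC"
read = target[:30] + "G" + target[31:]  # one substitution outside the seed

hits = seed_and_extend(read, reference)
print(hits)  # [(7, 1)]
```

The seed (read positions 1-20) matches exactly, yet the reported alignment carries one mismatch, because the substitution sits in the extended part of the read.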

Answer to 2) I am sorry, I really do not understand the question!

Answer to 3) I don't know.

Answer to 4) STAR is recommended instead of Bowtie

modified 2.7 years ago • written 2.7 years ago by Fabio Marroni2.5k

Thanks for your answer. I do not understand exactly how the alignment is EXTENDED. I assume it is done by "merging" adjacent seeds that aligned successfully (which would be more efficient than trying to extend every aligned seed). So, when the seeds overlap, I guess the only explanation for mismatches in the read (P>0; N==0 is the default in bowtie2, so a seed with #MM>0 does not align) is that some seeds fail to align. In my case of 20-nt seeds every 10 nt, that means multiple adjacent seeds fail.
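One way to test this guess on real output is to read the mismatch positions from the SAM optional fields that bowtie2 writes (XM:i, NM:i and MD:Z) and see which seed windows they fall in. The sketch below is a rough Python illustration: the SAM record is hypothetical, the helper names are made up, and it assumes no soft clipping and no insertions, so that positions along the MD string equal read positions. With --local, soft clipping is common, so the CIGAR would have to be taken into account as well.

```python
import re


def parse_tags(sam_line: str) -> dict:
    """Return the optional fields (columns 12 onwards) of a SAM line as a dict."""
    tags = {}
    for field in sam_line.rstrip("\n").split("\t")[11:]:
        name, value_type, value = field.split(":", 2)
        tags[name] = int(value) if value_type == "i" else value
    return tags


def md_mismatch_positions(md: str) -> list:
    """Return 1-based mismatch positions counted along the aligned read bases.

    Deleted reference bases (written as ^ACG in MD) do not advance the position.
    """
    positions = []
    pos = 0
    for run, deletion, mismatch in re.findall(r"(\d+)|(\^[A-Z]+)|([A-Z])", md):
        if run:
            pos += int(run)
        elif mismatch:
            pos += 1
            positions.append(pos)
    return positions


# Hypothetical record for a 50-nt read; SEQ and QUAL are left as "*".
line = "\t".join(["read1", "0", "targetA", "101", "42", "50M", "*", "0", "0",
                  "*", "*", "XM:i:1", "NM:i:1", "MD:Z:24A25"])
tags = parse_tags(line)
mismatches = md_mismatch_positions(tags["MD"])

# 20-nt seed windows every 10 nt, 1-based inclusive, for a 50-nt read.
windows = [(s + 1, s + 20) for s in range(0, 50 - 20 + 1, 10)]
broken = sorted({i for i, (a, b) in enumerate(windows, start=1)
                 for p in mismatches if a <= p <= b})
print(mismatches, broken)  # [25] [2, 3]
```

Here the single mismatch at read position 25 falls in seed2 and seed3 only, which is the situation described in the question.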

I will check STAR

modified 2.7 years ago • written 2.7 years ago by shim10

I've already looked at the bowtie2 manual, the paper and the supplementary material. I might ask the authors.

written 2.7 years ago by shim10

Indeed, I deleted my comment after realizing that you already mentioned that you read the paper and manual!

written 2.7 years ago by Fabio Marroni2.5k