bowtie -n alignment mode
3.5 years ago
valdirbarth ▴ 20

Hi all,

I am fairly new to bioinformatics and I am trying to understand why I get extremely different results when I trim my reads to 20 bp versus when I use a seed length of 20 in bowtie. (Sorry if this is a stupid question, but I need to know.)

My reads are 75 bp. If I remove adapter sequences and so on, then run bowtie with -n 0 -l 20, only about 45% of reads align to my bacterial genome. If I trim them all down to 20 bp and do the same thing, 75% of reads align. I thought that by limiting the seed to 20 bp, only the first 20 bp would be considered for the alignment, with 0 mismatches. Shouldn't that give me a result similar to the trimmed reads? Or is the whole 75 bp considered regardless of the seed length?

Thank you all for the help!

RNA-Seq bowtie • 1.3k views

The seed length is just a 'seed', or starting point, for the alignment.

3.5 years ago
glihm ▴ 620

Hello valdirbarth,

Your understanding of the -n and -l options is correct. However, meeting the requirement "0 mismatches in the 20-nucleotide seed" does not by itself mean that the read will map! Even if the first 20 nucleotides align perfectly, the rest of the read may still disagree with the reference. ;)

read = ATGCAATT GCATGGACATCGA

ref = ATGCAATT AATTAAGGCCAATT

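This read will not align even with the options -l 8 -n 0, because almost every base after the 8-nucleotide seed differs from the reference.

That is why you do not get the same results with your trimmed reads: once a read is trimmed to 20 bp, only those 20 bp have to match.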
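To make the example above concrete, here is a small Python sketch that counts mismatches in the seed and in the rest of the read separately. It is only an illustration: the helper `count_mismatches` is a made-up name, the comparison is ungapped and starts at the same position, and it does not reproduce how bowtie actually searches its index.

```python
def count_mismatches(read, ref):
    """Count differing positions in an ungapped, left-anchored comparison."""
    return sum(1 for r, g in zip(read, ref) if r != g)


# The read and reference from the example above (spaces removed).
read = "ATGCAATTGCATGGACATCGA"
ref = "ATGCAATTAATTAAGGCCAATT"
seed_len = 8  # corresponds to -l 8 in the example

seed_mm = count_mismatches(read[:seed_len], ref[:seed_len])
rest_mm = count_mismatches(read[seed_len:], ref[seed_len:])

print("mismatches in seed:", seed_mm)
print("mismatches after seed:", rest_mm)

# Trimming the read to the seed length leaves only the matching part,
# which is why trimmed reads align more often than full-length ones.
trimmed_mm = count_mismatches(read[:seed_len], ref)
print("mismatches after trimming to seed length:", trimmed_mm)
```

The seed is clean, but the remainder of the read is not, so the full-length read fails while the trimmed one passes.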

To deal with this, you can try the -v option, which lets you set the maximum number of mismatches allowed over the whole read (-l is ignored).
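As a rough sketch of the difference between the two modes, the Python below models the two validity rules as I understand them from the bowtie manual: in -n mode the seed may contain at most n mismatches and the summed quality values at all mismatched positions of the read may not exceed the -e limit (70 by default); in -v mode only the total number of mismatches counts and qualities are ignored. The function names are hypothetical, and the model skips details such as bowtie's rounding of quality values, strand handling and the index search itself.

```python
def mismatch_positions(read, ref):
    """Positions where an ungapped, left-anchored read differs from ref."""
    return [i for i, (r, g) in enumerate(zip(read, ref)) if r != g]


def valid_n_mode(read, ref, quals, n=0, seed_len=20, max_qual_sum=70):
    """Simplified -n rule: seed mismatches <= n AND quality sum <= -e."""
    mm = mismatch_positions(read, ref)
    seed_mm = [i for i in mm if i < seed_len]
    qual_sum = sum(quals[i] for i in mm)
    return len(seed_mm) <= n and qual_sum <= max_qual_sum


def valid_v_mode(read, ref, v):
    """Simplified -v rule: at most v mismatches anywhere, qualities ignored."""
    return len(mismatch_positions(read, ref)) <= v


# Toy data (invented): a 30 bp read with 3 mismatches, all after base 20.
ref = ("ACGT" * 8)[:30]
bases = list(ref)
for pos in (22, 25, 28):
    bases[pos] = "A" if ref[pos] != "A" else "C"
read = "".join(bases)
quals = [30] * len(read)  # assume Phred 30 at every position

full_n = valid_n_mode(read, ref, quals)                  # 3 * 30 = 90 > 70
trimmed_n = valid_n_mode(read[:20], ref, quals[:20])     # no mismatches left
full_v3 = valid_v_mode(read, ref, v=3)

print(full_n, trimmed_n, full_v3)
```

In this toy case the full-length read is rejected under the -n rule even though its seed is perfect, the trimmed read is accepted, and the full-length read is accepted under -v 3.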

Hope this helps!

Best, glihm
