Bowtie Results Not Matching The Expected Error Qualities
11.6 years ago
Xinwei Han • 0

I stumbled upon this problem when playing with different flags of bowtie. I am not sure whether this is a bug or not. I extracted a read from my dataset (shown below) and put that in a file (test.fastq). It is in Sanger Fastq format.

@SRR218096.75 HWUSI-EAS465:3:1:3:839 length=36
ACCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
+SRR218096.75 HWUSI-EAS465:3:1:3:839 length=36
..1>@<.;>2>@@>2;@>;@@@@@@;@@@@;@BBBA

I then mapped this read to the Arabidopsis TAIR10 genome. First I built the index with:

bowtie-build tair10.fasta TAIR10

tair10.fasta is just a concatenation of the *.fas files from ftp://ftp.arabidopsis.org/home/tair/Sequences/whole_chromosomes/. Then I mapped the read with:

bowtie -a -m 25 -n 3 -e 60 --best --strata --sam TAIR10 test.fastq test

The alignment output is:

SRR218096.75 16 Chr3 2094491 255 36M * 0 0 GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGT ABBB@;@@@@;@@@@@@;>@;2>@@>2>;.<@>1.. XA:i:2 MD:Z:7C21T5G0 NM:i:3
SRR218096.75 16 Chr3 2094494 255 36M * 0 0 GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGT ABBB@;@@@@;@@@@@@;>@;2>@@>2>;.<@>1.. XA:i:2 MD:Z:4C21T8G0 NM:i:3
SRR218096.75 16 Chr3 2094504 255 36M * 0 0 GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGT ABBB@;@@@@;@@@@@@;>@;2>@@>2>;.<@>1.. XA:i:2 MD:Z:16T11A7 NM:i:2

However, when I manually summed the sequencing qualities at the mismatched positions, I found that the second alignment (MD "4C21T8G0") has a sum exceeding 60, although I specified -e 60. The ASCII characters representing the qualities at the 3 mismatched positions are "@", "2" and ".", which correspond to quality scores of 31, 17 and 13. So the sum is 61 despite -e 60. Please let me know if I made a mistake somewhere.
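The manual check above can be reproduced with a short script. This is only a sketch: the helper name mismatch_quals is made up for illustration, it assumes Phred+33 (Sanger) qualities, and it assumes the MD tag contains only matches and mismatches (no deletions). It reads the QUAL field exactly as printed in the SAM record, so for a reverse-strand hit the string is already in reference orientation.

```python
import re

def mismatch_quals(md, qual, offset=33):
    """Return the Phred scores at the mismatched positions named by an MD tag.

    Hypothetical helper: handles only match runs and single-base mismatches.
    """
    scores = []
    pos = 0  # 0-based index into the QUAL string as printed in the SAM line
    for run, base in re.findall(r"(\d+)([A-Z]?)", md):
        pos += int(run)          # skip the run of matching bases
        if base:                 # a reference base here marks a mismatch
            scores.append(ord(qual[pos]) - offset)
            pos += 1
    return scores

# QUAL and MD copied from the second alignment in the output above.
qual = "ABBB@;@@@@;@@@@@@;>@;2>@@>2>;.<@>1.."
scores = mismatch_quals("4C21T8G0", qual)
print(scores, sum(scores))
```

For the second alignment this gives the three scores 31, 17 and 13, which sum to 61.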

I am using bowtie 0.12.8.


Just off the cuff, do you know specifically which quality scale was used when your data were generated? There are several in use; see e.g. http://en.wikipedia.org/wiki/FASTQ_format


It is Sanger FASTQ, so the quality characters "encode a Phred quality score from 0 to 93 using ASCII 33 to 126".

11.6 years ago

The manual says that it rounds to the nearest 10:

-e/--maqerr <int> Maximum permitted total of quality values at all mismatched read positions throughout the entire alignment, not just in the "seed". The default is 70. Like Maq, bowtie rounds quality values to the nearest 10 and saturates at 30; rounding can be disabled with --nomaqround.

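So the raw sum is

31 + 17 + 13 = 61

but after rounding each quality to the nearest 10 (saturating at 30), bowtie compares

30 + 20 + 10 = 60

which does not exceed -e 60.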
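A minimal sketch of that rounding, to make the arithmetic explicit. The function name maq_round is made up here, and the tie-breaking rule (a quality ending in 5 rounds up) is an assumption; the manual only says "nearest 10" and "saturates at 30".

```python
def maq_round(q):
    """Round a Phred quality to the nearest 10 and cap it at 30.

    Assumption: halves round up (e.g. 15 -> 20); the manual does not say.
    """
    return min(30, (q + 5) // 10 * 10)

quals = [31, 17, 13]  # qualities at the three mismatched positions
raw_total = sum(quals)
rounded_total = sum(maq_round(q) for q in quals)
print(raw_total, rounded_total)
```

With rounding the total is 60, so the alignment passes -e 60; with --nomaqround the raw total of 61 would be used instead.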

Thank you so much, Istvan. I was in Penn State and attended your seminar several times. Thanks again.
