What quality encoding does BWA expect in its input reads by default: Sanger, Solexa, Illumina 1.3+, Illumina 1.5+ or Illumina 1.8+ (as listed in the "Encoding" section of this Wikipedia article)? Also, is it true that BWA doesn't really use the quality values for finding matches? If so, what is the use of the "-I" parameter in bwa aln? How does BWA use the quality values?
What if I have reads generated by the new Illumina 1.8 pipeline? Should I convert the qualities before feeding them to BWA? I'm asking because the quality range in 1.8 differs significantly from both 1.3 and 1.5.
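For reference, Sanger and Illumina 1.8+ both store Phred scores with an ASCII offset of 33, while Illumina 1.3+ and 1.5+ use an offset of 64. Below is a minimal Python sketch of how one might guess the offset from a handful of quality strings and shift Phred+64 qualities to Phred+33. The function names `guess_offset` and `to_phred33` are hypothetical, and the upper bound of 'J' (Q41) for Phred+33 data is an assumption that may not hold for every pipeline. Solexa-scaled scores are not handled, since they need more than a simple shift.

```python
def guess_offset(quality_strings):
    """Guess the ASCII offset (33 or 64) from observed quality characters.

    Returns None when the observed range fits both encodings.
    """
    codes = [ord(c) for q in quality_strings for c in q]
    lowest, highest = min(codes), max(codes)
    if lowest < 59:
        # Characters below ';' (ASCII 59) cannot occur in 64-based encodings.
        return 33
    if highest > 74:
        # Assumption: Phred+33 data does not go above 'J' (Q41).
        return 64
    return None  # ambiguous: look at more reads


def to_phred33(quality, offset):
    """Shift a Phred+64 quality string to Phred+33; leave Phred+33 untouched."""
    if offset == 33:
        return quality
    if any(ord(c) < 64 for c in quality):
        # Characters below '@' suggest Solexa scores, which are on another scale.
        raise ValueError("not a plain Phred+64 quality string")
    return "".join(chr(ord(c) - 64 + 33) for c in quality)


if __name__ == "__main__":
    sample = ["hhhhhhhhB", "ggggffffB"]  # made-up Illumina 1.5-style qualities
    offset = guess_offset(sample)
    print(offset, [to_phred33(q, offset) for q in sample])
```

If the guess comes back as 33, the reads are already in the Sanger-style encoding and no shift is applied. If it comes back as 64, either shift the qualities as above or leave them as they are and pass -I to bwa aln, but not both.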
For what happens if you set -I incorrectly, see "Seeing unexpected characters (^D,^Q) in the QUAL field of a SAM file".
Hehe, nice pointer. We have an answer for everything.