This is a follow-up to: http://biostar.stackexchange.com/questions/14723/trim-the-low-quality-end-of-100bp-read
I tried running
bwa aln -I -q 20 to trim off the low-quality bases at the right end of the reads.
However, my concern is this: for some 100 bp reads, only about 30 bp may be left after trimming (meaning the 70 bp at the right end are low quality). BWA then just tries to map those 30 bp onto the reference genome.
Well, 30 bp is still enough to uniquely locate the read, but doesn't a 100 bp read with only 30 bp in good shape suggest that the whole read is unreliable? If so, I would rather throw away the whole read.
So I would like to trim off the trailing "B" qualities myself first. The next question is: what cutoff should I use for the fraction of the 100 bp read that must remain in good shape? I am leaning towards 70%, but that is just my personal feeling.
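
To make the idea concrete, here is a rough sketch of the filter I have in mind. It assumes the quality strings use the Illumina encoding in which a trailing run of "B" marks the unreliable tail of a read. The function names (trim_trailing_b, keep_read) and the min_fraction parameter are made up for illustration, and the 0.7 default is just my guess, not an established cutoff.

```python
def trim_trailing_b(seq, qual, marker="B"):
    """Strip the trailing run of `marker` quality characters and the matching bases."""
    # Number of positions left once the trailing markers are removed.
    kept = len(qual.rstrip(marker))
    return seq[:kept], qual[:kept]


def keep_read(seq, qual, min_fraction=0.7, marker="B"):
    """Return the trimmed (seq, qual), or None if too little of the read survives.

    min_fraction is a hypothetical cutoff: the fraction of the original read
    length that must remain after trimming for the read to be kept.
    """
    if not seq:
        return None
    trimmed_seq, trimmed_qual = trim_trailing_b(seq, qual, marker)
    if len(trimmed_seq) < min_fraction * len(seq):
        return None  # discard the whole read
    return trimmed_seq, trimmed_qual
```

With this, a 100 bp read whose last 70 quality characters are "B" would be dropped entirely, while one with only 20 trailing "B" would be kept as an 80 bp read.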