Question: Suggested Trimmomatic inputs for LEADING and TRAILING
dadrasarmin • 0 wrote:
Hi, I decided to use Trimmomatic for trimming raw reads. I have seen many researchers use the settings (LEADING:3 TRAILING:3) that the authors suggest on their webpage (https://www.usadellab.org/cms/?page=trimmomatic) for trimming RNA-Seq reads.
I thought these values were very low, so I tried to find out exactly what they mean. A website (http://drive5.com/usearch/manual/quality_score.html) uses a figure to explain the Phred quality score and states: "Note that a Q score of 3 means P (the error probability)=0.5, meaning that there is a 50% chance the base is wrong, and lower values represent even higher probabilities of error." Can anyone explain why such low thresholds are suggested for trimming reads? Best, Armin
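For reference, the quoted figure follows from the standard Phred definition, P = 10^(-Q/10). The snippet below is only a minimal sketch of that conversion; the function name `phred_to_error_prob` is made up for illustration and is not part of Trimmomatic or USEARCH.

```python
def phred_to_error_prob(q: float) -> float:
    """Convert a Phred quality score Q to an error probability P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


# Compare the suggested threshold (Q3) with a few stricter cutoffs.
for q in (3, 15, 20, 30):
    print(f"Q{q}: P(error) = {phred_to_error_prob(q):.4f}")
```

Under this definition Q3 corresponds to an error probability of about 0.5, Q15 to about 0.03, and Q20 to 0.01.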
First of all, do you have a specific quality problem at the beginnings or ends of your reads? If you don't, then you don't strictly have to trim them. Poor-quality bases (if they are adapter/contaminant sequence) can be soft-clipped by aligners, so they don't strictly require trimming. That said, if you are planning to do any de novo assembly work, then trimming the data more stringently, at Q15 or higher, may be warranted.
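To see what the threshold changes in practice, here is a rough Python sketch of end trimming in the spirit of LEADING/TRAILING: bases are dropped from each end for as long as their quality is below the threshold, and nothing in the middle of the read is touched. This is an illustration under that assumption, not Trimmomatic's actual code; the function name `trim_leading_trailing` and the example read are made up.

```python
def trim_leading_trailing(seq, quals, leading=3, trailing=3):
    """Drop bases from both ends while their quality is below the threshold.

    Hypothetical helper for illustration; it mimics the described behaviour
    of LEADING/TRAILING and is not taken from Trimmomatic itself.
    """
    start = 0
    # Walk in from the 5' end past every base below the LEADING threshold.
    while start < len(quals) and quals[start] < leading:
        start += 1
    end = len(quals)
    # Walk in from the 3' end past every base below the TRAILING threshold.
    while end > start and quals[end - 1] < trailing:
        end -= 1
    return seq[start:end], quals[start:end]


# Made-up read: Q2 bases at both ends, one mediocre base (Q12) in the middle.
seq = "ACGTACGTAC"
quals = [2, 2, 30, 35, 12, 36, 34, 10, 2, 2]

print(trim_leading_trailing(seq, quals, 3, 3))    # lenient: only the Q2 ends go
print(trim_leading_trailing(seq, quals, 15, 15))  # stricter: the Q10 base goes too
```

With a threshold of 3 only the very worst end bases are removed, which matches the Trimmomatic page's description of LEADING:3 as removing "low quality or N bases (below quality 3)". With a threshold of 15 the trailing Q10 base is also removed, while the internal Q12 base stays in both cases because only the ends are examined.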