Question: Suggested Trimmomatic inputs for LEADING and TRAILING
2 days ago, dadrasarmin wrote:

Hi, I decided to use Trimmomatic to trim raw reads. I have seen many researchers use the settings that the authors suggest on their webpage (LEADING:3 TRAILING:3; https://www.usadellab.org/cms/?page=trimmomatic) when trimming RNA-Seq reads.

These values seemed very low to me, so I looked up what they mean. One website (http://drive5.com/usearch/manual/quality_score.html) uses the figure below to explain the Phred quality score and states: "Note that a Q score of 3 means P (the error probability)=0.5, meaning that there is a 50% chance the base is wrong, and lower values represent even higher probabilities of error." Can anyone explain why such low thresholds are suggested for trimming reads? Best, Armin

[Figure: Quality score and the probability of error]
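
For reference, the relationship in that figure is the standard Phred definition, P = 10^(-Q/10). Below is a minimal Python sketch of the conversion; the function name `phred_to_error_probability` is only an illustrative choice and does not come from Trimmomatic or the linked page.

```python
def phred_to_error_probability(q):
    """Convert a Phred quality score Q to an error probability P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


# Print the error probability for a few commonly discussed thresholds.
for q in (2, 3, 10, 15, 20, 30):
    p = phred_to_error_probability(q)
    print(f"Q{q}: P(error) = {p:.4f}")
```

With this formula Q3 comes out at roughly 0.5, which matches the quoted statement, while Q15 is roughly 0.03 and Q20 is 0.01.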


First of all, do you have a specific quality problem at the beginning or end of your reads? If you don't, then you don't strictly have to trim them. Poor-quality bases (for example, if they are adapter/contaminant sequence) can be soft-clipped by aligners, so they don't strictly require trimming. That said, if you are planning to do any de novo assembly work, then trimming the data more stringently, at Q15 or higher, may be warranted.
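
To make the difference between a threshold of 3 and a threshold of 15 concrete, here is a rough Python sketch of what end trimming at a quality threshold does to a single read. It is an illustration only, not Trimmomatic's actual code: the function `trim_leading_trailing`, the example read, and the assumption of Phred+33 encoding are all hypothetical, and it assumes that bases strictly below the threshold are removed from each end until a base at or above the threshold is reached.

```python
def trim_leading_trailing(seq, qual, leading=3, trailing=3, offset=33):
    """Trim low-quality bases from both ends of a read.

    Assumes Phred+33 encoded qualities and that bases with a score strictly
    below the threshold are cut (an assumption about the intended behaviour).
    """
    scores = [ord(c) - offset for c in qual]

    # Walk in from the 5' end while the quality is below the LEADING threshold.
    start = 0
    while start < len(scores) and scores[start] < leading:
        start += 1

    # Walk in from the 3' end while the quality is below the TRAILING threshold.
    end = len(scores)
    while end > start and scores[end - 1] < trailing:
        end -= 1

    return seq[start:end], qual[start:end]


# Made-up read: '#' encodes Q2, '(' encodes Q7, 'I' encodes Q40 (Phred+33).
seq, qual = "ACGTACGTAC", "#(IIIIII(#"
print(trim_leading_trailing(seq, qual, leading=3, trailing=3))
print(trim_leading_trailing(seq, qual, leading=15, trailing=15))
```

On this made-up read, a threshold of 3 only removes the Q2 bases at each end, while a threshold of 15 also removes the Q7 bases. That is consistent with the Trimmomatic page describing LEADING:3 as removing leading low quality or N bases, rather than as a general quality filter.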

written 2 days ago by GenoMax (modified 2 days ago)