Question: quality trimming with trimmomatic
2
19 months ago by
blooming.daisy33370 wrote:

I am using Trimmomatic for quality trimming of FASTQ files for my project. I'm using the default values for paired-end reads, which trim the sequences at "phred 33". Can anyone help me change the phred value from the default 33 to 20, or any other value, for paired and single reads? The command line I'm using is given below:

java -jar trimmomatic-0.36.jar PE /data/memo/SRR9590_1.fastq \
                    /data/memona/SRR9590_2.fastq SRR959590_A_1P.fq SRR9590_A_1U.fq \
                    SRR9590_A_2P.fq SRR9590_A_2U.fq \
                    ILLUMINACLIP:/data/memona/Trimmomatic-0.36/adapters/TruSeq3-PE.fa:2:30:10 \
                    LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36
next-gen • 3.9k views
modified 19 months ago by RamRS25k • written 19 months ago by blooming.daisy33370

Phred 33 is actually the quality encoding format of your input file. Trimmomatic cannot recommend trimming parameters for you; it is only a trimmer.
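
To make the difference between the encoding and the quality score concrete, here is a minimal sketch of how a Phred+33 quality string decodes into scores: each character's ASCII code minus 33 is the Phred score. The function name `phred33_scores` and the example quality string are made up for illustration; only standard POSIX tools are used.

```shell
# Hypothetical helper: print the Phred score of each character
# in a quality string that uses the Phred+33 encoding.
phred33_scores() {
  # od prints the byte value of every character; awk subtracts the offset 33.
  printf '%s' "$1" | od -An -v -tu1 | tr -s ' \n' '\n' |
    awk 'NF { printf "%s%d", sep, $1 - 33; sep = " " } END { print "" }'
}

# 'I' is ASCII 73, so it encodes Q40; '5' is ASCII 53, so it encodes Q20.
phred33_scores 'II5+#'
```

So "phred33" in the Trimmomatic output only says which offset was used to read the quality characters; it is not a trimming threshold.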

I recommend 123Fastq, which combines FastQC and Trimmomatic in a highly interactive graphical user interface. 123Fastq can suggest trimming parameters based on your QC results. It also adds some improvements to the QC modules of FastQC, a k-mer-based approach to adapter removal during trimming, and many other features. Try it yourself: https://sourceforge.net/projects/project-123ngs/

written 3 months ago by genetician201610
2
19 months ago by
finswimmer13k
Germany
finswimmer13k wrote:

Hello daisy,

Please read the manual again carefully, especially what each individual parameter means.

With the command you provided, you are not trimming at a quality value below 33. What you do is:

Remove adapters (ILLUMINACLIP:TruSeq3-PE.fa:2:30:10)

Remove leading low quality or N bases (below quality 3) (LEADING:3)

Remove trailing low quality or N bases (below quality 3) (TRAILING:3)

Scan the read with a 4-base wide sliding window, cutting when the average quality per base drops below 15 (SLIDINGWINDOW:4:15)

Drop reads shorter than 36 bases (MINLEN:36)
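
As a rough illustration of the SLIDINGWINDOW step above, the sketch below takes a list of Phred scores and reports how many leading bases would be kept if the read is cut at the start of the first window whose mean quality falls below the threshold. This is a simplified model, not Trimmomatic's actual implementation, and the function name `sliding_window_keep` and the example scores are assumptions made for the example.

```shell
# Hypothetical helper: simplified sliding-window cut.
# $1 = window size, $2 = required mean quality, $3 = space-separated Phred scores
sliding_window_keep() {
  echo "$3" | awk -v w="$1" -v q="$2" '{
    keep = NF                        # keep the whole read unless a window fails
    for (i = 1; i + w - 1 <= NF; i++) {
      sum = 0
      for (j = i; j < i + w; j++) sum += $j
      if (sum / w < q) { keep = i - 1; break }   # cut where the failing window starts
    }
    print keep
  }'
}

# Window of 4 bases, mean quality of at least 15, as in SLIDINGWINDOW:4:15.
sliding_window_keep 4 15 "30 30 30 30 20 12 8 5 2 2"
```

In this model, raising the second number (for example from 15 to 20) makes the cut happen earlier in reads whose quality decays toward the 3' end.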

fin swimmer

modified 19 months ago by h.mon29k • written 19 months ago by finswimmer13k

Dear Fin Swimmer, when I processed the reads, the terminal output mentioned that the phred score is 33 (not the phred encoding). I'm unable to figure out from the manual how to adjust the phred value. Any help, please?

written 19 months ago by blooming.daisy33370

Please post the complete output of trimmomatic.

fin swimmer

written 19 months ago by finswimmer13k

Dear Swimmer, here is the output of trimmomatic

[memona@farooq Trimmomatic-0.36]$ java -jar trimmomatic-0.36.jar PE /data/memona/SRR959590_1.fastq /data/memona/SRR959590_2.fastq SRR959590_B_1P.fq SRR959590_B_1U.fq SRR959590_B_2P.fp SRR959590_B_2U.fq ILLUMINACLIP:/data/memona/Trimmomatic-0.36/adapters/TruSeq3-PE.fa:3:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36
TrimmomaticPE: Started with arguments:
 /data/memona/SRR959590_1.fastq /data/memona/SRR959590_2.fastq SRR959590_B_1P.fq SRR959590_B_1U.fq SRR959590_B_2P.fp SRR959590_B_2U.fq ILLUMINACLIP:/data/memona/Trimmomatic-0.36/adapters/TruSeq3-PE.fa:3:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36
Using PrefixPair: 'TACACTCTTTCCCTACACGACGCTCTTCCGATCT' and 'GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT'
ILLUMINACLIP: Using 1 prefix pairs, 0 forward/reverse sequences, 0 forward only sequences, 0 reverse only sequences
Quality encoding detected as phred33
Input Read Pairs: 20564050 Both Surviving: 12315568 (59.89%) Forward Only Surviving: 5069479 (24.65%) Reverse Only Surviving: 100165 (0.49%) Dropped: 3078838 (14.97%)
TrimmomaticPE: Completed successfully

I have just noticed that it's the phred encoding, not the phred quality. Can you please guide me on how to include the phred quality values in the command? Thanks.

written 19 months ago by blooming.daisy33370
1

Please refer to the manual.

LEADING, TRAILING and SLIDINGWINDOW are your friends.
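
For example, a sketch of what the command could look like with a quality cutoff of 20 in those three steps, for both paired-end (PE) and single-end (SE) mode. The input and output file names are placeholders, and whether 20 is a sensible cutoff for a given dataset is an assumption. The optional `-phred33` flag only declares the input encoding; it does not set a threshold. The snippet just prints the commands as a dry run.

```shell
# Assumed cutoff; 20 is the value asked about in the question.
Q=20

# Paired-end mode: two inputs, four outputs (file names are placeholders).
PE_CMD="java -jar trimmomatic-0.36.jar PE -phred33 \
in_1.fastq in_2.fastq out_1P.fq out_1U.fq out_2P.fq out_2U.fq \
ILLUMINACLIP:TruSeq3-PE.fa:2:30:10 LEADING:$Q TRAILING:$Q SLIDINGWINDOW:4:$Q MINLEN:36"

# Single-end mode: one input, one output (file names are placeholders).
SE_CMD="java -jar trimmomatic-0.36.jar SE -phred33 \
in.fastq out.fq \
ILLUMINACLIP:TruSeq3-SE.fa:2:30:10 LEADING:$Q TRAILING:$Q SLIDINGWINDOW:4:$Q MINLEN:36"

# Dry run: print the commands instead of executing them.
echo "$PE_CMD"
echo "$SE_CMD"
```

Once the paths point at real files, the printed line can be pasted into the terminal as is.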

Besides this, we could start a discussion about whether quality trimming is really necessary. My opinion is that if your overall quality is fine, then there is no reason to throw away information. And even if it's bad, I would first try error correction with a tool like clumpify or bbmerge from BBTools.

fin swimmer

modified 19 months ago by zx87548.8k • written 19 months ago by finswimmer13k

Hi finswimmer,

I agree with what you write, but please take into account how you write it, and how it may get interpreted by other users, especially those who are not native English speakers. Your post was edited to remove a part which may come across as condescending. We like to keep biostars a friendly place and are happy with your contributions!

Cheers,
Wouter

written 19 months ago by WouterDeCoster42k

Hi WouterDeCoster,

I'm sorry. It wasn't meant to be condescending in any way (I had to look up in the dictionary first what this means ;)). It was meant more as a prompt to be a little more proactive, as I had already pointed to the manual (which is good to read) and pasted the most important part.

fin swimmer

modified 19 months ago • written 19 months ago by finswimmer13k
1
19 months ago by
chen1.9k
OpenGene
chen1.9k wrote:

I suggest you try another quality profiling and trimming tool, fastp, which is easier to use and is 3x faster than Trimmomatic.

written 19 months ago by chen1.9k
1

Thanks for the recommendation, but your suggestion could use a disclaimer.

written 19 months ago by WouterDeCoster42k
3

I'd like to clarify this a little further and tell you about how I deal with representing a (free) resource on here.

The first thing you'll see is my name "Emily Ensembl" – I've made it completely obvious that I work for Ensembl, this is my disclaimer. Everybody knows that when I recommend Ensembl for a task, it's because I work for them. I can see that you've put your resource as your location, which is almost as good, as it shows up on posts and answers (like it's done here), but it won't show up if you make any comments, so I think it is better to put your resource in your name rather than your location.

Secondly, I never recommend Ensembl when someone has already started using something else. If someone says "I want help with this UCSC thing" I will never say "You should use Ensembl instead". But if they say "I want to do this and don't know what tool to use", I think it's perfectly fine to suggest Ensembl. The only exception would be if the tool they were trying to use was completely unsuitable for the task.

Lastly, if I do recommend Ensembl, I make sure I name Ensembl. In your post you just provide a link to the tool, and it's only clear that that's the tool you work for once you've clicked on the link. If you'd have said "I suggest you to try another quality profiling and trimming tool fastp from OpenGene", along with your location at OpenGene, it would be clear why you're recommending it and nobody would mind.

Nobody here minds people promoting their tools (especially if they're good tools and free) but we like people to be open that that's what they're doing.

modified 19 months ago • written 19 months ago by Emily_Ensembl20k

Thank you, Emily. Your suggestion is good.

written 19 months ago by chen1.9k