Understanding Trimmomatic Sliding Window Approach
6.5 years ago

Hello all,

I am performing a de novo transcriptome assembly and have used Trimmomatic to quality-filter my reads with the argument SLIDINGWINDOW:4:30. Can someone explain what this means?

My understanding is that the sliding window approach will cut the read when the average quality of each 4-nt window falls below a quality score of 30.

I guess I am confused by the "cut the read" part. If someone could clarify this, it would be much appreciated! Also, is the quality score of 30 a Phred score?

Thanks for the help! Nikelle

RNA-Seq trimmomatic quality control qc • 5.5k views

See if the answer here clarifies the concept. Once the condition being checked becomes true (average Q score < 30 for a window of 4 nt), the remaining nucleotides in the read are cut.
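
A rough sketch in Python may make the "cut the read" part more concrete. This is a simplified illustration of the idea, not Trimmomatic's actual code: the function name and the choice to cut exactly at the start of the first failing window are assumptions, and the real tool may keep slightly more or fewer bases at the boundary.

```python
def sliding_window_trim(quals, window=4, threshold=30):
    """Return the number of bases to keep from the 5' end of a read.

    quals is a list of per-base Phred scores. Windows are scanned from the
    5' end; at the first window whose mean quality is below `threshold`,
    the read is cut at the start of that window (simplifying assumption).
    """
    if len(quals) < window:
        # Assumption: a read shorter than one window is dropped entirely.
        return 0
    for start in range(len(quals) - window + 1):
        win = quals[start:start + window]
        if sum(win) / window < threshold:
            return start  # this window and everything 3' of it is removed
    return len(quals)  # no window failed, keep the whole read


# Hypothetical read: quality dips in the middle and recovers at the end.
quals = [38, 38, 37, 36, 35, 34, 20, 12, 10, 8, 30, 35]
keep = sliding_window_trim(quals)
print(f"keep the first {keep} of {len(quals)} bases")
```

Note that the last two bases in this made-up read have good scores again, but they are still removed because they lie downstream of the first failing window.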


Thanks Genomax, that link certainly helps. Do you know if that Q score of 30 is the same as calling it a Phred score of 30?


For more on that (for Illumina) see this document.
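
For a quick feel of what the number means: assuming the Q score is Phred-scaled, i.e. Q = -10 * log10(P) where P is the estimated probability that the base call is wrong, the conversion can be sketched in Python as below. The function names are made up for illustration.

```python
import math


def phred_to_error_prob(q):
    """Convert a Phred-scaled quality score to an error probability."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Convert an error probability back to a Phred-scaled quality score."""
    return -10 * math.log10(p)


for q in (10, 20, 30, 40):
    print(f"Q{q}: error probability {phred_to_error_prob(q):.4f}")
```

Under that definition, Q30 corresponds to an estimated 1 in 1000 chance that the base call is wrong.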

6.5 years ago
agata88 ▴ 850

If the average quality of a 4-base window is lower than 30, the program cuts the read at that window, removing those bases and everything after them towards the 3' end. You can specify the encoding used in your files by adding the -phred33 or -phred64 option. If your reads are encoded as phred+33 (Illumina 1.8+), base qualities range from 0 to 41. A cut-off of 30 is commonly used with this encoding.
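
To show what the encoding offset actually does, here is a small Python sketch that turns the quality line of a FASTQ record into numeric scores. The helper name and the example strings are invented for illustration; the only assumption is that each quality character stores the score as its ASCII code minus the offset (33 or 64).

```python
def decode_qualities(qual_string, offset=33):
    """Decode a FASTQ quality string into a list of integer Phred scores."""
    return [ord(ch) - offset for ch in qual_string]


# Hypothetical phred+33 quality string for a 4-base window.
window = decode_qualities("II5+", offset=33)
mean_q = sum(window) / len(window)
print(window, mean_q)

# The same characters read with the wrong offset give nonsense (negative) scores,
# which is why the encoding has to be known before applying a threshold like 30.
print(decode_qualities("II5+", offset=64))
```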

FastQC can also help you identify the encoding of your reads.

Hope it helps,

Best,

Agata


Thanks Agata!

Do you know how I would use FastQC to determine this? Also, do I need to specify how the files were encoded, or can Trimmomatic figure this out automatically?

Thank you, Nikelle


You can download FastQC from here to your computer: http://www.bioinformatics.babraham.ac.uk/projects/fastqc/

Or use Galaxy for that,

https://usegalaxy.org/

The input is your R1 and R2 files if you have PE reads. After processing, the Basic Statistics module will report the encoding of your reads. Besides that, you can run FastQC on the FASTQ files produced by Trimmomatic and compare the results before and after trimming to check that everything is correct.

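As far as I know, older versions of Trimmomatic do not do this automatically; since version 0.32 the encoding is detected when neither -phred33 nor -phred64 is given. Best, Agata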
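
If you want to check the encoding yourself, the usual trick is to look at the lowest quality character in a sample of reads. The Python sketch below is only a heuristic under that assumption, with a made-up function name; it is not how FastQC or Trimmomatic is implemented, and a small sample of uniformly high-quality phred+33 reads could be misclassified as phred+64.

```python
def guess_phred_offset(qual_strings):
    """Guess the quality offset (33 or 64) from a sample of quality strings.

    Heuristic: characters with ASCII codes below 59 (';') cannot occur in
    the older +64 encodings, so seeing one points to phred+33.
    """
    codes = [ord(ch) for q in qual_strings for ch in q]
    if not codes:
        raise ValueError("no quality characters to inspect")
    return 33 if min(codes) < 59 else 64


# Hypothetical quality lines taken from the 4th line of a few FASTQ records.
sample = ["IIII5555", "#AAFFJJJ"]
print(guess_phred_offset(sample))
```

Sampling a few thousand reads rather than a handful makes the guess more reliable, since low-quality bases are then almost certain to show up.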


Unless you have data that was generated 4+ years ago, it is going to be in Sanger FASTQ (phred+33) format.

6.5 years ago
biomaster ▴ 180

For QC and read filtering, the tool that has given me the best experience is AfterQC (https://github.com/OpenGene/AfterQC). It does QC and filtering automatically in a single pass and supports paired-end FASTQ.
