Question: Leftover adapter sequences after Trimmomatic trimming?
annsun89 wrote:

Hi everyone,

I'm a wet lab biologist who now has 100+ RNA-seq samples to analyze so it's been a steep learning curve. Any help would be super appreciated!

I have PE 125bp fastq files from an Illumina HiSeq, and my FastQC analysis shows Illumina Universal Adapter contamination. I used Trimmomatic to try to remove the adapters. I used the default settings I saw in the Trimmomatic manual, even though I don't really need quality trimming (all bases are high quality according to FastQC).

java -jar $EBROOTTRIMMOMATIC/trimmomatic-0.36.jar PE R1_001.fastq.gz R2_002.fastq.gz R1_paired.fastq.gz R1_unpaired.fastq.gz R2_paired.fastq.gz R2_unpaired.fastq.gz ILLUMINACLIP:$EBROOTTRIMMOMATIC/adapters/TruSeq3-PE.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:25

After trimming, I notice that while the adapter contamination is much reduced, it's not all removed. Also, I went from ~17 million reads (all 125bp long) to ~14 million reads (almost all 124bp long). That doesn't seem like it's working properly. Below I've attached a MultiQC report of before and after trimming (paired files only).

[Figure: MultiQC adapter contamination of trimmed vs. untrimmed reads]
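
A quick way to cross-check the read counts and leftover adapter described above is to count them directly in the FASTQ files. This is only a rough sketch: it assumes standard 4-line FASTQ records, and it searches for the exact string AGATCGGAAGAGC (the sequence FastQC reports as "Illumina Universal Adapter"), so adapters with mismatches or shorter fragments at the very end of a read will not be counted. The function name adapter_stats is made up for illustration.

```shell
#!/bin/sh
# Rough check: total reads and reads containing the adapter prefix.
# ASSUMPTION: 4-line FASTQ records, gzip-compressed input.
# ASSUMPTION: exact-match search only, so this undercounts partial adapters.
ADAPTER="AGATCGGAAGAGC"

# adapter_stats is a hypothetical helper name.
# Prints: <total reads><TAB><reads containing ADAPTER>
adapter_stats() {
    gzip -dc "$1" | awk -v a="$ADAPTER" '
        NR % 4 == 2 { n++; if (index($0, a)) hits++ }   # sequence lines only
        END { printf "%d\t%d\n", n, hits }'
}

# Report every file given on the command line.
for f in "$@"; do
    printf '%s\t' "$f"
    adapter_stats "$f"
done
```

Running it on the same sample before and after trimming (for example R1_001.fastq.gz and R1_paired.fastq.gz from the command above) gives the two numbers to compare.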

written 17 months ago by annsun89 • modified 17 months ago by genomax

Useful for future reference: How to add images to a Biostars post. I have done it for you this time.

I am going to suggest that you try bbduk.sh from the BBMap suite instead of Trimmomatic. A guide on how to use it can be found here.

Look at this thread for help on how to write a bash loop to process those 100+ samples efficiently: Bash Script Loop Help
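
As a rough illustration of such a loop, the sketch below prints one trimming command per sample without running anything. It reuses the Trimmomatic settings from the question unchanged; the same pattern works with any other trimmer. The file naming (<sample>_R1.fastq.gz and <sample>_R2.fastq.gz) and the function name trim_cmd are assumptions, so adjust them to match the real files.

```shell
#!/bin/sh
# Dry run: print one Trimmomatic command per sample; nothing is executed.
# ASSUMPTION: paired files are named <sample>_R1.fastq.gz / <sample>_R2.fastq.gz.

# trim_cmd is a hypothetical helper name. $1 = the R1 file of a pair.
trim_cmd() {
    r1="$1"
    sample="${r1%_R1.fastq.gz}"      # strip the suffix to get the sample name
    r2="${sample}_R2.fastq.gz"
    # Same steps and settings as the command in the question.
    # $EBROOTTRIMMOMATIC is printed literally and expands when the command is run.
    printf 'java -jar $EBROOTTRIMMOMATIC/trimmomatic-0.36.jar PE %s %s %s %s %s %s ILLUMINACLIP:$EBROOTTRIMMOMATIC/adapters/TruSeq3-PE.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:25\n' \
        "$r1" "$r2" \
        "${sample}_R1_paired.fastq.gz" "${sample}_R1_unpaired.fastq.gz" \
        "${sample}_R2_paired.fastq.gz" "${sample}_R2_unpaired.fastq.gz"
}

for r1 in *_R1.fastq.gz; do
    [ -e "$r1" ] || continue         # the glob matched nothing
    trim_cmd "$r1"
done
```

Once the printed commands look right, the output can be piped to sh to run them one after another.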

written 17 months ago by genomax • modified 17 months ago

Ah, thank you very much, both for the answer and for the extra help!

I will try bbduk.sh and see how it goes. I used Trimmomatic because it was already available on the cloud cluster I'm using (Compute Canada). However, is there a specific reason to use bbduk.sh instead of Trimmomatic other than "if one tool doesn't work, try another"?

written 17 months ago by annsun89

I am not a regular trimmomatic user so I won't comment on why it seems to be missing an obvious set of adapter sequences. Are you using the right adapter sequence file with trimmomatic based on the kind of library you have?

bbduk.sh uses a single adapters.fa file, found in the resources directory, which contains all commonly used commercial adapter sequences. While a bit of over-trimming is possible, it is easier to use a single file. The options are easier to understand and use. It is just a matter of what one gets used to. Both programs should in theory work the same.
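
For illustration, a paired-end adapter-trimming call with that single adapters.fa file might look like the sketch below, which only prints the command. The parameters follow the adapter-trimming example in the BBDuk guide: ktrim=r trims from the adapter match to the 3' end, k and mink set the k-mer lengths, hdist=1 allows one mismatch, and tpe and tbo are the paired-end options. The install location BBMAP_DIR, the output names, and the function name bbduk_cmd are assumptions, not something from this thread.

```shell
#!/bin/sh
# Dry run: print a bbduk.sh adapter-trimming command for one read pair.
# ASSUMPTION: BBMap is unpacked in $BBMAP_DIR (placeholder path).
BBMAP_DIR="${BBMAP_DIR:-$HOME/bbmap}"

# bbduk_cmd is a hypothetical helper name.
# $1 = R1 input, $2 = R2 input, $3 = output prefix (made-up naming scheme)
bbduk_cmd() {
    printf '%s/bbduk.sh in1=%s in2=%s out1=%s_R1_trimmed.fastq.gz out2=%s_R2_trimmed.fastq.gz ref=%s/resources/adapters.fa ktrim=r k=23 mink=11 hdist=1 tpe tbo\n' \
        "$BBMAP_DIR" "$1" "$2" "$3" "$3" "$BBMAP_DIR"
}

# Input names taken from the command in the question; "sample" is a placeholder.
bbduk_cmd R1_001.fastq.gz R2_002.fastq.gz sample
```

No quality trimming is included here, which matches the question's note that the bases are already high quality.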

written 17 months ago by genomax