Hi everyone,
I'm a wet lab biologist who now has 100+ RNA-seq samples to analyze so it's been a steep learning curve. Any help would be super appreciated!
I have PE 125bp fastq files from an Illumina HiSeq, and my FastQC analysis shows Illumina Universal Adapter contamination. I used trimmomatic to try to remove the adapters. I used the default settings I saw in the trimmomatic manual even though I don't really need quality trimming (all high quality bases according to FastQC).
java -jar $EBROOTTRIMMOMATIC/trimmomatic-0.36.jar PE R1_001.fastq.gz R2_002.fastq.gz R1_paired.fastq.gz R1_unpaired.fastq.gz R2_paired.fastq.gz R2_unpaired.fastq.gz ILLUMINACLIP:$EBROOTTRIMMOMATIC/adapters/TruSeq3-PE.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:25
After trimming I notice that while the adapter contamination is much better, it's not all removed. Also, I go from ~17 million reads (all 125bp long) to ~14 million reads (almost all 124bp long). That doesn't seem like it's working properly. Below I've attached a MultiQC report of before and after trimming (paired files only).
[Image: MultiQC adapter contamination of trimmed vs untrimmed reads]

Useful for future reference: How to add images to a Biostars post. I have done it for you this time.
I am going to suggest that you try bbduk.sh from BBMap suite instead of trimmomatic. Guide on how to use it can be found here. Look at this thread for help on how to write a bash loop to process those 100+ samples efficiently: Bash Script Loop Help

Ah, thank you very much, both for the answer and for the extra help!
I will try bbduk.sh and see how it goes. I used trimmomatic because it was a tool already available on the cloud cluster I'm using (compute canada). However, is there a specific reason to use bbduk.sh instead of trimmomatic other than "if one tool doesn't work, try another"?
I am not a regular trimmomatic user so I won't comment on why it seems to be missing an obvious set of adapter sequences. Are you using the right adapter sequence file with trimmomatic based on the kind of library you have? bbduk.sh uses a single adapters.fa file, found in the resources directory, which contains all commonly used commercial adapter sequences. While a bit of over-trimming may be possible, it is easier to use a single file. The options are also easier to understand and use. It is just a matter of what one gets used to. Both programs should in theory work the same.
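
For anyone landing here later, here is a rough sketch of the bbduk.sh loop suggested above. It is not a tested pipeline. The file naming (`<sample>_R1.fastq.gz` / `<sample>_R2.fastq.gz`), the `ADAPTERS` path, the `trimmed` output directory and the `DRY_RUN` switch are all assumptions made for illustration. The trimming options follow the adapter-trimming example in the BBDuk guide, so check them against your BBMap version. It does adapter trimming only, no quality trimming.

```shell
#!/usr/bin/env bash
# Sketch only. Assumed (not from the thread): the <sample>_R1/_R2 naming,
# the ADAPTERS path, the OUTDIR name and the DRY_RUN switch.
set -eu

ADAPTERS="${ADAPTERS:-resources/adapters.fa}"  # adapters.fa from BBMap's resources directory
OUTDIR="${OUTDIR:-trimmed}"
DRY_RUN="${DRY_RUN:-1}"                        # 1 = print the commands, 0 = run them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Trim every R1/R2 pair found in the directory given as $1.
trim_all() {
    local indir="$1" r1 r2 sample
    [ "$DRY_RUN" = 1 ] || mkdir -p "$OUTDIR"
    for r1 in "$indir"/*_R1.fastq.gz; do
        [ -e "$r1" ] || continue               # glob matched nothing
        r2="${r1%_R1.fastq.gz}_R2.fastq.gz"    # mate file derived from the R1 name
        sample="$(basename "$r1" _R1.fastq.gz)"
        if [ ! -e "$r2" ]; then
            echo "skipping $sample: no R2 file found" >&2
            continue
        fi
        run bbduk.sh in1="$r1" in2="$r2" \
            out1="$OUTDIR/${sample}_R1.trimmed.fastq.gz" \
            out2="$OUTDIR/${sample}_R2.trimmed.fastq.gz" \
            ref="$ADAPTERS" ktrim=r k=23 mink=11 hdist=1 tpe tbo
    done
}

trim_all "${1:-.}"
```

Saved under a name of your choice (say `trim_loop.sh`, hypothetical), running `bash trim_loop.sh /path/to/fastq` prints one bbduk.sh command per sample so you can check the pairing first. Setting `DRY_RUN=0` actually runs them. Samples without a matching R2 file are skipped with a warning rather than trimmed as single-end.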