I'm a wet lab biologist who now has 100+ RNA-seq samples to analyze, so it's been a steep learning curve. Any help would be super appreciated!
I have PE 125bp fastq files from an Illumina HiSeq, and my FastQC analysis shows Illumina Universal Adapter contamination. I used Trimmomatic to try to remove the adapters, with the default settings I saw in the Trimmomatic manual, even though I don't really need quality trimming (all bases are high quality according to FastQC).
java -jar $EBROOTTRIMMOMATIC/trimmomatic-0.36.jar PE R1_001.fastq.gz R2_002.fastq.gz R1_paired.fastq.gz R1_unpaired.fastq.gz R2_paired.fastq.gz R2_unpaired.fastq.gz ILLUMINACLIP:$EBROOTTRIMMOMATIC/adapters/TruSeq3-PE.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:25
After trimming, the adapter contamination is much better, but it's not all removed. Also, I go from ~17 million reads (all 125bp long) to ~14 million reads (almost all 124bp long). That doesn't seem like it's working properly. Below I've attached a MultiQC report from before and after trimming (paired files only).
MultiQC adapter contamination of trimmed vs. untrimmed reads
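In case it helps anyone reproduce the numbers, here is a minimal sketch of how the read counts and read lengths mentioned above could be checked directly from the files, independently of MultiQC. It assumes standard four-line FASTQ records (no wrapped sequence lines); the function names count_reads and length_histogram are hypothetical helpers, not part of Trimmomatic or FastQC.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for sanity-checking trimming results.
# Assumption: each FASTQ record is exactly 4 lines (header, sequence, '+', qualities).

# Print the number of reads in a gzipped FASTQ file.
count_reads() {
    gzip -dc "$1" | awk 'END { print NR / 4 }'
}

# Print "length<TAB>count" for every read length present, sorted by length.
length_histogram() {
    gzip -dc "$1" \
        | awk 'NR % 4 == 2 { len[length($0)]++ }
               END { for (l in len) print l "\t" len[l] }' \
        | sort -n
}

# Example usage with the file names from the command above:
#   count_reads R1_001.fastq.gz
#   count_reads R1_paired.fastq.gz
#   length_histogram R1_paired.fastq.gz
```

Comparing count_reads on the input and on the paired output gives the fraction of pairs that survived trimming, and length_histogram shows whether the reads really are almost all 124bp afterwards.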