I have some questions about the output of Trimmomatic after adapter removal. I have 80 bp paired-end reads in Illumina 1.9 encoding (Phred+33). Running FastQC for quality control, I noticed some overrepresented sequences in the data, which were identified as TruSeq adapters. I therefore used Trimmomatic to trim the adapters and to drop any resulting reads shorter than 36 bp:
java -jar trimmomatic-0.38.jar PE -phred33 seq_1.fastq.gz seq_2.fastq.gz seq_1_trimmed_paired.fastq.gz seq_1_trimmed_unpaired.fastq.gz seq_2_trimmed_paired.fastq.gz seq_2_trimmed_unpaired.fastq.gz ILLUMINACLIP:TruSeq3-PE.fa:2:30:10 MINLEN:36
As a result, ~99% of the read pairs survive with both reads, ~1% survive as forward reads only, and 0% survive as reverse reads only:
ILLUMINACLIP: Using 1 prefix pairs, 0 forward/reverse sequences, 0 forward only sequences, 0 reverse only sequences
Input Read Pairs: 71446282 Both Surviving: 70983555 (99.35%) Forward Only Surviving: 453784 (0.64%) Reverse Only Surviving: 0 (0.00%) Dropped: 8943 (0.01%)
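As a quick sanity check on these numbers, here is a minimal Python sketch. The variable names are my own, and the counts are simply copied from the log above. It confirms that the four categories add up to the number of input pairs, so every pair is accounted for, and it recomputes the percentages:

```python
# Counts copied by hand from the Trimmomatic summary line (not parsed from a file).
input_pairs = 71446282
counts = {
    "Both Surviving": 70983555,
    "Forward Only Surviving": 453784,
    "Reverse Only Surviving": 0,
    "Dropped": 8943,
}

# Each input pair should fall into exactly one of the four categories.
total = sum(counts.values())
assert total == input_pairs, f"categories sum to {total}, expected {input_pairs}"

# Recompute each category as a percentage of the input pairs, rounded to two decimals.
percentages = {name: round(100 * n / input_pairs, 2) for name, n in counts.items()}

for name, pct in percentages.items():
    print(f"{name}: {counts[name]} ({pct:.2f}%)")
```

So the percentages in the log are consistent with the raw counts. My question is only about how the pairs are distributed across the categories.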
Is 0% for "Reverse Only Surviving" the expected result? It seems that only the reverse reads are affected by the adapter trimming. However, in the FastQC report, the warnings for overrepresented adapter sequences only showed up for the forward reads.
Also, what is the difference between the TruSeq3-PE.fa adapter file and the TruSeq3-PE-2.fa file, which additionally contains reverse-complement sequences, and which of the two should be used to trim adapters from paired-end reads?
I would be very grateful for any help or explanations.