I've begun pre-processing my paired-end RNA-seq data (run on an Illumina HiSeq).
After running FastQC on my samples, I noticed that some have overrepresented sequences corresponding to adapters.
I've been trying to use Trimmomatic to remove the adapters. However, after trimming I get MORE overrepresented sequences than I do before trimming! I'm not sure what's going on.
For instance, in my unprocessed reads, I'll have a single overrepresented sequence corresponding to adapter index 1. Once trimmed and processed by Trimmomatic, I'll have 25 overrepresented sequences, all corresponding to different variants of the adapter index 1 sequence.
Here is my command line:
TrimmomaticPE -phred33 /R1_001.fastq.gz /R2_001.fastq.gz /R1_pairedout /R1_unpairedout /R2_pairedout /R2_unpairedout ILLUMINACLIP:/TruSeq3-PE.fa:2:30:10 LEADING:5 TRAILING:5 AVGQUAL:20
Any idea what I'm doing wrong? The same thing occurs even if I leave out the ILLUMINACLIP step.
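In case it helps anyone reproduce the comparison outside FastQC, here is a minimal Python sketch that counts how many reads in a FASTQ file contain a given adapter fragment. The function names are my own, and the fragment is the 13-base prefix commonly shared by Illumina TruSeq adapters, which I'm assuming is the relevant one here; swap in whatever sequence FastQC reports. It only finds exact matches and assumes plain 4-line FASTQ records, so treat it as a rough check rather than a replacement for FastQC.

```python
import gzip
from itertools import islice

# Assumption: prefix commonly shared by Illumina TruSeq adapters.
# Replace with the overrepresented sequence FastQC reports for your library.
ADAPTER_FRAGMENT = "AGATCGGAAGAGC"


def open_fastq(path):
    """Open a FASTQ file as text, handling gzip-compressed (.gz) files too."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "rt")


def count_adapter_reads(path, fragment=ADAPTER_FRAGMENT):
    """Return (reads containing fragment, total reads).

    Assumes standard 4-line FASTQ records; only exact matches are counted.
    """
    total = 0
    hits = 0
    with open_fastq(path) as handle:
        # The sequence is the 2nd line of every 4-line record.
        for seq in islice(handle, 1, None, 4):
            total += 1
            if fragment in seq:
                hits += 1
    return hits, total
```

Calling count_adapter_reads on the raw input and then on the paired output from Trimmomatic gives two (hits, total) pairs, which should show whether the adapter-containing reads are really increasing or whether only the FastQC report is changing.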