I have single-end sequencing data prepared using the Illumina Nextera library prep kit. In my FastQC plots I can see adapter contamination at the 3' end of my reads and some N base calls at the 5' end. I run Trimmomatic in single-end mode using the Nextera adapters file provided, plus the Nextera transposase sequence FastQC uses (including its reverse complement):
TrimmomaticSE: Started with arguments:
 -phred33 -threads 1 data/raw_reads/LT119/160418_D00248_0165_AC931NANXX_8_NX-P7-008_NX-P5-017.fastq.gz data/trim_reads/LT119/160418_D00248_0165_AC931NANXX_8_NX-P7-008_NX-P5-017_trim.fastq.gz ILLUMINACLIP:/home/jashmore/anaconda3/share/trimmomatic-0.36-3/adapters/Nextera.fasta:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36
Using Long Clipping Sequence: 'GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG'
Using Long Clipping Sequence: 'TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG'
Using Short Clipping Sequence: 'CTGTCTCTTATA'
Using Medium Clipping Sequence: 'AGATGTGTATAAGAGACAG'
Using Short Clipping Sequence: 'TATAAGAGACAG'
Using Short Clipping Sequence: 'TCCTCGGCCG'
Using Medium Clipping Sequence: 'GGTCGCGGCCGAGGATC'
Using Medium Clipping Sequence: 'CTGTCTCTTATACACATCT'
Using Short Clipping Sequence: 'CGGCCGAGGA'
Using Medium Clipping Sequence: 'GATCCTCGGCCGCGACC'
Using Long Clipping Sequence: 'TCCTCGGCCGCGACCACGCTGCCCTATAGTGAGTCGTATTAG'
Using Long Clipping Sequence: 'CTAATACGACTCACTATAGGGCAGCGTGGTCGCGGCCGAGGA'
Using Long Clipping Sequence: 'CTGTCTCTTATACACATCTCCGAGCCCACGAGAC'
Using Long Clipping Sequence: 'CTGTCTCTTATACACATCTGACGCTGCCGACGA'
ILLUMINACLIP: Using 0 prefix pairs, 14 forward/reverse sequences, 0 forward only sequences, 0 reverse only sequences
Input Reads: 3093105 Surviving: 3073868 (99.38%) Dropped: 19237 (0.62%)
TrimmomaticSE: Completed successfully
After trimming, FastQC shows that the adapter contamination has decreased but is not completely removed, and that the N base calls are still present at the 5' end. Could anyone explain why this is, or what I'm doing wrong? Granted, the amount of contamination is ~1% and shouldn't be too detrimental to my mapping, but I'd still like to work out why.
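To put numbers on what remains without relying on the FastQC plots, something like the following could be run on the raw and trimmed files. It is only a minimal sketch and not how FastQC itself computes its adapter plot: it assumes plain 4-line FASTQ records, counts exact matches only (so partial adapters at the very end of a read are missed), and uses the short clipping sequence 'CTGTCTCTTATA' from the log above as the default adapter. The function names `read_sequences` and `summarize` and the `head` window of 5 bases are my own choices for illustration.

```python
import gzip
from itertools import islice


def read_sequences(path):
    """Yield the sequence line of each record in a (optionally gzipped) FASTQ file.

    Assumes strict 4-line records: header, sequence, '+', qualities.
    """
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        # Sequence lines are lines 2, 6, 10, ... of the file.
        for line in islice(handle, 1, None, 4):
            yield line.strip()


def summarize(seqs, adapter="CTGTCTCTTATA", head=5):
    """Count reads with an exact adapter match and reads with an N near the 5' end.

    `adapter` defaults to the short clipping sequence reported in the log;
    `head` is an arbitrary window size chosen for illustration.
    """
    total = with_adapter = with_5prime_n = 0
    for seq in seqs:
        total += 1
        if adapter in seq:
            with_adapter += 1
        if "N" in seq[:head]:
            with_5prime_n += 1
    return {
        "total": total,
        "with_adapter": with_adapter,
        "with_5prime_n": with_5prime_n,
    }


# Example usage (path taken from the Trimmomatic command above):
# stats = summarize(read_sequences(
#     "data/trim_reads/LT119/160418_D00248_0165_AC931NANXX_8_NX-P7-008_NX-P5-017_trim.fastq.gz"))
# print(stats)
```

Comparing the counts for the raw and trimmed files would show whether the leftover signal comes from full-length adapter matches or only from short partial ones that this exact-match search cannot see.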