Trimmomatic not removing all Nextera adapters or N base calls
8.0 years ago
James Ashmore ★ 3.4k

I have single-end sequencing data prepared using the Illumina Nextera library prep kit. In my FastQC plots I can see adapter contamination at the 3' end of my reads and some N base calls at the 5' end. I run Trimmomatic in single-end mode using the Nextera adapters file provided, plus the Nextera transposase sequence FastQC uses (including its reverse complement):

TrimmomaticSE: Started with arguments:
 -phred33 -threads 1 data/raw_reads/LT119/160418_D00248_0165_AC931NANXX_8_NX-P7-008_NX-P5-017.fastq.gz data/trim_reads/LT119/160418_D00248_0165_AC931NANXX_8_NX-P7-008_NX-P5-017_trim.fastq.gz ILLUMINACLIP:/home/jashmore/anaconda3/share/trimmomatic-0.36-3/adapters/Nextera.fasta:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36
Using Long Clipping Sequence: 'GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG'
Using Long Clipping Sequence: 'TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG'
Using Short Clipping Sequence: 'CTGTCTCTTATA'
Using Medium Clipping Sequence: 'AGATGTGTATAAGAGACAG'
Using Short Clipping Sequence: 'TATAAGAGACAG'
Using Short Clipping Sequence: 'TCCTCGGCCG'
Using Medium Clipping Sequence: 'GGTCGCGGCCGAGGATC'
Using Medium Clipping Sequence: 'CTGTCTCTTATACACATCT'
Using Short Clipping Sequence: 'CGGCCGAGGA'
Using Medium Clipping Sequence: 'GATCCTCGGCCGCGACC'
Using Long Clipping Sequence: 'TCCTCGGCCGCGACCACGCTGCCCTATAGTGAGTCGTATTAG'
Using Long Clipping Sequence: 'CTAATACGACTCACTATAGGGCAGCGTGGTCGCGGCCGAGGA'
Using Long Clipping Sequence: 'CTGTCTCTTATACACATCTCCGAGCCCACGAGAC'
Using Long Clipping Sequence: 'CTGTCTCTTATACACATCTGACGCTGCCGACGA'
ILLUMINACLIP: Using 0 prefix pairs, 14 forward/reverse sequences, 0 forward only sequences, 0 reverse only sequences
Input Reads: 3093105 Surviving: 3073868 (99.38%) Dropped: 19237 (0.62%)
TrimmomaticSE: Completed successfully

After trimming, I can see that the adapter contamination decreases (but isn't completely removed) and that the N base calls are still present at the 5' end. Could anyone explain why this is, or what I'm doing wrong? Granted, the contamination is only ~1% and shouldn't be too detrimental to my mapping, but I'd still like to work out why.

trimming nextera trimmomatic • 7.3k views

I've worked out why the N bases near the beginning of the reads aren't removed: they appear at base 2. LEADING starts at base 1 and checks whether its quality is below the threshold. It isn't, so nothing is trimmed and base 2 is never examined. I'm still not sure why the Nextera adapters are not being removed.
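To illustrate the LEADING behaviour described above, here is a minimal Python sketch. It is a simplified model of what the step is documented to do, not Trimmomatic's actual code. The function names and the two example reads are made up for illustration, and I'm assuming the N call carries a Phred+33 quality of `#` (Q2), as is typical for Illumina output.

```python
def phred33(qual_string):
    """Convert a Phred+33 quality string to a list of integer scores."""
    return [ord(c) - 33 for c in qual_string]


def leading_trim(seq, quals, threshold):
    """Drop bases from the 5' end while their quality is below threshold.

    Trimming stops at the first base that meets the threshold, so a
    low-quality base further in is never examined.
    """
    start = 0
    while start < len(seq) and quals[start] < threshold:
        start += 1
    return seq[start:], quals[start:]


# Hypothetical read with the N at base 2: base 1 is Q40, so nothing is trimmed.
n_at_base_2, _ = leading_trim("ANGTCA", phred33("I#IIII"), threshold=3)
print(n_at_base_2)

# Hypothetical read with the N at base 1: here the N is removed.
n_at_base_1, _ = leading_trim("NAGTCA", phred33("#IIIII"), threshold=3)
print(n_at_base_1)
```

In this model the first read comes back unchanged, while the second loses its leading N, which matches what I see in my data.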


Hi James,

I am having the same problem with Nextera adapters. After running Trimmomatic, I still see contamination at the 3' end. Did you figure out how to deal with it? Thanks.


I ended up using cutadapt instead. The following command seems to work fine for me:

cutadapt -g PrefixNX/1=AGATGTGTATAAGAGACAG \
         -a PrefixNX/1_rc=CTGTCTCTTATACACATCT \
         -g PrefixNX/2=AGATGTGTATAAGAGACAG \
         -a PrefixNX/2_rc=CTGTCTCTTATACACATCT \
         -g Trans1=TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG \
         -a Trans1_rc=CTGTCTCTTATACACATCTGACGCTGCCGACGA \
         -g Trans2=GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG \
         -a Trans2_rc=CTGTCTCTTATACACATCTCCGAGCCCACGAGAC \
         -q 20 \
         -n 5 \
         --trim-n \
         -m 20 \
         -o "${OUTPUT_FASTQ}" "${INPUT_FASTQ}" > "${TRIM_LOG}"
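
If you adapt the command to other adapters, each `_rc` sequence passed with `-a` should be the reverse complement of the matching `-g` sequence. A minimal Python sketch for checking that, using the sequences from the command above (the helper name `reverse_complement` and the `adapter_pairs` dictionary are just for illustration, not part of cutadapt):

```python
# Translation table for complementing DNA bases (N stays N).
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


# Forward (-g) and reverse-complement (-a) sequences from the cutadapt command.
adapter_pairs = {
    "PrefixNX": ("AGATGTGTATAAGAGACAG",
                 "CTGTCTCTTATACACATCT"),
    "Trans1": ("TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG",
               "CTGTCTCTTATACACATCTGACGCTGCCGACGA"),
    "Trans2": ("GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG",
               "CTGTCTCTTATACACATCTCCGAGCCCACGAGAC"),
}

for name, (forward, expected_rc) in adapter_pairs.items():
    matches = reverse_complement(forward) == expected_rc
    print(f"{name}: {'ok' if matches else 'MISMATCH'}")
```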

I tried Trim Galore and it works for me. Thanks James!!
