Hi everyone,
I know this question has come up a few times, but the earlier answers didn't help me solve my issue. I have FASTQ files from ITS amplicon-based metagenomic sequencing (ITS1/2, 300 bp), and FastQC reports that they all contain Nextera transposase adapters. The adapter contamination starts relatively early in the reads, for example at 50 bp or 100 bp.
I used trimmomatic 0.39 to remove this contamination without any further quality trimming. My settings are:
java -jar trimmomatic-0.39.jar PE {R1_file} {R2_file} {R1_paired} {R1_unpaired} {R2_paired} {R2_unpaired} ILLUMINACLIP:NexteraPE-PE.fa:5:10:5
My next step would be to follow the dada2 ITS pipeline.
However, trimmomatic removes a lot of reads, usually between 40% and 100% per file. dada2 won't even run on my files because some paired R2 files are missing, as all of their reads were dropped. I see a similar issue when using cutadapt.
Can anyone explain what went wrong here and how I can fix this? Thank you.
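For reference, the ILLUMINACLIP step in the command above has the form ILLUMINACLIP:<adapter fasta>:<seed mismatches>:<palindrome clip threshold>:<simple clip threshold>, with two optional trailing fields, minAdapterLength and keepBothReads, as described in the Trimmomatic manual. The Python sketch below only splits such a string into named fields so the settings are explicit. The class and function names are hypothetical, and the defaults shown (8 and false) are what the manual documents as far as I know, so check them against your version.

```python
from typing import NamedTuple


class IlluminaClip(NamedTuple):
    """Named view of an ILLUMINACLIP step (hypothetical helper, not part of Trimmomatic)."""
    adapter_fasta: str
    seed_mismatches: int
    palindrome_clip_threshold: int
    simple_clip_threshold: int
    min_adapter_length: int = 8    # assumed default, per the Trimmomatic manual
    keep_both_reads: bool = False  # assumed default, per the Trimmomatic manual


def parse_illuminaclip(step: str) -> IlluminaClip:
    """Split 'ILLUMINACLIP:fasta:seed:palindrome:simple[:minLen[:keepBoth]]' into fields.

    Assumes the adapter file path itself contains no ':' character.
    """
    name, _, rest = step.partition(":")
    fields = rest.split(":")
    if name != "ILLUMINACLIP" or not 4 <= len(fields) <= 6:
        raise ValueError(f"not a recognisable ILLUMINACLIP step: {step!r}")
    clip = IlluminaClip(fields[0], int(fields[1]), int(fields[2]), int(fields[3]))
    if len(fields) >= 5:
        clip = clip._replace(min_adapter_length=int(fields[4]))
    if len(fields) == 6:
        clip = clip._replace(keep_both_reads=fields[5].lower() == "true")
    return clip


# The setting used in the post: keepBothReads is not given, so it stays at its default.
print(parse_illuminaclip("ILLUMINACLIP:NexteraPE-PE.fa:5:10:5"))
```

With the settings from the post, keepBothReads is left at its default. The manual describes that default as discarding the reverse read once palindrome mode has detected adapter read-through, which might be one reason R2 reads go missing. That is a guess, not something confirmed in this thread.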
Some of these programs have a minimum size threshold for a read to be retained. You could play around with that to keep all trimmed reads 50nt or longer. You could also try a program that allows you to manually input the adaptor sequence (Cutadapt does this).
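To judge what a minimum-length threshold would do before rerunning the trimmer, it can help to look at the read-length distribution of the trimmed files. Below is a minimal Python sketch, assuming plain 4-line FASTQ records (no wrapped sequence lines) and optional gzip compression; the function names are made up for illustration.

```python
import gzip
from collections import Counter
from pathlib import Path


def read_lengths(fastq_path):
    """Return a Counter mapping read length -> number of reads in a FASTQ file."""
    path = Path(fastq_path)
    opener = gzip.open if path.suffix == ".gz" else open
    lengths = Counter()
    with opener(path, "rt") as handle:
        for i, line in enumerate(handle):
            if i % 4 == 1:  # second line of each 4-line record is the sequence
                lengths[len(line.rstrip("\r\n"))] += 1
    return lengths


def fraction_at_least(lengths, min_len):
    """Fraction of reads whose length is >= min_len (0.0 for an empty file)."""
    total = sum(lengths.values())
    kept = sum(n for length, n in lengths.items() if length >= min_len)
    return kept / total if total else 0.0
```

For example, fraction_at_least(read_lengths("R2_paired.fastq.gz"), 50) would give the share of R2 reads that a 50 nt cutoff keeps; the file name here is only a placeholder.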
Trimmomatic allows custom adapter sequences to be supplied, but I don't see how this would change the outcome. Similarly, Trimmomatic also lets you set a minimum length for reads to keep, using the setting "MINLEN:50". I didn't use it previously, so no length filter should have been active. With MINLEN:50, I lose just as many reads as without it:
ILLUMINACLIP: Using 1 prefix pairs, 4 forward/reverse sequences, 0 forward only sequences, 0 reverse only sequences
Quality encoding detected as phred33
Input Read Pairs: 522798 Both Surviving: 90331 (17.28%) Forward Only Surviving: 301450 (57.66%) Reverse Only Surviving: 694 (0.13%) Dropped: 130323 (24.93%)

ILLUMINACLIP: Using 1 prefix pairs, 4 forward/reverse sequences, 0 forward only sequences, 0 reverse only sequences
Quality encoding detected as phred33
Input Read Pairs: 285721 Both Surviving: 105796 (37.03%) Forward Only Surviving: 168809 (59.08%) Reverse Only Surviving: 117 (0.04%) Dropped: 10999 (3.85%)
etc etc
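The summaries above are easier to compare across files once the counts are pulled out. Here is a small Python sketch that parses the "Input Read Pairs" line and reports how many pairs lost their reverse read; the regular expression is written against the log text shown in this post, and the function names are hypothetical.

```python
import re

# Matches the summary line as it appears in the logs quoted above.
SUMMARY_RE = re.compile(
    r"Input Read Pairs:\s+(?P<input>\d+)\s+"
    r"Both Surviving:\s+(?P<both>\d+)\s+\([\d.,]+%\)\s+"
    r"Forward Only Surviving:\s+(?P<forward_only>\d+)\s+\([\d.,]+%\)\s+"
    r"Reverse Only Surviving:\s+(?P<reverse_only>\d+)\s+\([\d.,]+%\)\s+"
    r"Dropped:\s+(?P<dropped>\d+)\s+\([\d.,]+%\)"
)


def parse_summary(log_text):
    """Extract the five read-pair counts from a Trimmomatic PE summary."""
    match = SUMMARY_RE.search(log_text)
    if match is None:
        raise ValueError("no Trimmomatic PE summary line found")
    return {key: int(value) for key, value in match.groupdict().items()}


def reverse_read_loss(counts):
    """Fraction of input pairs whose R2 read was discarded (forward-only plus dropped)."""
    return (counts["forward_only"] + counts["dropped"]) / counts["input"]


log = (
    "Input Read Pairs: 522798 Both Surviving: 90331 (17.28%) "
    "Forward Only Surviving: 301450 (57.66%) "
    "Reverse Only Surviving: 694 (0.13%) Dropped: 130323 (24.93%)"
)
counts = parse_summary(log)
print(counts, round(reverse_read_loss(counts), 4))
```

In both logs the "Forward Only Surviving" category is by far the largest, so most of the loss is the reverse read of a pair being discarded while its forward read is kept, rather than whole pairs being dropped.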