Empty output from Trimmomatic
8.5 years ago

Dear all,
I am using Trimmomatic to trim FASTQ files by quality and to remove adapters. I have two paired files, seq1.1.fq and seq1.2.fq, with Nextera adapters, so I ran the following command:
java -jar trimmomatic-0.33.jar PE -threads 16 -phred64 seq1.1.fq seq1.2.fq pairedOutup1 pairedOutup2 unpairedOutup1 unpairedOutup2 ILLUMINACLIP:NexteraPE-PE.fa:2:30:10:1:true LEADING:5 TRAILING:5 SLIDINGWINDOW:4:15 MINLEN:36

The command runs and prints the following:
Using PrefixPair: 'AGATGTGTATAAGAGACAG' and 'AGATGTGTATAAGAGACAG'
Using Long Clipping Sequence: 'GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG'
Using Long Clipping Sequence: 'TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG'
Using Long Clipping Sequence: 'CTGTCTCTTATACACATCTGACGCTGCCGACGA'
Using Long Clipping Sequence: 'CTGTCTCTTATACACATCTCCGAGCCCACGAGAC'
ILLUMINACLIP: Using 1 prefix pairs, 4 forward/reverse sequences, 0 forward only sequences, 0 reverse only sequences
Input Read Pairs: 947710 Both Surviving: 0 (0.00%) Forward Only Surviving: 0 (0.00%) Reverse Only Surviving: 0 (0.00%) Dropped: 947710 (100.00%)
TrimmomaticPE: Completed successfully

However all the output files are completely empty.

The first lines of the input are:
{seq1.1.fq}
@M03595:11:000000000-AG58B:1:1101:16029:1738 1:N:0:26
ATTGTTAATCGTAAAGCAATGTTCATTCCGATTGTGGCTGTTGCAAGTTTTATGCTTGTAGGTTATGCTGCAACCGATAAAGAAATGCCGGAAATTAGATCTAATCAAATTGAAGTTC
+
1>A1AF33DD1AA113B11B1GGE3FGEF00EEAG20AFCGH1A111FGGH2G21FHBGB21FG1F11F1101BB//E//1@100D1@B/////BG111BBG11FE111BG111BFGE
@M03595:11:000000000-AG58B:1:1101:14217:1754 1:N:0:26
GTTGGCCATAAGGCTGTTGGTGCGATAGTTAATAATGTGATGGTTCCGATCGATACAAAATTAAATACGGGTGATGTCGTAGAAATCAAGACAAATAAACAGTCACAG
+
1AA11@11C1111BF1GG11A0100A00DF22D22D2D22D21BD1B//B///A/A1110BG111F2A///>//FBFFAFA//21BB1111>000B111@10BF1@11
@M03595:11:000000000-AG58B:1:1101:13810:1764 1:N:0:26
GTTGAGACTGTGGATGGTATCAGCGGGTATTGCATGAGTGAGTTTATAAAACTCTGTTAG
+
...

{seq1.2.fq}
@M03595:11:000000000-AG58B:1:1101:16029:1738 2:N:0:26
TTACTTCAATTTGTTTATTTCTAATTTCCGGCATTTCTTTATCGGTTGCAGCATAACCTACAAGCATATAACTTGCAACAGCCACAATCGGAATGAACATTGCTTTACGATTAACAAT
+
111>>D@31BDF33BB333BAB33DFG3A00A0AFGDGGH2FEA0BE/01110B1111D1A111/0D1222BDG1111B000>0B0/B1////B@11@1BF11GHHFE//FG?1@11B
@M03595:11:000000000-AG58B:1:1101:14217:1754 2:N:0:26
CTGTGACTGTTTCTTTGTCTTGATTTCTTCTACTTCACCCGTATTTAATTTTGTTTCTATCGGTTCCTTCACATTATTAACTATCGCTCCAACTGCCTTATGGCCAAC
+
1>1>13BB1FDF3BBG3BAFG13DFGAF333333D331AA0B0BFG22DDGH2B0B222DA/////12DA1A2DF1DG22AFDGE//0A>11100@0BD1B11/01/>
@M03595:11:000000000-AG58B:1:1101:13810:1764 2:N:0:26
CTAACAGAGTTTTATATTCTCACTCATGCAATACCCGCTGATACCATCCACATTCTCAAC
+

What might be the issue? Could the quality be so low that all the sequences are removed?
Thank you.


Hello,

I ran FastQC on the input files and got the following:

{seq1.1.fq}

PASS    Basic Statistics
PASS    Per base sequence quality
PASS    Per tile sequence quality
PASS    Per sequence quality scores
FAIL    Per base sequence content
WARN    Per sequence GC content
PASS    Per base N content
WARN    Sequence Length Distribution
FAIL    Sequence Duplication Levels
WARN    Overrepresented sequences
PASS    Adapter Content
FAIL    Kmer Content

{seq1.2.fq}

PASS    Basic Statistics
PASS    Per base sequence quality
PASS    Per tile sequence quality
PASS    Per sequence quality scores
FAIL    Per base sequence content
WARN    Per sequence GC content
PASS    Per base N content
WARN    Sequence Length Distribution
FAIL    Sequence Duplication Levels
WARN    Overrepresented sequences
PASS    Adapter Content
FAIL    Kmer Content

The command actually produces output if I omit the SLIDINGWINDOW option (!). Judging from the FastQC analysis, the adapters had already been removed, so I assume Trimmomatic simply skipped that step and went straight to quality trimming. Is that assumption correct, or should I not run adapter trimming on sequences that are already adapter-cleaned?

Thank you


FastQC is not the most sensitive tool for finding adapters. You should check the Trimmomatic output carefully; it will report the percentage of reads with adapters. Illumina basecalling may clean adapters automatically, but I've found it can leave significant leftovers.
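
As a rough cross-check of adapter leftovers, one could count how many reads still contain the start of the Nextera clipping sequences that Trimmomatic printed in the log above (`CTGTCTCTTATACACATCT`). This is only a sketch: `adapter_fraction` is a hypothetical helper, it assumes strict four-line FASTQ records, and it looks for exact matches only, so it will miss partial or mismatched adapters that Trimmomatic's own matching may catch.

```python
from itertools import islice

# Shared prefix of the two 'CTGTCTC...' long clipping sequences in the log above.
NEXTERA_READTHROUGH = "CTGTCTCTTATACACATCT"


def adapter_fraction(fastq_lines, adapter=NEXTERA_READTHROUGH):
    """Return the fraction of reads whose sequence contains `adapter` exactly.

    `fastq_lines` is any iterable of lines (for example an open file handle)
    laid out as four-line FASTQ records; the sequence is line 2 of each record.
    """
    total = 0
    hits = 0
    for seq in islice(fastq_lines, 1, None, 4):
        total += 1
        if adapter in seq:
            hits += 1
    return hits / total if total else 0.0


# Tiny in-memory example with two made-up reads, one carrying the adapter.
example = [
    "@read1", "ACGT" + NEXTERA_READTHROUGH + "AA", "+", "I" * 25,
    "@read2", "ACGTACGTACGT", "+", "I" * 12,
]
print(adapter_fraction(example))
```

On a real file this could be called as `adapter_fraction(open("seq1.1.fq"))`, with the file name taken from the question.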

On another note, to keep the forum tidy you should open a new question instead of asking here; this area is for answers only. A follow-up like this one could have been posted under "comments" above.

8.5 years ago
h.mon 35k

I guess your quality scores are encoded as Phred+33, but you are passing a flag (-phred64) that tells Trimmomatic they are encoded as Phred+64.
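
One way to check the encoding before choosing between -phred33 and -phred64 is to look at the range of ASCII codes in the quality lines. The sketch below is a heuristic, not part of Trimmomatic: `decode_quality` and `guess_offset` are hypothetical helper names, and the cutoffs (59 and 74) assume typical Illumina output, so an all-high-quality file can come back as ambiguous.

```python
def decode_quality(qual, offset):
    """Convert a FASTQ quality string to integer scores for a given ASCII offset."""
    return [ord(ch) - offset for ch in qual]


def guess_offset(quality_lines):
    """Guess the ASCII offset (33 or 64) from quality strings; None if ambiguous."""
    codes = [ord(ch) for line in quality_lines for ch in line.strip()]
    if not codes:
        return None
    if min(codes) < 59:   # characters below ';' cannot occur in +64 data
        return 33
    if max(codes) > 74:   # characters above 'J' are unusual in Illumina +33 data
        return 64
    return None


# First 30 quality characters of the first read in seq1.1.fq from the question.
qual = "1>A1AF33DD1AA113B11B1GGE3FGEF0"
print(guess_offset([qual]))
print(decode_quality(qual[:3], 33))
print(decode_quality(qual[:3], 64))
```

For the first three characters `1>A` this gives 16, 29, 32 under offset 33 but -15, -2, 1 under offset 64. In the excerpt shown in the question the highest quality character is `H` (ASCII 72), which decodes to only 8 under offset 64, below the SLIDINGWINDOW threshold of 15. That would be consistent with every read being dropped, and with output appearing once SLIDINGWINDOW is omitted. Rerunning the same command with -phred33 in place of -phred64 would be the corresponding change.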


That's a good tip. I thought 64 was the more recent encoding, but I was wrong... Thanks.


The Wikipedia page on the FASTQ format is a good read.


I ran into the same problem, and this was indeed the cause. Thanks.
