Question: Interpretation of Trimmomatic Results after Paired-End Adapter Trimming
5 months ago, sevenless (0) wrote:

Hi,

I have some questions concerning the output of Trimmomatic after adapter removal. I have 80 bp paired-end reads in Illumina 1.9 encoding (Phred+33). Using FastQC for quality control, I noticed some overrepresented sequences in the data which were identified as TruSeq adapters. For this reason, I used Trimmomatic to trim the adapters and to drop any resulting reads shorter than 36 bp:

java -jar trimmomatic-0.38.jar PE -phred33 seq_1.fastq.gz seq_2.fastq.gz seq_1_trimmed_paired.fastq.gz seq_1_trimmed_unpaired.fastq.gz seq_2_trimmed_paired.fastq.gz seq_2_trimmed_unpaired.fastq.gz ILLUMINACLIP:TruSeq3-PE.fa:2:30:10 MINLEN:36

As a result, ~99% of the read pairs survive with both reads, ~1% survive as forward reads only, and 0% survive as reverse reads only:

ILLUMINACLIP: Using 1 prefix pairs, 0 forward/reverse sequences, 0 forward only sequences, 0 reverse only sequences
Input Read Pairs: 71446282 Both Surviving: 70983555 (99.35%) Forward Only Surviving: 453784 (0.64%) Reverse Only Surviving: 0 (0.00%) Dropped: 8943 (0.01%)

Is the 0% reverse only surviving the expected result? It seems that the reverse reads are the only ones affected by the adapter trimming. However, in the FastQC quality control, the warnings for overrepresented adapter sequences only showed up for the forward reads.

Also, what is the difference between the adapter sequence files TruSeq3-PE.fa and TruSeq3-PE-2.fa (the one with the reverse complements), and which of them should actually be used to trim adapters from paired-end reads?

I would be very grateful for any help or explanations.

modified 5 months ago by mastal511 (2.0k) • written 5 months ago by sevenless (0)

Honestly, I don't have a clear answer for you on this matter, which is why I am posting this as a comment and not as an answer. I have some speculations though (maybe they help):

  • As far as I know, adapter traces are more likely to show up in the reverse reads with TruSeq kits, so forward-only surviving pairs are probably more common than reverse-only surviving ones. Since your discard rate is ~1%, it might be by chance that you have no reverse-only.

  • Did you run FastQC before and after the trimming? What does the adapter content plot look like when you compare the two?

written 5 months ago by Macspider (2.6k)

Thanks for your comment! I don't think it's by chance because I got the same result (0% reverse only surviving) for all my files. mastal511 also explained that "Trimmomatic's default behaviour is to drop the reverse reads when it trims adapters" (see answer below).

written 5 months ago by sevenless (0)
5 months ago, mastal511 (2.0k) wrote:

Trimmomatic's default behaviour is to drop the reverse reads when it trims adapters, so you get forward reads only surviving as a result.

The reasoning behind this is that reading into the adapter sequence means the insert is shorter than the read length, so the reverse read adds no extra information: it is just the reverse complement of the forward read.

However, the default behaviour can be changed if you want to keep the reverse reads after adapter trimming. See the Trimmomatic manual: you need to add TRUE as the last parameter (keepBothReads) to ILLUMINACLIP, after the minimum adapter length parameter.
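
As a rough sketch of what that could look like with the settings from the question: the snippet below assembles the ILLUMINACLIP step field by field and prints the resulting command instead of running it. The shell variable names are only illustrative, the output file names are assumed (paired then unpaired, forward before reverse), the minimum adapter length of 8 is the default documented in the manual, and it is assumed that your Trimmomatic version accepts TRUE in this capitalization.

```shell
# Fields of the ILLUMINACLIP step, in the order described in the Trimmomatic manual.
# Variable names are illustrative only; the values come from the question,
# except the last two fields.
adapters="TruSeq3-PE.fa"     # adapter FASTA used in the question
seed_mismatches=2
palindrome_threshold=30
simple_threshold=10
min_adapter_length=8         # documented default; has to be spelled out to reach the next field
keep_both_reads=TRUE         # keep the reverse read after palindrome clipping

clip="ILLUMINACLIP:${adapters}:${seed_mismatches}:${palindrome_threshold}:${simple_threshold}:${min_adapter_length}:${keep_both_reads}"

# Print the full command for review; drop the leading 'echo' to actually run it.
# Output file names are assumptions: forward paired/unpaired, then reverse paired/unpaired.
echo java -jar trimmomatic-0.38.jar PE -phred33 \
  seq_1.fastq.gz seq_2.fastq.gz \
  seq_1_trimmed_paired.fastq.gz seq_1_trimmed_unpaired.fastq.gz \
  seq_2_trimmed_paired.fastq.gz seq_2_trimmed_unpaired.fastq.gz \
  "$clip" MINLEN:36
```

With this setting, pairs in which adapter read-through was detected should be reported under "Both Surviving" rather than "Forward Only Surviving", provided both reads still pass MINLEN.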

modified 5 months ago • written 5 months ago by mastal511 (2.0k)

Thank you very much for your reply and the explanation about Trimmomatic's default behaviour. I understand now why I get 0% reverse-only surviving.

However, I have now run FastQC again on the trimmed paired-end reads and, strangely enough, the adapters are still reported as overrepresented sequences, e.g. TruSeq Adapter, Index 5 (GATCGGAAGAGCACACGTCTGAACTCCAGTCACACAGTGATCTCGTATGC).

Is there another TruSeq adapter file available that I could use for this purpose (e.g. TruSeq3-PE-2.fa) or should I add this sequence to the adapter file manually?
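
To check this independently of FastQC, the number of trimmed reads that still contain the adapter can be counted directly. This is only a minimal sketch: the helper name count_adapter_reads is made up for illustration, the search string is the first 33 bases of the sequence above (the part before the index), and only exact matches are counted, so adapters with sequencing errors or partial adapters at the read end are missed.

```shell
# Adapter fragment to search for: the first 33 bases of the sequence
# FastQC reported, i.e. the part before the index.
adapter="GATCGGAAGAGCACACGTCTGAACTCCAGTCAC"

# Hypothetical helper: count the reads in a gzipped FASTQ file whose
# sequence line contains the adapter fragment as an exact substring.
# In a 4-line-per-record FASTQ, the sequence is line 2 of each record.
count_adapter_reads() {
  gzip -dc "$1" |
    awk -v a="$adapter" 'NR % 4 == 2 && index($0, a) { n++ } END { print n + 0 }'
}

# Usage (file name as in the trimming command from the question):
# count_adapter_reads seq_1_trimmed_paired.fastq.gz
```

Comparing the count before and after trimming shows whether the adapter file in use actually matches this sequence.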

written 5 months ago by sevenless (0)

There is also the TruSeq2 file, and I usually have to use that one (i.e. most of the data files I trimmed were prepared with a TruSeq2 kit).

written 5 months ago by Macspider (2.6k)

It seems that TruSeq3-PE-2.fa does the trick. Interestingly, now I get reverse-only surviving reads as well.

modified 5 months ago • written 5 months ago by sevenless (0)