No trimming of reverse reads using Trimmomatic-0.39
2.2 years ago
hgal4248 • 0

UPDATE - I was able to work around this issue by using the fastp tool, as per Istvan's comment below. I have yet to find the root cause of the Trimmomatic issue but will update once it is solved.

Hi all,

I have been using Trimmomatic version 0.39 to trim some paired-end reads I retrieved from SRA.

A FastQC check shows that the reads have high adapter content.

The libraries were prepared with the TruSeq RNA Library Prep Kit v2 and sequenced on the Illumina HiSeq 2000, but unfortunately I can't get more information than that.

I have used the following command:

for sample in "${arr[@]}"
do
    echo "running trimmomatic for sample $sample"

    time java -jar ./Trimmomatic-0.39/trimmomatic-0.39.jar PE -phred33 -threads 24 \
        "${sample}_1.fastq.gz" "${sample}_2.fastq.gz" \
        "${sample}_1_paired.fq.gz" "${sample}_1_unpaired.fq.gz" \
        "${sample}_2_paired.fq.gz" "${sample}_2_unpaired.fq.gz" \
        ILLUMINACLIP:./Trimmomatic-0.39/adapters/TruSeq3-PE-2.fa:2:30:10 SLIDINGWINDOW:4:15 MINLEN:36
done

The error:

For the forward reads the trimming works fine and the samples come out clean; however, very little to no adapter content is removed from the reverse reads.

An additional piece of information: FastQC identifies the adapter in the forward reads as the 'Illumina Universal Adapter', while in the reverse reads it reports the 'Illumina Small RNA 5' Adapter'. I couldn't find any match for the latter adapter in any of the Trimmomatic adapter files or in my troubleshooting. I have tried including the adapter sequences from here in the adapter file, but it made little difference.

Any information or possible solutions would be greatly appreciated!

trimmomatic Trimming quality control rna-seq • 1.8k views

Did you search the reverse reads for the small RNA adapter (the sequence identified by FastQC)?

2.2 years ago

This file shows what FastQC considers to be the small RNA 5' adapter:

https://github.com/s-andrews/FastQC/blob/master/Configuration/adapter_list.txt

Try adding that sequence to your adapter file (I would make a file that contains only the known adapters).

You could also add just that little fragment; that should be sufficient, as the match only needs to be about 5 bp long to reliably identify it.


GATCGTCGGACT (the small RNA 5' adapter from the FastQC list) is the reverse complement of the last 12 bases of 5' GUUCAGAGUUCUACAGUCCGACGAUC 3' (the RA5 adapter). Since the OP has already used the RA5 adapter in troubleshooting, does the OP also have to include the FastQC adapter (5' GATCGTCGGACT 3') in the Trimmomatic adapter list? The OP probably needs to fiddle with the parameters to get it to work.
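
The relationship between the two sequences quoted here can be checked in a couple of lines of shell. This is only a sketch: the `revcomp` helper is a name made up for illustration, and the RA5 sequence is the one quoted in this comment with U written as T.

```shell
# revcomp SEQ: reverse the sequence, then swap A<->T and C<->G.
# (Hypothetical helper name; relies on the standard rev and tr utilities.)
revcomp() {
    printf '%s\n' "$1" | rev | tr 'ACGTacgt' 'TGCAtgca'
}

# RA5 adapter as quoted in the thread, in the DNA alphabet (U -> T).
ra5="GTTCAGAGTTCTACAGTCCGACGATC"

# Last 12 bases of RA5.
ra5_tail=$(printf '%s' "$ra5" | tail -c 12)

echo "$ra5_tail"        # prints AGTCCGACGATC
revcomp "$ra5_tail"     # prints GATCGTCGGACT
```

So the FastQC fragment and the 3' end of RA5 are the same adapter seen from opposite strands, which is why it matters which orientation ends up in the adapter file.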


Yes, this adapter subject is seemingly never properly explained by the instrument vendor.

Adapters are something of a "touchy subject" for most vendors, a mix of trade secret and common data analysis information.

As far as I know, when FastQC detects an adapter, it detects that very sequence; there is no reverse complementing, etc.

I always recommend that people run a grep and look at the counts and layout:

cat data.fq | egrep GATCG... --color=ALWAYS | head

This clearly shows where the matches are and lets us see what comes after the match. One can then explore and vary the length of the match and inspect the output. Finally, count the affected lines:

cat data.fq | egrep GATCG... | wc -l

I found this approach the most reliable in figuring out just what exactly is located on the end of each read.
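
A slightly stricter variant of the same idea, as a sketch: it reads plain or gzipped FASTQ, looks only at the sequence line of each four-line record, and reports matching reads alongside the total so the fraction is easy to see. The function name `count_motif` and the three-read demo file are made up for illustration; the motif is the FastQC fragment quoted earlier in the thread.

```shell
# count_motif FILE MOTIF
# Print "<reads containing MOTIF> <total reads>" for a FASTQ file (plain or .gz).
# Only the sequence line (line 2 of every 4-line record) is searched.
count_motif() {
    local file=$1 motif=$2
    case "$file" in
        *.gz) gzip -cd -- "$file" ;;
        *)    cat -- "$file" ;;
    esac | awk -v m="$motif" '
        NR % 4 == 2 { total++; if (index($0, m) > 0) hits++ }
        END         { printf "%d %d\n", hits, total }'
}

# Tiny made-up FASTQ: three reads, two of which contain the motif.
demo=$(mktemp)
cat > "$demo" <<'EOF'
@read1
ACGTACGTGATCGTCGGACTAAAA
+
IIIIIIIIIIIIIIIIIIIIIIII
@read2
ACGTACGTACGTACGTACGTACGT
+
IIIIIIIIIIIIIIIIIIIIIIII
@read3
TTTTGATCGTCGGACTCCCCGGGG
+
IIIIIIIIIIIIIIIIIIIIIIII
EOF

count_motif "$demo" GATCGTCGGACT   # prints: 2 3
```

On real data one would call it on each mate separately, for example once on the `_1` file and once on the `_2` file, and compare the counts for the adapter and for its reverse complement.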

Also, a new tool I have been warming up to is fastp. It will autodetect adapters and generate a quick quality report at runtime (not to mention that it is much faster than Trimmomatic):

https://github.com/OpenGene/fastp


I think the query is not about FastQC. It's about Trimmomatic not being able to trim the adapter sequence from the reverse reads. My understanding was that trimmers (FASTQ trimmers in general) trim both the adapter sequence and its complement in PE reads.


Yes, maybe I was unclear. Regarding FastQC, all I wanted to point out is that the way to figure out what FastQC reports as an adapter is to look at this file:

https://github.com/s-andrews/FastQC/blob/master/Configuration/adapter_list.txt

This file also exists in the local FastQC distribution, and one can add additional sequences to it to have FastQC report those as well.

This gave me some impetus to put in the work to figure this out for a tutorial once and for all: do we need both adapters?

Some adapters, like the Illumina universal adapter and possibly Nextera, are mostly present in a single orientation in both read 1 and read 2. These adapters are self-reverse-complementary at the start/end. But other adapters, if they are added before the sequencing adapters, may be present in the reads both forward and reverse complemented and hence will most likely need to be entered both ways.

But then, to complicate matters, what FastQC shows for some adapters, like the universal adapter, is just a fragment, so we could not reverse complement the whole adapter even if we wanted to. The small RNA adapter, however, may be the full adapter, in which case the OP would need to put both the adapter and its reverse complement into the file for Trimmomatic to work. But I don't know that for a fact, hence some testing would be good.
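
For that kind of testing, a small adapter file holding one sequence in both orientations can be generated as below. This is a sketch, not a verified fix: the helper names and the record name `smallRNA5_fragment` are invented for illustration, and the sequence is the fragment quoted from the FastQC list earlier in the thread. If I read the Trimmomatic manual correctly, sequences whose names do not end in `/1` or `/2` are checked against both mates in "simple" mode, and a reverse-complemented form has to be listed explicitly under its own name; that is worth confirming against the manual before relying on it.

```shell
# revcomp SEQ: reverse the sequence, then swap A<->T and C<->G.
revcomp() {
    printf '%s\n' "$1" | rev | tr 'ACGTacgt' 'TGCAtgca'
}

# write_adapter_pair NAME SEQ
# Emit two FASTA records: SEQ as given, and its reverse complement as NAME_rc.
write_adapter_pair() {
    printf '>%s\n%s\n' "$1" "$2"
    printf '>%s_rc\n%s\n' "$1" "$(revcomp "$2")"
}

# Write the pair to a temporary adapter file (path chosen by mktemp).
adapters=$(mktemp)
write_adapter_pair smallRNA5_fragment GATCGTCGGACT > "$adapters"
cat "$adapters"
# prints:
# >smallRNA5_fragment
# GATCGTCGGACT
# >smallRNA5_fragment_rc
# AGTCCGACGATC
```

The resulting file could then be passed to `ILLUMINACLIP` in place of, or merged with, `TruSeq3-PE-2.fa`.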


Thanks for your reply Istvan!

I added the sequences from the link you gave to the adapter files from Trimmomatic. The tool recognised the sequences; however, my reverse reads were still left untrimmed.

As per your comment below, I decided to give fastp a go and it worked great. It autodetected the adapter sequences for both forward and reverse reads and trimmed them well (from a peak of ~35% adapter content to <0.5% for all samples). I will definitely use the tool again.

I still haven't found the underlying cause of my issues with Trimmomatic, but I'll take some of the troubleshooting suggestions from below and update if I find the root cause.
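
The exact fastp command is not shown in this thread, so the following is only a sketch of what a paired-end run with adapter autodetection might look like. The helper `fastp_cmd`, the output file names, the sample name, and the thread count are all assumptions; the input naming and the minimum length of 36 mirror the Trimmomatic command in the question. The helper prints the command instead of running it, so it can be inspected first.

```shell
# fastp_cmd SAMPLE
# Print (not run) a fastp command line for one paired-end sample.
# Output names and the thread count are illustrative assumptions.
fastp_cmd() {
    sample=$1
    echo "fastp -i ${sample}_1.fastq.gz -I ${sample}_2.fastq.gz" \
         "-o ${sample}_1_trimmed.fq.gz -O ${sample}_2_trimmed.fq.gz" \
         "--detect_adapter_for_pe --length_required 36 --thread 8" \
         "-j ${sample}_fastp.json -h ${sample}_fastp.html"
}

# "sampleA" is a placeholder sample name.
fastp_cmd sampleA
```

Piping the printed line to `sh` (or dropping the `echo`) would execute it once fastp is installed and the input files exist.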


Without the data being posted, it is difficult to guess at the cause. Although the fastp trimming worked, it is still better to understand how the trimming should be done.
