Question

Interpretation of Results from trimmomatic

7.8 years ago

AP ▴ 80

Hi everyone,

I just got my results after running Trimmomatic. I have 50 bp paired-end reads, for which I ran the following command:

**java -jar /opt/asn/apps/trimmomatic_0.35//Trimmomatic-0.35/trimmomatic-0.35.jar PE -phred33 SL264821_1.fastq.gz SL264821_2.fastq.gz SL264821_1_paired.fastq.gz SL264821_1_unpaired.fastq.gz SL264821_2_paired.fastq.gz SL264821_2_unpaired.fastq.gz ILLUMINACLIP:TruSeq3-PE.fa:2:30:10  LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36**

I got the results like this:

Multiple cores found: Using 2 threads
Using PrefixPair: 'TACACTCTTTCCCTACACGACGCTCTTCCGATCT' and 'GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT'
ILLUMINACLIP: Using 1 prefix pairs, 0 forward/reverse sequences, 0 forward only sequences, 0 reverse only sequences
Input Read Pairs: 25549917
Both Surviving: 23523510 (92.07%)
Forward Only Surviving: 1134600 (4.44%)
Reverse Only Surviving: 409189 (1.60%)
Dropped: 482618 (1.89%)

I am quite confused about the SLIDINGWINDOW and MINLEN options that I used, and I don't know how to interpret the result. Am I on the right track? Are my trimmed results OK for the next steps? And after this, do I have to run FastQC on all four output files? I am confused by all of this since I am entirely new to bioinformatics, and it is quite overwhelming for me. Any suggestions would help.

Thank you, Ambika Pokhrel

RNA-Seq Trimmomatic Result • 5.3k views

ADD COMMENT • link updated 7.8 years ago by finswimmer 16k • written 7.8 years ago by AP ▴ 80

First of all, what is your data - mRNA, miRNA, ChIP-seq? Library preparation? Intended insert size?

Anyway, your results seem fine, but it would help to have more details about your data.

ADD REPLY • link 7.8 years ago by h.mon 35k

Sorry, I forgot to mention: my data is mRNA from Fusarium oxysporum.

ADD REPLY • link 7.8 years ago by AP ▴ 80

score 1 · Answer 1 · 2017-09-15

Hello,

have you read the manual and do you know what you are doing, or did you just copy and paste the command from somewhere?

What your command is doing:

Remove adapter sequences (ILLUMINACLIP)
Remove bases with a quality below 3 from the 5' end of the read (LEADING)
Remove bases with a quality below 3 from the 3' end of the read (TRAILING)
Scan the read from the 5' end in windows of 4 bases and cut it once the average quality in a window falls below 15 (SLIDINGWINDOW)
Discard reads that are shorter than 36 bases after trimming (MINLEN)
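
To make the quality steps more concrete, here is a rough Python sketch of what LEADING, TRAILING, SLIDINGWINDOW and MINLEN do to a single read. It is only an illustration and not Trimmomatic's actual code: the function name `trim_read` and its arguments are made up for this example, adapter clipping is left out, and Trimmomatic may handle edge cases (for example where exactly the window cut lands) differently.

```python
def trim_read(quals, leading=3, trailing=3, window=4, min_avg=15, minlen=36):
    """Return (start, end) of the kept part of a read, or None if it is dropped.

    quals: list of Phred quality scores, one per base, ordered 5' to 3'.
    Hypothetical helper for illustration; not part of Trimmomatic.
    """
    start, end = 0, len(quals)

    # LEADING: remove bases below the threshold from the 5' end
    while start < end and quals[start] < leading:
        start += 1

    # TRAILING: remove bases below the threshold from the 3' end
    while end > start and quals[end - 1] < trailing:
        end -= 1

    # SLIDINGWINDOW: scan 5' -> 3' and cut at the first window
    # whose average quality is below min_avg
    for i in range(start, end - window + 1):
        if sum(quals[i:i + window]) / window < min_avg:
            end = i
            break

    # MINLEN: drop the read if too little is left
    if end - start < minlen:
        return None
    return start, end


# A 50 bp read: two bad bases at each end and a low-quality stretch near the 3' end
example = [2, 2] + [30] * 40 + [10] * 6 + [2, 2]
print(trim_read(example))  # (2, 42): 40 bases kept, so the read survives
```

With the same settings, a read whose quality collapses after 20 bases would be cut to fewer than 36 bases and therefore dropped by the MINLEN step.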

So, as you have paired-end reads, it is necessary to give both fastq files as input. Because trimming can result in whole reads being discarded, the output has to distinguish pairs where both reads survive from pairs where only one of the two reads survives. That is why you get four output files: a paired and an unpaired file for each read direction.
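
As a small sketch of how this maps onto the summary in the question, the following Python snippet sorts a pair into one of the four categories and checks that the posted counts add up. The function `classify_pair` and the category names are invented for this example; the counts are copied from the log above.

```python
def classify_pair(forward_kept, reverse_kept):
    """Decide where a read pair ends up after trimming (illustrative only)."""
    if forward_kept and reverse_kept:
        return "both_surviving"   # written to the two *_paired files
    if forward_kept:
        return "forward_only"     # written to the *_1_unpaired file
    if reverse_kept:
        return "reverse_only"     # written to the *_2_unpaired file
    return "dropped"              # written nowhere


# Counts copied from the Trimmomatic summary posted in the question
total = 25549917
counts = {
    "both_surviving": 23523510,
    "forward_only": 1134600,
    "reverse_only": 409189,
    "dropped": 482618,
}

# Every input pair falls into exactly one category
assert sum(counts.values()) == total

percentages = {name: round(100 * n / total, 2) for name, n in counts.items()}
print(percentages)
```

So about 92% of the pairs are complete after trimming, and fewer than 2% are lost entirely.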

You have to decide on your own whether to use all output files for downstream analysis. It depends on what you are doing. Personally, I would use only the paired reads.

fin swimmer