Question

Trimming paired end RNA-seq with Trimmomatic

9.8 years ago

samantha_jeschonek ▴ 50

Hello --

I've begun pre-processing my paired-end RNA-seq data (run on an Illumina HiSeq).

After running FastQC on my samples, I noticed that some have overrepresented sequences corresponding to adaptors.

I've been trying to use Trimmomatic to remove the adaptors. However, after trimming I get MORE overrepresented sequences than I had before trimming! I'm not sure what's going on.

For instance, in my unprocessed reads, I'll have a single overrepresented sequence corresponding to adapter index 1. Once trimmed and processed by Trimmomatic, I'll have 25 overrepresented sequences, all corresponding to different variants of the adapter index 1 sequence.

Here is my command line:

Code:

TrimmomaticPE -phred33 /R1_001.fastq.gz /R2_001.fastq.gz /R1_pairedout /R1_unpairedout /R2_pairedout /R2_unpairedout ILLUMINACLIP:/TruSeq3-PE.fa:2:30:10 LEADING:5 TRAILING:5 AVGQUAL:20

Any idea what I'm doing wrong? The same thing occurs even if I leave out the ILLUMINACLIP line.

hiseq preprocessing RNA-Seq trimmomatic • 13k views

ADD COMMENT • link updated 2.4 years ago by Ram 43k • written 9.8 years ago by samantha_jeschonek ▴ 50

Hello samantha_jeschonek!

It appears that your post has been cross-posted to another site: http://seqanswers.com/forums/showthread.php?t=44949

This is typically not recommended as it runs the risk of annoying people in both communities.

ADD REPLY • link 9.8 years ago by Pierre Lindenbaum 161k

Whoops, I'm not sure how to delete a post so that it isn't posted in both places!

ADD REPLY • link 9.8 years ago by samantha_jeschonek ▴ 50

score 0 · Answer 1 · 2014-07-14

You are not actually ending up with more bad reads. It is just that FastQC can now identify more cases that looked acceptable before trimming.
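To compare the two runs outside of the FastQC report, one option is to tally exact duplicate sequences yourself. The following is a minimal sketch, not FastQC's actual algorithm: the function names `read_sequences` and `overrepresented` are made up for illustration, and the 0.1% default is only meant to mirror the cutoff that FastQC documents for this module.

```python
from collections import Counter
import gzip


def read_sequences(path):
    """Yield the sequence line of each record in a FASTQ file (plain or .gz).

    Assumes the usual four-line FASTQ layout with no wrapped sequences.
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        for i, line in enumerate(handle):
            if i % 4 == 1:  # second line of every record is the sequence
                yield line.strip()


def overrepresented(seqs, threshold=0.001):
    """Return (sequence, count, fraction) for sequences above `threshold`.

    `threshold` is a fraction of all reads; 0.001 (0.1%) is an assumed
    default chosen to resemble FastQC's reporting cutoff.
    """
    counts = Counter(seqs)
    total = sum(counts.values())
    if total == 0:
        return []
    return [
        (seq, n, n / total)
        for seq, n in counts.most_common()
        if n / total > threshold
    ]
```

Running `overrepresented(read_sequences(path))` on the untrimmed file and on the paired output would show whether the extra entries are genuinely new sequences or shorter variants of the same adapter read.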

In general, the "overrepresented sequences" measure is not all that accurate. It could be that you have a large number of fused adaptors, so that when one adaptor gets cut off, the next one shows up.
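One rough way to test the fused-adaptor idea is to count how many copies of an adapter fragment each read contains. This is only a sketch under stated assumptions: `adapter_copy_histogram` is a hypothetical helper, `adapter_seed` is whatever short adapter prefix you choose to search for, and exact string matching will miss copies that contain sequencing errors.

```python
from collections import Counter


def adapter_copy_histogram(seqs, adapter_seed):
    """Map "copies of adapter_seed in a read" -> "number of such reads".

    Uses exact, non-overlapping matching (str.count), so this is a lower
    bound on the real number of adapter copies per read.
    """
    if not adapter_seed:
        raise ValueError("adapter_seed must be a non-empty string")
    hist = Counter()
    for seq in seqs:
        hist[seq.count(adapter_seed)] += 1
    return hist


if __name__ == "__main__":
    # Toy reads with a placeholder seed, not real adapter sequence.
    reads = ["XXADAPTXXADAPT", "ADAPT", "GGGG"]
    for copies, n_reads in sorted(adapter_copy_histogram(reads, "ADAPT").items()):
        print(copies, n_reads)
```

If a noticeable share of reads falls in the bins for two or more copies, that would be consistent with adaptor concatemers, where clipping one copy exposes the next.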