Question: Trimmomatic did a massive trimming!!!
4.2 years ago by
joe0 wrote:

Hello everyone,

I'm new to RNA-seq data analysis.

I downloaded 6 samples (3 per condition), ran QC on them, and had to trim the adapters and poor-quality reads. The 3 replicates of one condition were trimmed aggressively, losing around 15% of reads on average, while about 7% of reads were discarded from the replicates of the other condition.

Removing 15% of reads will affect my downstream analysis, so are there any recommendations for improving the trimming procedure, or is this normal because of issues in the original raw data?

My Trimmomatic command:

java -jar ~/Desktop/Trimmomatic-0.36/trimmomatic-0.36.jar PE -phred33 SRR11771_1.fastq SRR11771_2.fastq TRMD_SRR11771_1_paired.fastq TRMD_SRR11771_1_unpaired.fastq TRMD_SRR11771_2_paired.fastq TRMD_SRR11771_2_unpaired.fastq ILLUMINACLIP:TruSeq3-PE.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36

My plan is to do an exon-centric analysis using TopHat and STAR as aligners, and MISO, MATS and SUPPA for differential splicing analysis.

Your help is really appreciated.


modified 14 months ago by genetician201610 • written 4.2 years ago by joe0

15% is not massive (unless you have few reads to begin with) :-)

written 4.2 years ago by genomax92k

Your command conflates adapters and quality, so it's impossible to determine why the reads are being trimmed. I suggest you first do adapter-trimming, and then possibly do quality trimming. In general, quality-trimming decreases the accuracy of alignment, though trimming to a low level like Q6 can sometimes be beneficial.
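A minimal sketch of the two-pass idea above, reusing the Trimmomatic steps and file names from the question. The `ADPT_` and `QUAL_` output prefixes and the `trim_pe` helper are hypothetical names chosen for illustration. Trimmomatic reports surviving and dropped read pairs for each run, so splitting the steps should show whether adapters or quality account for the loss. By default the script only prints the commands; set `RUN=1` to execute them.

```shell
#!/bin/sh
# Sketch: adapter clipping and quality trimming as two separate passes.
# JAR path, ADPT_/QUAL_ prefixes and trim_pe are illustrative assumptions.
set -eu

JAR="${JAR:-trimmomatic-0.36.jar}"
RUN="${RUN:-0}"   # 0 = dry run (print only), 1 = execute

trim_pe() {
  # $1, $2 = input mates; $3 = output prefix; rest = Trimmomatic steps
  in1="$1"; in2="$2"; out="$3"; shift 3
  set -- java -jar "$JAR" PE -phred33 "$in1" "$in2" \
    "${out}_1_paired.fastq" "${out}_1_unpaired.fastq" \
    "${out}_2_paired.fastq" "${out}_2_unpaired.fastq" "$@"
  echo "$*"
  if [ "$RUN" = "1" ]; then "$@"; fi
}

# Pass 1: adapters only (MINLEN drops reads left too short after clipping).
trim_pe SRR11771_1.fastq SRR11771_2.fastq ADPT_SRR11771 \
  ILLUMINACLIP:TruSeq3-PE.fa:2:30:10 MINLEN:36

# Pass 2: quality steps only, applied to the pairs that survived pass 1.
trim_pe ADPT_SRR11771_1_paired.fastq ADPT_SRR11771_2_paired.fastq QUAL_SRR11771 \
  LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36
```

Comparing the summary line of each pass shows which step removes most of the reads in the condition with the 15% loss.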

written 4.2 years ago by Brian Bushnell17k

Thank you all for sharing the useful information with me.

I will take into account the different ideas posted here.

written 4.2 years ago by joe0

In RNA-seq data analysis, you have to strike a balance between acceptable quality and minimizing the number of discarded reads. Note that all adapter contamination should be trimmed. I recommend 123Fastq, which combines FastQC and Trimmomatic in a highly interactive graphical user interface. It also adds some improvements to FastQC's QC modules, a k-mer-based approach to adapter removal during trimming, and many other features. Try it yourself:

written 14 months ago by genetician201610

Instead of posting this in multiple old threads, it would be best to write a single, independent tools post. That would be the proper way of announcing your software.

modified 14 months ago • written 14 months ago by genomax92k
4.2 years ago by
United States
igor11k wrote:

You may not realize this, but STAR already performs soft-clipping (or internal trimming), so pre-trimming may not make much of a difference:

STAR performs the so-called local alignment of the read sequence to the genome, as opposed to the end-to-end (semi-global) alignment which is performed by many DNA aligners such as bowtie1. This means that STAR will try to maximize the alignment score by "extending" the alignment towards the end of the reads. However, it will not try to force the "full-length" read alignment from the first to the last base of the read sequence.

(Source link, truncated: !msg/rna-star/uyGEc7lPveg/yJY6hmjt7REJ)
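One way to see how much soft-clipping the aligner actually did is to count alignments whose CIGAR string contains an `S` operation. The sketch below assumes an uncompressed SAM text file (for a BAM file, convert it to SAM text first); `count_softclipped` is a hypothetical helper name, not part of STAR.

```shell
#!/bin/sh
# Sketch: count alignment records with a soft-clip in the CIGAR (column 6).
# Prints "<soft-clipped records> <mapped records>".
count_softclipped() {
  awk -F '\t' '
    /^@/      { next }        # skip SAM header lines
    $6 == "*" { next }        # skip unmapped records (no CIGAR)
              { total++ }
    $6 ~ /S/  { clipped++ }   # CIGAR contains a soft-clip operation
    END       { printf "%d %d\n", clipped, total }
  ' "$1"
}

# Example (file name is an assumption):
# count_softclipped Aligned.out.sam
```

Multi-mapping reads appear once per alignment record, so the counts are per record, not per read. A high soft-clipped fraction on untrimmed input would suggest the aligner is already handling adapter or low-quality ends.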

See some additional discussion on RNA-seq read-trimming here:

written 4.2 years ago by igor11k

thank you for this insight igor

written 14 months ago by steve2.6k
4.2 years ago by
Calvin60 wrote:

Try different trimming tools and compare the results. Sometimes raw data contains many reads like "NNNNNNNNNNNNNNNNNNN", or some reads that lost their mate were moved to a separate (unpaired) file. You may want to use TopHat2 instead of TopHat. In my opinion, removing 15% of rubbish reads won't affect downstream analysis because, based on your command, you are unlikely to lose real information from your raw data.
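To compare tools or settings on equal terms, it helps to compute the removed fraction the same way each time. A minimal sketch, assuming uncompressed FASTQ files with exactly four lines per read; `count_reads` and `percent_removed` are hypothetical helper names.

```shell
#!/bin/sh
# Sketch: percentage of reads removed between a raw and a trimmed FASTQ.
count_reads() {
  # One read per 4 lines (assumes unwrapped, uncompressed FASTQ).
  awk 'END { printf "%d\n", NR / 4 }' "$1"
}

percent_removed() {
  # $1 = raw FASTQ, $2 = trimmed FASTQ
  raw="$(count_reads "$1")"
  kept="$(count_reads "$2")"
  awk -v r="$raw" -v k="$kept" 'BEGIN {
    if (r == 0) { print "NA"; exit }       # avoid division by zero
    printf "%.1f\n", (r - k) * 100 / r
  }'
}

# Example with the file names from the question:
# percent_removed SRR11771_1.fastq TRMD_SRR11771_1_paired.fastq
```

Note that comparing against the paired output counts reads whose mate was dropped as removed, even though they still exist in the unpaired file.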

written 4.2 years ago by Calvin60