Question: Paired-end RNA Seq data: How to deal with unpaired data after trimmomatic
3.9 years ago by
Bonn, Germany
aggregatibacter140 wrote:

Hi everybody,

What is the best practice for dealing with the unpaired data generated by trimming paired-end RNA-Seq data, when only one of the mates makes it through the trimming?

I have seen people recommend using only the remaining paired data (and ignoring the often small unpaired files), but I am afraid of losing crucial data. I could easily process the paired set and the two unpaired sets per sample separately.

My analysis pipeline is

fastqc - trimmomatic - fastqc - STAR - featureCounts - voom/limma

If trying to use all data, at what point would you recommend to put everything together (and how)?

Many thanks!

rna-seq trimming • 2.7k views
modified 3.9 years ago by igor11k • written 3.9 years ago by aggregatibacter140

Hi guys,

thanks for the quick replies. The unpaired reverse reads are next to nothing (0.2% or something), the forward unpaired usually more like 2 - 5%. Does this sound normal to you?

written 3.9 years ago by aggregatibacter140

There is no "normal". Ideally you should not have any. But this is biology and you live with what you have :-)

modified 3.9 years ago • written 3.9 years ago by genomax85k
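For anyone wanting to check figures like the 0.2% and 2 - 5% quoted above on their own samples, here is a minimal Python sketch. It assumes plain four-line FASTQ records (optionally gzipped); the function names are made up for illustration and are not part of Trimmomatic or any other tool.

```python
import gzip


def count_fastq_records(path):
    """Count records in a FASTQ file, assuming exactly 4 lines per record."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        n_lines = sum(1 for _ in handle)
    if n_lines % 4 != 0:
        raise ValueError(f"{path}: line count {n_lines} is not a multiple of 4")
    return n_lines // 4


def unpaired_percentages(n_input_pairs, n_forward_only, n_reverse_only):
    """Express orphaned forward and reverse reads as a percentage of input pairs."""
    if n_input_pairs <= 0:
        raise ValueError("n_input_pairs must be positive")
    return (
        100.0 * n_forward_only / n_input_pairs,
        100.0 * n_reverse_only / n_input_pairs,
    )
```

Running `count_fastq_records` on the raw forward file and on the two unpaired output files gives the three numbers that `unpaired_percentages` expects.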

If you use BBDuk for trimming paired reads, you will not end up with any singletons, which can make the processing easier. Reads will either be retained as pairs or discarded as pairs. In situations where one read is trimmed down to nothing, the pair is discarded if a minimum length restriction is used. If no limitation is set, the read will be trimmed down to a minimum length of 1bp, so it will still be present and the fastq file will be valid and correctly paired, but it will typically be ignored downstream and only its mate will be used (since 1bp reads don't map).

written 3.9 years ago by Brian Bushnell17k
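To make the difference between the two policies concrete, here is a small Python sketch of the bookkeeping only. It is not code from BBDuk or Trimmomatic; the function names and the `min_len` parameter are assumptions for illustration. The first function sorts trimmed pairs into the four outcomes that produce singleton files, the second keeps or discards each pair as a unit.

```python
def split_by_min_length(pairs, min_len):
    """Sort trimmed read pairs into the four outcomes a per-read length filter gives.

    `pairs` is an iterable of (forward_seq, reverse_seq) strings after trimming.
    """
    both, forward_only, reverse_only, dropped = [], [], [], []
    for fwd, rev in pairs:
        fwd_ok = len(fwd) >= min_len
        rev_ok = len(rev) >= min_len
        if fwd_ok and rev_ok:
            both.append((fwd, rev))
        elif fwd_ok:
            forward_only.append(fwd)   # mate was too short: forward singleton
        elif rev_ok:
            reverse_only.append(rev)   # mate was too short: reverse singleton
        else:
            dropped.append((fwd, rev))
    return both, forward_only, reverse_only, dropped


def keep_pairs_only(pairs, min_len):
    """Pair-level policy: keep a pair only if both mates pass, else drop both."""
    return [(f, r) for f, r in pairs if len(f) >= min_len and len(r) >= min_len]
```

With the pair-level policy there are no singleton files to handle downstream, at the cost of discarding the surviving mate of each broken pair.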
3.9 years ago by
United States
kissaj100 wrote:

Chuck it, it is broken. It shouldn't be very much (%-wise). If it is, you have a problem.

written 3.9 years ago by kissaj100
3.9 years ago by
Tao370 wrote:

If you want to keep them, you could put the unpaired reads into a separate singleton file; TopHat accepts singleton input alongside paired-end input. Remember that you should always keep paired reads in the same order in the paired files after QC, because most aligners, including TopHat, recognize read pairs by their order in the files, not by read ID.

written 3.9 years ago by Tao370
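Since pairing depends on file order rather than read ID, it can be worth checking that the two paired files are still in sync after QC. Below is a minimal Python sketch, assuming four-line FASTQ records and read names that match between mates once an optional `/1` or `/2` suffix and any trailing comment are removed. The function names are hypothetical and other naming schemes may need different handling.

```python
import itertools


def read_ids(lines):
    """Yield the read ID from every 4th line of FASTQ text (the '@' header lines)."""
    for i, line in enumerate(lines):
        if i % 4 != 0:
            continue
        header = line.strip()
        if not header.startswith("@"):
            raise ValueError(f"expected '@' header at line {i + 1}, got {header!r}")
        # Drop anything after the first whitespace, then an old-style /1 or /2 suffix.
        parts = header[1:].split()
        name = parts[0] if parts else ""
        if name.endswith(("/1", "/2")):
            name = name[:-2]
        yield name


def first_mismatch(forward_lines, reverse_lines):
    """Return the 1-based index of the first out-of-sync pair, or None if in sync."""
    ids = itertools.zip_longest(read_ids(forward_lines), read_ids(reverse_lines))
    for n, (fwd_id, rev_id) in enumerate(ids, start=1):
        # zip_longest pads the shorter file with None, so a length
        # difference also shows up as a mismatch here.
        if fwd_id != rev_id:
            return n
    return None
```

Both arguments can be open file handles, for example the paired forward and reverse outputs of the trimming step.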
3.9 years ago by
United States
igor11k wrote:

STAR already performs soft-clipping, so you shouldn't need to trim the reads.

written 3.9 years ago by igor11k

I primarily decided to use Trimmomatic because of adapter contamination in the raw data after demuxing.

For what it is worth, I decided to go all the way and use the program to trim bad bases too, basically using the options from the manual.

Does this seem appropriate to you, or would you rather suggest limiting this to adapter removal and letting STAR soft-clip?

written 3.9 years ago by aggregatibacter140