Question: Should adapters for RNASeq be removed before alignment?
3.0 years ago by
ddzhangzz90
United States
ddzhangzz90 wrote:

I have paired-end RNA-Seq data from 8 samples (Illumina HiSeq). My question is whether I should trim the adapters off these reads before alignment. I was given these adapter sequences:

a) TruSeq Universal Adapter

5’ AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT 3'

b) TruSeq Adapter, Index Adapter

5’ GATCGGAAGAGCACACGTCTGAACTCCAGTCACXXXXXXATCTCGTATGCCGTCTTCTGCTTG 3'

If trimming is necessary, since these reads are paired-end, does that mean I should trim adapter (a) from one read of each pair (R1) and adapter (b) from the other (R2)? And which end, 5' or 3', should be trimmed?

Additionally, the reads are 125 bp long. Does that mean the TopHat option "--mate-inner-dist" should be set to 125?

Thanks in advance!

3.0 years ago by
mastal5112.0k
mastal5112.0k wrote:

The 3' ends of the reads should be trimmed.

TopHat's --mate-inner-dist is the distance between the two reads of a pair and depends on the average fragment length. You should be able to get an idea of that value from the sequencing provider or whoever did the library prep.
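
To make the arithmetic concrete, here is a minimal sketch. It assumes the inner distance is the mean fragment (insert) length, without adapters, minus the length of both reads, which is how the TopHat manual describes -r. The function name and the 300 bp fragment size are made up for illustration; use the value from your own library prep.

```python
def mate_inner_distance(mean_fragment_len: int, read_len: int) -> int:
    """Expected gap between the two mates of a pair.

    mean_fragment_len: mean insert size WITHOUT adapters (ask the library prep lab).
    read_len: length of each mate (assumes both mates have the same length).
    """
    return mean_fragment_len - 2 * read_len


# Hypothetical library: 300 bp fragments sequenced as 2 x 125 bp.
print(mate_inner_distance(300, 125))  # 50
```

So the read length itself (125) is not the value to pass. A negative result would simply mean that the two mates overlap.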


Thanks @mastal511. Could you explain why these reads should be trimmed at the 3' end?

written 3.0 years ago by ddzhangzz90

In cases where the insert turns out to be short, you would have read-through into the adapter at the 3' end of the read. You would want to remove that part before alignment.
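
A toy illustration of that read-through, as a sketch only. It assumes mate 1 runs into adapter (b) and mate 2 runs into the reverse complement of adapter (a), both taken from the question. The helper names, the 20-base fragment and the 40-base read length are hypothetical, and the exact-match trimming is far cruder than what Trimmomatic or similar tools do.

```python
from typing import Tuple

# Adapter sequences copied from the question; for (b) only the part before
# the XXXXXX index placeholder is kept.
UNIVERSAL_ADAPTER = "AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT"
INDEX_ADAPTER_START = "GATCGGAAGAGCACACGTCTGAACTCCAGTCAC"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


def simulate_pair(fragment: str, read_len: int) -> Tuple[str, str]:
    """Hypothetical helper: each mate reads the fragment from one side and,
    if the fragment is shorter than the read, continues into the adapter."""
    r1 = (fragment + INDEX_ADAPTER_START)[:read_len]
    r2 = (reverse_complement(fragment)
          + reverse_complement(UNIVERSAL_ADAPTER))[:read_len]
    return r1, r2


def trim_3prime(read: str, adapter: str, min_overlap: int = 5) -> str:
    """Cut an exact adapter match, full or truncated, off the 3' end of a read."""
    idx = read.find(adapter)
    if idx != -1:  # the whole adapter is inside the read
        return read[:idx]
    # Otherwise look for the longest adapter prefix that ends the read.
    for k in range(min(len(adapter), len(read)), min_overlap - 1, -1):
        if read.endswith(adapter[:k]):
            return read[:-k]
    return read


fragment = "TTGACCATGGCATCAGGTCA"  # made-up 20-base insert
r1, r2 = simulate_pair(fragment, read_len=40)
clean_r1 = trim_3prime(r1, INDEX_ADAPTER_START)
clean_r2 = trim_3prime(r2, reverse_complement(UNIVERSAL_ADAPTER))
```

In both mates the adapter bases sit after the insert, that is at the 3' end of the read, which is why that is the end to trim. With a fragment longer than the read length, neither read contains any adapter and `trim_3prime` returns it unchanged.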

written 3.0 years ago by genomax63k
3.0 years ago by
ablanchetcohen1.2k
Canada
ablanchetcohen1.2k wrote:

It depends on the size of the inserts relative to the read length. As explained by genomax2, if the read length is greater than the insert length, you will sequence through the insert into the adapter on the other side of the insert.

At our molecular biology platform, the median fragment size before adding the adapters is 150 bases, and our read length is 50 bases. Since we never sequence into the adapter on the other side of the insert, trimming is useless. It is not detrimental to perform it on our samples, just a waste of computational resources.

You could ask the technicians who prepared the libraries what the fragment size was before the adapters were added. If you cannot obtain this information, you needn't worry. You can just use FastQC to check for the presence of adapters. You can also estimate the distribution of fragment lengths after alignment with Picard tools. To remove the adapters, you can just run Trimmomatic in paired-end mode. Trimmomatic already comes with the adapter sequences.

You should first run FastQC on your samples, which will give you useful quality control information anyway. Then you can run Trimmomatic. If the quality of the bases is good and there are no adapter sequences present, running Trimmomatic will not improve your alignment results, but it will not have much of a detrimental impact either. If there are adapter sequences present, or many low-quality bases, your alignment results will be improved by trimming.
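
If you want a quick sanity check alongside FastQC, something along these lines gives a rough idea of how many reads carry adapter sequence. This is only a sketch: it is not how FastQC does it, the function name and the 12-base probe are my own choices (the probe is the start of adapter (b) from the question), and only exact matches are counted.

```python
from typing import Iterable

# First 12 bases of adapter (b) from the question. The same 12 bases also occur
# near the start of the reverse complement of adapter (a), so one probe is used
# for both mates here.
ADAPTER_PROBE = "GATCGGAAGAGC"


def adapter_fraction(fastq_lines: Iterable[str], probe: str = ADAPTER_PROBE) -> float:
    """Fraction of reads whose sequence contains the probe (exact match only)."""
    total = hits = 0
    for i, line in enumerate(fastq_lines):
        if i % 4 == 1:  # the sequence is the 2nd line of each 4-line FASTQ record
            total += 1
            hits += probe in line
    return hits / total if total else 0.0


# Usage with a hypothetical file name:
#   import gzip
#   with gzip.open("sample_R1.fastq.gz", "rt") as handle:
#       print(adapter_fraction(handle))
```

A fraction close to zero suggests the inserts are mostly longer than the reads; a noticeable fraction means trimming is worth the compute time.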

3.0 years ago by
super60
super60 wrote:

The reads need to be trimmed. To set the distance between the mates, you need to specify the -r/--mate-inner-dist parameter when you run TopHat.
