Should adapters for RNASeq be removed before alignment?
8.1 years ago
ddzhangzz ▴ 90

I have paired-end RNA-Seq data from 8 samples (Illumina HiSeq). My question is whether I should trim the adapters from these reads before alignment. They provided these adapter sequences:

a) TruSeq Universal Adapter

5’ AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT 3'

b) TruSeq Adapter, Index Adapter

5’ GATCGGAAGAGCACACGTCTGAACTCCAGTCACXXXXXXATCTCGTATGCCGTCTTCTGCTTG 3'

If trimming is necessary, since these reads are paired-end, does that mean I trim adapter (a) from one read of the pair (R1) and adapter (b) from the other (R2)? And which end should be trimmed, 5' or 3'?

Additionally, the reads are 125 bp long. Does that mean the TopHat option "--mate-inner-dist" should be set to 125?

Thanks in advance!

RNA-Seq • 5.6k views
8.1 years ago
mastal511 ★ 2.1k

The 3' ends of the reads should be trimmed.

TopHat's mate-inner-dist is the distance between the two reads of a pair, so it depends on the average fragment length and not on the read length alone. You should be able to get an idea of that value from the sequence provider, or from whoever did the library prep.
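
As a rough illustration of that relationship, here is a minimal Python sketch. It assumes the fragment length is the insert size without adapters, and it uses the example from the TopHat manual (300 bp fragments, 50 bp reads, giving -r 200). The function name and the 350 bp value are hypothetical and chosen only for illustration; the real fragment size has to come from the library prep.

```python
def mate_inner_dist(mean_fragment_len: int, read_len: int) -> int:
    """Expected inner distance between the two mates of a pair.

    mean_fragment_len is the insert size WITHOUT adapters (an assumption
    of this sketch). A negative result means the two mates overlap.
    """
    return mean_fragment_len - 2 * read_len


# Example from the TopHat manual: 300 bp fragments, 50 bp reads.
print(mate_inner_dist(300, 50))    # 200

# Hypothetical 350 bp fragments sequenced with 125 bp reads.
print(mate_inner_dist(350, 125))   # 100
```

So for 125 bp reads the value is not 125 by default; it is whatever is left of the fragment after subtracting both reads.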

Thanks @mastal511. Could you explain why these reads should be trimmed at the 3' end?

In cases where the insert turns out to be shorter than the read, you would have read-through into the adapter at the 3' end. You would want to remove that part before alignment.
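
To make the read-through concrete, here is a toy Python sketch using the two adapter sequences from the question. The insert, the read length and the function names are hypothetical, the A-tail base at the junction is ignored, and the exact-match trimming is only a stand-in for what real trimmers do (they tolerate mismatches and partial matches at the read end). It also shows which sequence ends up in which read: R1 runs into the start of the index adapter, while R2 runs into the reverse complement of the universal adapter.

```python
def reverse_complement(seq: str) -> str:
    """Reverse-complement a DNA string (N and X are passed through)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N", "X": "X"}
    return "".join(comp[base] for base in reversed(seq))


# Adapter sequences as given in the question.
UNIVERSAL = "AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT"
INDEX_ADAPTER_START = "GATCGGAAGAGCACACGTCTGAACTCCAGTCAC"  # part before the index


def trim_3prime_adapter(read: str, adapter: str, min_match: int = 8) -> str:
    """Toy trimmer: cut the read at the first exact match of the adapter start."""
    pos = read.find(adapter[:min_match])
    return read if pos == -1 else read[:pos]


insert = "ACGTTGCAAGGCTTACCGTA"   # hypothetical 20-base insert
read_len = 30                     # hypothetical read length, longer than the insert

# R1 reads the insert, then continues into the index adapter.
r1 = (insert + INDEX_ADAPTER_START)[:read_len]
# R2 reads the opposite strand, then continues into the
# reverse complement of the universal adapter.
r2 = (reverse_complement(insert) + reverse_complement(UNIVERSAL))[:read_len]

print(r1)
print(trim_3prime_adapter(r1, INDEX_ADAPTER_START))
print(r2)
print(trim_3prime_adapter(r2, reverse_complement(UNIVERSAL)))
```

In both reads the adapter bases sit at the 3' end, after the insert, which is why the 3' end is the one to trim.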

8.1 years ago
ablanchetcohen ★ 1.2k

It depends on the size of the inserts relative to the read length. As explained by genomax2, if the read length is greater than the insert length, you will sequence through the insert into the adapter on the other side of the insert.

At our molecular biology platform, the median fragment size before the adapters are added is 150 bases, and our read length is 50 bases. Since we never sequence into the adapter on the other side of the insert, trimming is useless. It is not detrimental to perform it on our samples, just a waste of computational resources.
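
The comparison can be written down as a small Python sketch. It assumes idealised reads with no indels, and the function name and the list of insert sizes are hypothetical; the 50-base read length is the one from our platform and the 125-base read length is the one from the question.

```python
def adapter_bases_in_read(insert_len: int, read_len: int) -> int:
    """Number of adapter bases expected at the 3' end of a read.

    Zero whenever the insert is at least as long as the read.
    """
    return max(0, read_len - insert_len)


# Hypothetical insert sizes, compared against 50-base and 125-base reads.
for insert_len in (150, 125, 100, 60):
    print(insert_len,
          adapter_bases_in_read(insert_len, 50),
          adapter_bases_in_read(insert_len, 125))
```

With 50-base reads none of these inserts produces adapter sequence, whereas with 125-base reads every insert shorter than 125 bases does.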

You could ask the technicians who prepared the libraries what the fragment size was before the adapters were added. If you cannot obtain this information, you needn't worry. You can just use FastQC to check for the presence of adapters. You can also estimate the distribution of fragment lengths after alignment with Picard tools. To remove the adapters, you can just run Trimmomatic in paired-end mode. Trimmomatic already ships with the standard Illumina adapter sequences.

You should first run FastQC on your samples, which will give you useful quality control information anyway. Then you can run Trimmomatic. If the quality of the bases is good, and there are no adapter sequences present, running Trimmomatic will not improve your alignment results, but will not have too much of a detrimental impact either. If there are adapter sequences present, or many low-quality bases, your alignment results will be improved after trimming.
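
For reference, a paired-end Trimmomatic call could be assembled along these lines. This Python sketch only builds and prints the command, it does not run it. The jar path, the sample file names and the output naming are hypothetical; the step settings (ILLUMINACLIP 2:30:10, SLIDINGWINDOW 4:15, MINLEN 36) follow the paired-end example in the Trimmomatic documentation and are a starting point, not a recommendation tuned to your data. TruSeq3-PE.fa is one of the adapter files distributed with Trimmomatic.

```python
def trimmomatic_pe_command(r1: str, r2: str, out_prefix: str,
                           jar: str = "trimmomatic.jar",
                           adapters: str = "TruSeq3-PE.fa") -> list:
    """Build (but do not run) a Trimmomatic paired-end command line.

    jar, adapters and the output file names are assumptions of this sketch.
    """
    return [
        "java", "-jar", jar, "PE",
        r1, r2,                                   # input R1 and R2
        f"{out_prefix}_R1_paired.fq.gz", f"{out_prefix}_R1_unpaired.fq.gz",
        f"{out_prefix}_R2_paired.fq.gz", f"{out_prefix}_R2_unpaired.fq.gz",
        f"ILLUMINACLIP:{adapters}:2:30:10",       # adapter clipping
        "SLIDINGWINDOW:4:15",                     # quality trimming
        "MINLEN:36",                              # drop reads that end up too short
    ]


cmd = trimmomatic_pe_command("sample1_R1.fq.gz", "sample1_R2.fq.gz", "sample1")
print(" ".join(cmd))
```

The paired outputs are the ones to pass on to the aligner; the unpaired files hold reads whose mate was discarded.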

8.1 years ago
super ▴ 60

The reads need to be trimmed. When you run TopHat, you need to set the distance between the mates with the -r/--mate-inner-dist parameter.
