Question: Is It Necessary To Remove Adapters In All Orientations When Preprocessing NGS Data?
JacobS (Cleveland, Ohio) wrote, 4.9 years ago:

I'm prepping some Illumina NGS data for downstream analysis. To begin, I want to remove any sequencing/ligation adapters and multiplexing (barcoding) tags. To do this, I am using fastx_clipper, which is part of the FASTX-Toolkit. I've also used Trimmomatic for this in the past.

Example command usage: fastx_clipper -Q33 -a TGGAATTCTCGGGTGCCAAGGAACTCCA-mid_tag_insert-AATCTCGTATGCCGTCTTCTGCTTG -l 14 -M 7 -i Input.fastq -o Output.fastq -v -c

Here is my question: both of these software packages only scan for a single orientation of the adapter you provide within the Illumina reads. However, I find many sequences in all orientations of the adapter, namely: forward, reverse, forward complement, and reverse complement. In the forward orientation, the software detects and trims the adapter in >90% of the reads, but in the other 3 orientations it only detects and trims adapters in ~5% of the reads.

So, is it possible for the adapters to be found in different orientations than the forward sense, or am I seeing artifacts of non-strict adapter matching? Do people usually trim adapters in every possible orientation? Any other suggestions for successfully handling adapters?
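To make the four orientations concrete, here is a minimal Python sketch that derives them from one adapter string and counts how many reads contain each. The function names are hypothetical, and it uses exact substring matching only, so it will not reproduce the mismatch-tolerant counts reported by fastx_clipper or Trimmomatic. The adapter prefix is taken from the command above; the toy reads are made up for illustration.

```python
# Hypothetical helpers; exact substring matching only (no mismatches allowed).
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def orientations(adapter):
    """Return the four orientations of an adapter sequence."""
    adapter = adapter.upper()
    comp = adapter.translate(_COMPLEMENT)
    return {
        "forward": adapter,
        "reverse": adapter[::-1],
        "complement": comp,
        "reverse_complement": comp[::-1],
    }


def count_orientations(reads, adapter):
    """Count the reads that contain each orientation as an exact substring."""
    oriented = orientations(adapter)
    counts = {name: 0 for name in oriented}
    for read in reads:
        read = read.upper()
        for name, seq in oriented.items():
            if seq in read:
                counts[name] += 1
    return counts


# Start of the 3' adapter from the fastx_clipper command above.
adapter_prefix = "TGGAATTCTCGGGTGCCAAGG"
# Toy reads invented for illustration, not real data.
toy_reads = [
    "ACGTTGCAAC" + adapter_prefix,
    "GGCATTACGA" + adapter_prefix[::-1],
    "TTTTGGGGCCCCAAAATTTTGGGGCCCC",
]
print(count_orientations(toy_reads, adapter_prefix))
```

Running the same count with a longer and a shorter adapter prefix is one quick way to see how much of the non-forward signal depends on match length.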


Istvan Albert (University Park, USA) wrote, 4.9 years ago:

Seeing an adapter in the forward orientation is the result of a DNA fragment being shorter than the read length; it is a "normal" occurrence in these cases. The Illumina TruSeq indexed sequencing adapters were designed in such a way that the same adapter sequence will be found on reads coming from both strands.

In that case, adapters present in any other orientation most likely indicate a protocol failure, and the entire read should probably be removed.
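The read-through effect described above can be sketched with a toy model. This is a simplification under stated assumptions, not a description of any particular library prep: the sequencer reads a fixed number of bases from the start of the insert, the same 3' adapter follows the insert on both strands, and the function names and toy fragment are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def simulate_read(fragment, adapter, read_length):
    """Read `read_length` bases from the insert, running into the adapter
    when the insert is shorter than the read (toy model)."""
    template = fragment.upper() + adapter.upper()
    return template[:read_length]


adapter = "TGGAATTCTCGGGTGCCAAGG"  # adapter prefix from the question
short_fragment = "ACGTAC"          # toy insert, shorter than the read
read_length = 10                   # assumed read length for the toy example

top = simulate_read(short_fragment, adapter, read_length)
bottom = simulate_read(revcomp(short_fragment), adapter, read_length)
print(top, bottom)
```

In this model both strands end in the same forward adapter bases, and a fragment longer than the read length yields no adapter sequence at all.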


@Istvan Thanks for the answer. This is what I expected, which makes me unsure of the FASTX results. Perhaps my stringency is simply too loose, only requiring 7 sequential adapter bases to be matched. I'll try a few variations and report back later.

Reply by JacobS, written 4.9 years ago
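As a rough way to judge whether a 7-base requirement is loose, the expected number of purely random hits can be estimated. The sketch below assumes uniform, independent bases and exact matching of any k consecutive adapter bases anywhere in the read; that is a simplification and not a description of how fastx_clipper aligns adapters. The function name and the 36 nt read length are assumptions for illustration.

```python
def expected_chance_hits(read_length, adapter_length, k):
    """Expected number of exact k-base matches between a random read and an
    adapter, assuming uniform independent bases (a crude upper-level estimate)."""
    read_positions = max(read_length - k + 1, 0)
    adapter_positions = max(adapter_length - k + 1, 0)
    return read_positions * adapter_positions * 0.25 ** k


adapter = "TGGAATTCTCGGGTGCCAAGGAACTCCA"  # first part of the adapter in the question
read_length = 36                          # assumed read length, not given in the thread

for k in (7, 10, 12):
    print(k, expected_chance_hits(read_length, len(adapter), k))
```

Each extra required base cuts the per-position chance by a factor of four, so comparing the estimate at several values of k against the observed ~5% gives a feel for how much of it random matching could explain.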

Also, to correct myself: some (not all) Illumina adapters are designed to produce the same sequence on both strands.

Reply by Istvan Albert, written 4.9 years ago

Can you please specify which ones do and which ones don't? Thanks!

Reply by enricoferrero, written 3.6 years ago