Question

Which Trimmomatic adapter file cuts the "Illumina Universal Adapter"?


3.4 years ago

cg1440 ▴ 60

Hi.

I ran a FastQC quality check on my FASTQ sequencing file. It reports the "Illumina Universal Adapter" in my reads, so I need to trim the adapter sequence with Trimmomatic. However, I'm not sure which adapter file I should use.

EDIT: I have paired end reads

trimmomatic fastqc NGS • 13k views

ADD COMMENT • link updated 3.4 years ago by Carlo Yague 8.7k • written 3.4 years ago by cg1440 ▴ 60

score 2 · Answer 1 · 2020-12-12

Hi,

Trimmomatic ships with a folder (adapters) of FASTA files containing Illumina adapter sequences. You can use one of these files to trim the adapters from your reads; which one to use depends on the library prep used for your data. The Trimmomatic manual says this (see the manual):

Using one of the supplied Fasta Files

Illumina adapter and other technical sequences are copyrighted by Illumina, but we have been granted permission to distribute them with Trimmomatic. Suggested adapter sequences are provided for TruSeq2 (as used in GAII machines) and TruSeq3 (as used by HiSeq and MiSeq machines), for both single-end and paired-end mode. These sequences have not been extensively tested, and depending on specific issues which may occur in library preparation, other sequences may work better for a given dataset. As a rule of thumb newer libraries will use TruSeq3, but this really depends on your service provider.

So, if your data come from a TruSeq2 paired-end library, you should use the TruSeq2-PE.fa file. Which library prep did the sequencing facility use for your data? (See the Trimmomatic documentation for how to pass the adapter file to the tool.)
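As a rough illustration of how the adapter file is passed to the tool, here is a minimal Python sketch that only builds (and does not run) a paired-end Trimmomatic command. The jar name, read file names, output prefix and adapter path are all hypothetical placeholders; the `2:30:10` thresholds and `MINLEN:36` are simply the values used in the example in the Trimmomatic documentation, not tuned recommendations.

```python
def trimmomatic_pe_command(jar, r1, r2, adapters, out_prefix):
    """Build (but do not run) a Trimmomatic paired-end command line."""
    # Trimmomatic PE expects four outputs: paired and unpaired reads
    # for the forward and the reverse file.
    outputs = [
        f"{out_prefix}_R1_paired.fq.gz",
        f"{out_prefix}_R1_unpaired.fq.gz",
        f"{out_prefix}_R2_paired.fq.gz",
        f"{out_prefix}_R2_unpaired.fq.gz",
    ]
    # ILLUMINACLIP:<adapter fasta>:<seed mismatches>:
    #              <palindrome clip threshold>:<simple clip threshold>
    clip = f"ILLUMINACLIP:{adapters}:2:30:10"
    return ["java", "-jar", jar, "PE", r1, r2, *outputs, clip, "MINLEN:36"]


# All file names below are placeholders for illustration only.
cmd = trimmomatic_pe_command(
    jar="trimmomatic-0.39.jar",
    r1="sample_R1.fq.gz",
    r2="sample_R2.fq.gz",
    adapters="adapters/TruSeq3-PE-2.fa",
    out_prefix="sample",
)
print(" ".join(cmd))
```

Swapping the `adapters` argument for TruSeq2-PE.fa or a custom FASTA file is the only change needed to try a different adapter set; the list can be handed to `subprocess.run` once the paths are real.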

That said, since you have already identified the overrepresented sequence in the FastQC report, you can instead give Trimmomatic a FASTA file containing exactly the sequence you want to trim (please see the manual link provided above, the third page counting from the last page).

Another option is trim_galore (GitHub repo), which by default tries to detect the adapters automatically and trim them off (see the manual). You can also give trim_galore the specific adapter sequence you want to trim.

I hope this answers your question.

António

score 0 · Answer 2 · 2020-12-12


3.4 years ago

Carlo Yague 8.7k

To complement the answer provided by antonioggsousa, you can find all the contaminant sequences used by FastQC here: https://github.com/csf-ngs/fastqc/blob/master/Contaminants/contaminant_list.txt

If you want to build your own contaminant FASTA file for Trimmomatic, you need to use the reverse complement of the above sequences. So in your case, that would be:

>TruSeq Universal Adapter (reverse complemented)
AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGATCTCGGTGGTCGCCGTATCATT
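The reverse complement above can be reproduced with a few lines of Python, which is handy if you want to do the same for other entries in the contaminant list. This is only a sketch: the helper names and the FASTA record name are made up for illustration, and the input is the TruSeq Universal Adapter sequence in its original orientation.

```python
# Translation table for the complement of each base (N stays N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def to_fasta(records):
    """Format (name, sequence) pairs as FASTA text."""
    return "".join(f">{name}\n{seq}\n" for name, seq in records)


# TruSeq Universal Adapter, original orientation.
universal = "AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT"
universal_rc = reverse_complement(universal)

# The record name is a hypothetical label; any FASTA header would do.
fasta_text = to_fasta([("TruSeq_Universal_Adapter_rc", universal_rc)])
print(fasta_text, end="")
```

Writing `fasta_text` to a file gives something that can be passed to Trimmomatic as a custom adapter file.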

ADD COMMENT • link 3.4 years ago by Carlo Yague 8.7k


Thank you, I will try it

UPDATE: It did not work. I guess I'll stick to TruSeq3-PE-2.fa

ADD REPLY • link 3.4 years ago by cg1440 ▴ 60


What did not work? Did the code crash, or were the reads not properly trimmed?

ADD REPLY • link 3.4 years ago by Carlo Yague 8.7k


After trimming, I still got the same adapter content as in the untrimmed FASTQ file.
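For anyone who wants to put a number on "same adapter content" outside FastQC, a minimal sketch is to count the fraction of reads that contain the start of the adapter before and after trimming. The motif used here is just the first 12 bases of the reverse-complemented adapter posted above; treating it as a proxy for what FastQC reports is an assumption, and the two reads in the demo are invented.

```python
import gzip
import io


def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file as text."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def adapter_fraction(lines, motif):
    """Fraction of reads whose sequence line contains `motif`.

    Assumes standard 4-line FASTQ records (no wrapped sequences).
    """
    total = hits = 0
    for i, line in enumerate(lines):
        if i % 4 == 1:  # second line of each record is the sequence
            total += 1
            if motif in line:
                hits += 1
    return hits / total if total else 0.0


# Tiny invented example: one read with the motif, one without.
demo_text = (
    "@r1\nACGTAGATCGGAAGAGCGT\n+\nIIIIIIIIIIIIIIIIIII\n"
    "@r2\nACGTACGTACGT\n+\nIIIIIIIIIIII\n"
)
print(adapter_fraction(io.StringIO(demo_text), "AGATCGGAAGAG"))
```

On real data, pass `open_fastq("some_file.fq.gz")` (a placeholder name) instead of the in-memory demo and compare the value for the untrimmed and trimmed files.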

ADD REPLY • link 3.4 years ago by cg1440 ▴ 60


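Weird, it usually works for me. By the way, the 3' end of the universal adapter in its original orientation (i.e. the reverse complement of the start of the sequence I posted above) is actually the same as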

>PE1
TACACTCTTTCCCTACACGACGCTCTTCCGATCT
>PrefixPE/1
TACACTCTTTCCCTACACGACGCTCTTCCGATCT

in the TruSeq3-PE-2.fa file. It is possible that in your case, due to a difference in library prep, you need the reverse complement of that, which is included in the TruSeq3-PE-2.fa file. So to take the full universal adapter, that would be:

>TruSeq Universal Adapter (original)
AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT
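The relationship between these sequences is easy to check in Python. The sketch below only uses the sequences quoted in this thread; the variable and function names are arbitrary.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


# Full universal adapter (original orientation) and the PE1 entry
# quoted from TruSeq3-PE-2.fa.
universal = "AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT"
pe1 = "TACACTCTTTCCCTACACGACGCTCTTCCGATCT"

# PE1 is the 3' end of the universal adapter...
pe1_is_suffix = universal.endswith(pe1)
# ...so its reverse complement is the 5' start of the
# reverse-complemented universal adapter.
rc_is_prefix = reverse_complement(universal).startswith(reverse_complement(pe1))

print(pe1_is_suffix, rc_is_prefix)  # True True
```

In other words, TruSeq3-PE-2.fa already covers the part of the universal adapter next to the insert, so a custom file with the full adapter mostly adds the more distal bases.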

ADD REPLY • link 3.4 years ago by Carlo Yague 8.7k