Do you need to define adapter sequences for trimming and QC tools?
6 weeks ago
amy__ ▴ 20

Hi,

I have around 70 samples which have undergone WES.

I have a list of the adapters used for each sample; however, it seems that the sequencing company used a different adapter for every sample. Originally, I was going to pass the adapter sequences to fastp and run it in parallel across the samples. However, because the adapter sequences are all different, I don't think I can do this.

Is there a way to run fastp with its default settings so that it finds the adapters itself, or is it good practice to provide each sample's adapter sequence?

Thanks! Amy

adapters fastp illumina trimming • 430 views

> However, because the adapter sequences are all different, I don't think I can do this.

You absolutely can. In Illumina sequencing there is a core sequence at the beginning of the adapters, as @Istvan showed below. So any adapter sequence that is present will always be at the 3' end of the reads (unless you have adapter dimers). Scanning/trimming programs identify this core sequence and then trim the read from that point on, removing the adapter and everything 3' of it.
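To make the trimming step concrete, here is a minimal Python sketch of the idea, not how fastp or any other trimmer is actually implemented. It assumes exact matching only (real tools tolerate mismatches), and `trim_adapter`, `min_overlap`, and the toy reads are made up for illustration. The core sequence is the one quoted elsewhere in this thread.

```python
def trim_adapter(read: str, core: str, min_overlap: int = 5) -> str:
    """Cut `read` at the adapter core, dropping the core and everything 3' of it."""
    # Case 1: the complete core sequence occurs somewhere in the read.
    pos = read.find(core)
    if pos != -1:
        return read[:pos]
    # Case 2: the read ends partway into the adapter, so only a prefix of the
    # core is visible at the 3' end. Try the longest prefix first.
    start = min(len(core) - 1, len(read))
    stop = max(min_overlap, 1) - 1
    for k in range(start, stop, -1):
        if read.endswith(core[:k]):
            return read[:-k]
    # No adapter evidence: leave the read untouched.
    return read


CORE = "CAAGCAGAAGACGGC"      # core sequence quoted in this thread
insert = "ACGTTGCAACGT"       # toy insert, purely illustrative

full = trim_adapter(insert + CORE + "ATACGAGAT", CORE)  # adapter fully read through
partial = trim_adapter(insert + CORE[:7], CORE)         # read ends inside the adapter
clean = trim_adapter(insert, CORE)                      # no adapter at all
print(full, partial, clean)
```

Because only the shared core is searched for, the same call works no matter which index follows it, which is why per-sample adapter lists are not strictly needed for this step.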


Thanks GenoMax and @Istvan! So, would it be okay to use a scanning/trimming program without giving it these adapter sequences because they will already look for this core sequence?

Or would you give fastp the adapters for each sample separately?

Thank you for your patience!! Amy


First I would establish that the adapter does indeed exist.
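A quick way to check this yourself is to count how many reads contain the adapter core. The sketch below is only a rough screen under stated assumptions: exact matching, standard 4-line FASTQ records, and a search for both the core and its reverse complement because which orientation shows up depends on the library. The function names and the toy records are made up for illustration.

```python
import gzip
from itertools import islice

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence (upper-case A/C/G/T/N only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file in text mode."""
    return gzip.open(path, "rt") if str(path).endswith(".gz") else open(path)


def adapter_fraction(fastq_lines, core: str, max_reads: int = 100_000) -> float:
    """Fraction of the first `max_reads` reads containing `core` or its reverse complement."""
    motifs = (core, revcomp(core))
    total = hits = 0
    # In a 4-line FASTQ record the sequence is the second line of each record.
    for seq in islice(fastq_lines, 1, max_reads * 4, 4):
        total += 1
        if any(m in seq for m in motifs):
            hits += 1
    return hits / total if total else 0.0


CORE = "CAAGCAGAAGACGGC"  # core sequence quoted in this thread
toy_records = [
    "@r1", "ACGTACGT" + CORE + "AT", "+", "I" * 25,
    "@r2", "ACGTACGTACGTACGT", "+", "I" * 16,
    "@r3", "TTGCA" + revcomp(CORE), "+", "I" * 20,
    "@r4", "GGGGCCCCAAAATTTT", "+", "I" * 16,
]
fraction = adapter_fraction(toy_records, CORE)
print(f"{fraction:.2f}")
```

On real data you would pass a file handle instead of the toy list, for example `with open_fastq("sample_R1.fastq.gz") as fh: adapter_fraction(fh, CORE)`, where the file name is a placeholder.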

Many adapters are automatically recognized by fastp and reported in the HTML file that gets generated by default. FastQC also recognizes a number of common adapters and shows them in the report.

Run these tools on a few samples and see what they report.
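If you want to script this, the following Python sketch builds one fastp command per sample, falling back to fastp's own adapter detection when no adapter is supplied. It is a sketch under assumptions: the sample names, file names, and the `fastp_command` helper are hypothetical, and the flags are fastp's options as I understand them, so check them against `fastp --help` for your installed version.

```python
import shlex


def fastp_command(sample, r1, r2=None, adapter_r1=None, adapter_r2=None):
    """Build a fastp command line (as a list) for one sample."""
    cmd = [
        "fastp",
        "-i", r1,
        "-o", f"{sample}.trimmed.R1.fastq.gz",
        "-h", f"{sample}.fastp.html",   # HTML report, including detected adapters
        "-j", f"{sample}.fastp.json",
    ]
    if r2:
        cmd += ["-I", r2, "-O", f"{sample}.trimmed.R2.fastq.gz"]
        if adapter_r1 is None and adapter_r2 is None:
            # Ask fastp to detect adapters for paired-end data itself.
            cmd.append("--detect_adapter_for_pe")
    if adapter_r1:
        cmd += ["--adapter_sequence", adapter_r1]
    if adapter_r2:
        cmd += ["--adapter_sequence_r2", adapter_r2]
    return cmd


# Hypothetical sample sheet: adapter is None where auto-detection should be used.
samples = [
    ("S1", "S1_R1.fastq.gz", "S1_R2.fastq.gz", None),
    ("S2", "S2_R1.fastq.gz", "S2_R2.fastq.gz", "CAAGCAGAAGACGGC"),  # placeholder adapter
]
commands = [fastp_command(name, r1, r2, adapter_r1=adapter)
            for name, r1, r2, adapter in samples]
for cmd in commands:
    print(shlex.join(cmd))
```

Each list can be handed to `subprocess.run(cmd, check=True)`, and because every sample gets its own command, running them in parallel does not require the adapters to be identical.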


See also the similar posts in the right-hand sidebar, for example:

illumina adapter specifying and removing using fastp


Thank you! Will give this a try


I will put in a plug for bbduk.sh from the BBMap suite. It is also easy to use and includes a full set of commercially available adapter sequences in the adapters.fa file in the resources directory of the software bundle.

A guide to using bbduk.sh is available here: https://jgi.doe.gov/data-and-tools/software-tools/bbtools/bb-tools-user-guide/bbduk-guide/


Thank you!!

6 weeks ago

If the samples are already split then usually the adapters are also trimmed out. That is the standard operating protocol.

Note that the indices sit well inside the adapter, so fairly long stretches of constant adapter sequence would be present on either side of them. For example:

CAAGCAGAAGACGGCATACGAGAT[i7]GTCTCGTGGGCTCGG

So even if the adapter were present, all you would need to do is trim at the start of the sequence, CAAGCAGAAGACGGC, as that would match all the other adapters as well.
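A small Python sketch of why this works: filling different indices into the template above leaves the leading constant region untouched, so one prefix matches every adapter. The three index sequences here are invented placeholders, not real i7 indices.

```python
import os

# Adapter layout quoted above, with the sample-specific index as a placeholder.
TEMPLATE = "CAAGCAGAAGACGGCATACGAGAT{i7}GTCTCGTGGGCTCGG"

# Hypothetical 8-base indices, for illustration only.
indices = ["ACGTACGT", "TGCATGCA", "GGTTCCAA"]
adapters = [TEMPLATE.format(i7=index) for index in indices]

# os.path.commonprefix compares strings character by character, so it also
# works as a generic longest-common-prefix helper for sequences.
shared = os.path.commonprefix(adapters)
print(shared)
print(all(a.startswith("CAAGCAGAAGACGGC") for a in adapters))
```

The shared prefix ends exactly where the index begins, so a trimmer given only that constant region never needs to know which index a sample carries.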


> If the samples are already split then usually the adapters are also trimmed out. That is the standard operating protocol.

Not necessarily. Sequencing facilities may simply demultiplex the data and not do any trimming.
