Question: Does adapter removal (with trimmomatic) also remove barcodes?
22 days ago by
salamandra130 wrote:

1- When we remove adapters with Trimmomatic, for example, are we also removing barcodes? Or is there another command to remove barcodes?

2- I heard that some sequencing providers already remove barcodes from their samples before delivering the sequences to the client. Is that the case for Illumina?

3- Does Illumina remove adapters from reads before providing them to the client?

rna-seq barcodes adapters • 156 views
modified 21 days ago by swbarnes23.8k • written 22 days ago by salamandra130

1) Trimmomatic removes adapter sequences based on the sequences you provide. Sorting reads by their "barcodes" (you probably mean sequencing indices) is called demultiplexing and is not supported by Trimmomatic.

2) Illumina is only the company behind the sequencing technology. Whether you receive demultiplexed files depends on the sequencing center you work with. Typically you do. If you download from NCBI or ENA, the data are (as far as I know) always already demultiplexed.

3) Again, it depends on the facility. If you book this service, they might do it, but typically they only demultiplex. Use FastQC to check for adapter content. I always recommend this, not because I distrust the bioinformaticians at the facilities, but because in the end it is you as the analyst who must confirm that the data quality is good, no matter what the facility said.

written 22 days ago by ATpoint5.5k

2) My reads are separated into different files, which might indicate they were demultiplexed. Does this mean the barcodes were also removed from the reads, or are the barcodes still in the sequences even though the reads were split into different files by sample? In the latter case, which tool allows removal of barcode sequences?

3) Is it enough to look at the 'adapter content' module of FastQC? I ask because in some samples there were no warnings in the 'adapter content' module, but the 'overrepresented sequences' module listed some sequences called Illumina index 'something'.

modified 22 days ago • written 22 days ago by salamandra130

See my comment below. Index sequences (barcodes) are moved to the headers of the FASTQ records as part of the demultiplexing process.

It is not enough to just look at the FastQC report. You should always scan (and trim) your data with a proper program such as Trimmomatic. There can be low-level adapter contamination in your sequences that FastQC can miss. FastQC does not look at every read in the dataset when it does QC (only parts of the data are used for the various tests, and that is generally OK).

modified 22 days ago • written 22 days ago by genomax51k
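
To illustrate the point about scanning every read rather than a subset, here is a minimal Python sketch that counts how many reads in a FASTQ file contain an adapter sequence. It is not a replacement for a real trimmer. The adapter prefix `AGATCGGAAGAGC` (the shared start of the common Illumina TruSeq adapters) is an assumption about the library prep, the function names are made up for this example, and only exact matches are counted, whereas real trimmers also handle mismatches and partial adapters at read ends. It also assumes plain 4-line FASTQ records.

```python
import gzip

# Assumption: start of the common Illumina TruSeq adapters; replace with
# the adapter that matches your own library prep.
ADAPTER_PREFIX = "AGATCGGAAGAGC"


def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file as text."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "r")


def count_adapter_reads(lines, adapter=ADAPTER_PREFIX):
    """Return (reads containing the adapter, total reads).

    Assumes 4-line FASTQ records, so the sequence is every 2nd line of 4.
    """
    total = 0
    hits = 0
    for i, line in enumerate(lines):
        if i % 4 == 1:
            total += 1
            if adapter in line:
                hits += 1
    return hits, total


# Tiny in-memory example: the first read carries the adapter, the second does not.
demo = [
    "@r1\n", "ACGTAGATCGGAAGAGCACAC\n", "+\n", "IIIIIIIIIIIIIIIIIIIII\n",
    "@r2\n", "ACGTACGTACGT\n", "+\n", "IIIIIIIIIIII\n",
]
hits, total = count_adapter_reads(demo)
print(f"{hits} of {total} reads contain the adapter")
```

On a real file you would pass the open handle instead of the list, for example `with open_fastq("sample_R1.fastq.gz") as fh: count_adapter_reads(fh)`, where the file name is hypothetical.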

If the samples are not demultiplexed, which tool can be used to demultiplex them?

written 21 days ago by salamandra130

Since you likely don't have access to the original flowcell data folder, you may need to use deML or a tool from the BBMap suite. For the BBMap option, you will need to know which index sequences belong to which samples.

written 21 days ago by genomax51k
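
To make the idea of demultiplexing by index concrete, here is a rough Python sketch that bins reads by the index found at the end of the read header. This is not how deML or the BBMap tools work internally. It assumes the index is the last colon-separated field of the header, the index-to-sample table is hypothetical, and the one-mismatch tolerance is an arbitrary choice for illustration.

```python
from collections import defaultdict


def mismatches(a, b):
    """Hamming distance for equal-length strings, otherwise None."""
    if len(a) != len(b):
        return None
    return sum(x != y for x, y in zip(a, b))


def assign_sample(index, index_to_sample, max_mismatches=1):
    """Return the single sample whose index is close enough, else 'undetermined'."""
    candidates = set()
    for known, sample in index_to_sample.items():
        d = mismatches(index, known)
        if d is not None and d <= max_mismatches:
            candidates.add(sample)
    return candidates.pop() if len(candidates) == 1 else "undetermined"


def demultiplex(records, index_to_sample, max_mismatches=1):
    """Group (header, seq, plus, qual) records by sample."""
    bins = defaultdict(list)
    for header, seq, plus, qual in records:
        # Assumption: the index is the last colon-separated field of the header.
        index = header.rsplit(":", 1)[-1]
        sample = assign_sample(index, index_to_sample, max_mismatches)
        bins[sample].append((header, seq, plus, qual))
    return dict(bins)


# Hypothetical index table and reads, for illustration only.
index_to_sample = {"ATCACG": "sampleA", "CGATGT": "sampleB"}
records = [
    ("@r1 1:N:0:ATCACG", "ACGT", "+", "IIII"),
    ("@r2 1:N:0:CGATGA", "TTTT", "+", "IIII"),  # one mismatch to sampleB's index
    ("@r3 1:N:0:GGGGGG", "CCCC", "+", "IIII"),  # matches neither index
]
bins = demultiplex(records, index_to_sample)
print({sample: len(reads) for sample, reads in bins.items()})
```

Reads whose index is ambiguous or matches nothing go to an "undetermined" bin instead of being forced into a sample.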

What do you mean by barcode, the primer? Most of the time the adapters are already trimmed. This is done during the basecalling/demultiplexing of the raw data. The terminology is always confusing, and I am not sure I am using the right words myself.

modified 22 days ago • written 22 days ago by gb160

I mean the sequences that identify the different samples.

written 22 days ago by salamandra130

Ah OK, clear. If you got them back as separate files, you can open a file and check whether all the sequences start with the same bases.

written 22 days ago by gb160
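
The check suggested above (do all reads in a file start with the same bases?) can be done with a few lines of Python instead of by eye. This is only a sketch: the prefix length `k` is an arbitrary choice, the function name is invented for this example, and it assumes plain 4-line FASTQ records. If one prefix accounts for nearly all reads, an inline barcode may still be present; with standard Illumina index reads you would not expect a shared prefix.

```python
from collections import Counter


def prefix_counts(lines, k=6):
    """Count the first k bases of every read in 4-line FASTQ input."""
    counts = Counter()
    for i, line in enumerate(lines):
        if i % 4 == 1:  # sequence line of each record
            counts[line.strip()[:k]] += 1
    return counts


# Tiny in-memory example with made-up reads.
demo = [
    "@r1\n", "ACGTTTGACC\n", "+\n", "IIIIIIIIII\n",
    "@r2\n", "ACGTGGCATA\n", "+\n", "IIIIIIIIII\n",
    "@r3\n", "TTGACCGGTA\n", "+\n", "IIIIIIIIII\n",
]
counts = prefix_counts(demo, k=4)
prefix, n = counts.most_common(1)[0]
total = sum(counts.values())
print(f"most common prefix {prefix}: {n} of {total} reads")
```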

I would say yes. The manual says "ILLUMINACLIP: Cut adapter and other illumina-specific sequences from the read", so I assume that also covers the Nextera labels etc.

But it is easy to check for yourself. Just run Trimmomatic on a subsample and see whether everything you wanted trimmed off has been trimmed off.

written 22 days ago by gb160

Index sequences (or barcodes) are not the same thing as adapters. Index sequences are always read independently in Illumina tech and are never part of the main reads. ILLUMINACLIP is cutting adapter sequences.

modified 22 days ago • written 22 days ago by genomax51k
21 days ago by
United States
swbarnes23.8k wrote:

No one can answer this without knowing if you did anything custom.

In general, Illumina sample indices are a totally different read. They do not need to be trimmed from the main reads. If anywhere, you will see the index in the read name.

written 21 days ago by swbarnes23.8k
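
To see the index in the read name, as described above, here is a small Python sketch that pulls the index field out of a FASTQ header. It assumes the CASAVA 1.8+ header layout, where the part after the space reads `<read>:<is filtered>:<control number>:<index>`. Files from other pipelines may use a different layout, and some facilities write a sample number in that field instead of the sequence. The example header, including the instrument and flowcell names, is made up.

```python
def index_from_header(header):
    """Return the index field of a CASAVA 1.8+ style FASTQ header, or None."""
    parts = header.rstrip("\n").split(" ", 1)
    if len(parts) < 2:
        return None  # no comment part after the read name
    fields = parts[1].split(":")
    # Expected layout: <read>:<is filtered>:<control number>:<index>
    if len(fields) < 4 or not fields[3]:
        return None
    return fields[3]


# Made-up header for illustration only.
example = "@M00001:12:000000000-ABCDE:1:1101:15589:1332 1:N:0:ATCACG"
print(index_from_header(example))
```

For dual-indexed libraries the field typically holds both indices joined by a `+`, and this function returns that string unchanged.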