Question: Does adapter removal (with trimmomatic) also remove barcodes?
12 weeks ago by
salamandra170
salamandra170 wrote:

1- When we remove adapters with Trimmomatic, for example, are we also removing barcodes? Or is there a separate command for removing barcodes?

2- I heard that some sequencing providers already remove barcodes from the reads before delivering the sequences to the client. Is this the case for Illumina?

3- Does Illumina remove adapters from reads before providing them to the client?

rna-seq barcodes adapters • 286 views
ADD COMMENTlink modified 12 weeks ago by swbarnes24.0k • written 12 weeks ago by salamandra170
3

1) Trimmomatic removes adapter sequences based on the sequences you provide. Assigning reads to samples by their "barcodes" (you probably mean sequencing indices) is called demultiplexing and is not supported by Trimmomatic.

2) Illumina is only the company behind the sequencing technology. Whether you receive demultiplexed files depends on the sequencing center you work with; typically you do. If you download from NCBI or ENA, the data is (as far as I know) always already demultiplexed.

3) Again, it depends on the facility. If you book this service, they might do it, but typically they only demultiplex. Use FastQC to check for adapter content. I always recommend this, not because I distrust the bioinformaticians at the facilities, but because in the end it is you as the analyst who must confirm that the data quality is good, no matter what the facility said.

ADD REPLYlink written 12 weeks ago by ATpoint7.5k

2) My reads are separated into different files, which might indicate they were demultiplexed. Does this mean the barcodes were also removed from the reads, or are the barcodes still in the sequences even though the reads were split into per-sample files? In the latter case, which tool allows removal of barcode sequences?

3) Is it enough to look at the 'Adapter Content' module in FastQC? I ask because in some samples there were no warnings in the 'Adapter Content' module, but the 'Overrepresented sequences' module listed some sequences called Illumina index 'something'.

ADD REPLYlink modified 12 weeks ago • written 12 weeks ago by salamandra170
1

See my comment below. Index sequences (barcodes) are moved to the headers of the FASTQ records as part of the demultiplexing process.

It is not enough to just look at FastQC report. You should always scan (and trim) your data with a proper program like bbduk.sh or trimmomatic. There can be low level contamination of adapters in your sequence that FastQC can miss. FastQC does not look at every read in the dataset as it does QC (only parts of data are used for various tests and that is generally ok).
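To illustrate what scanning every read means, here is a minimal Python sketch that counts exact adapter hits across a whole FASTQ file rather than a subset. It assumes plain four-line FASTQ records and uses `AGATCGGAAGAGC`, the start of the common Illumina TruSeq adapter, as the search string; the function names are made up for this example. Real trimmers such as bbduk.sh or Trimmomatic also handle mismatches and partial adapters at read ends, which this sketch does not.

```python
import gzip
from itertools import islice

# Assumption: the first 13 bases shared by common Illumina TruSeq adapters.
ADAPTER_PREFIX = "AGATCGGAAGAGC"


def read_sequences(path):
    """Yield the sequence line of every record in a four-line FASTQ file."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        # The sequence is the second line of each four-line record.
        for line in islice(handle, 1, None, 4):
            yield line.strip()


def adapter_fraction(sequences, adapter=ADAPTER_PREFIX):
    """Return (reads_with_adapter, total_reads) using exact substring matching."""
    hits = 0
    total = 0
    for seq in sequences:
        total += 1
        if adapter in seq:
            hits += 1
    return hits, total
```

Something like `adapter_fraction(read_sequences("sample_R1.fastq.gz"))` (file name hypothetical) would then give the number of reads with an exact adapter match and the total number of reads examined.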

ADD REPLYlink modified 12 weeks ago • written 12 weeks ago by genomax55k

In case the samples are not demultiplexed, which tool can be used to demultiplex them?

ADD REPLYlink written 12 weeks ago by salamandra170
1

Since you likely don't have access to the original flowcell data folder, you may need to use deML or demuxbyname.sh from the BBMap suite. For the BBMap option you will need to know which index sequences belong to which samples.
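To make the idea concrete, here is a rough Python sketch of demultiplexing by the index stored in the read name. It is not how deML or demuxbyname.sh are implemented; it only shows the principle with exact matching. It assumes the index is the text after the last colon in the header, and the index-to-sample mapping and all function names are hypothetical.

```python
def parse_fastq(lines):
    """Yield (header, sequence, plus, quality) tuples from FASTQ lines."""
    record = []
    for line in lines:
        record.append(line.rstrip("\n"))
        if len(record) == 4:
            yield tuple(record)
            record = []


def index_from_header(header):
    """Return the text after the last colon (assumed to be the index)."""
    return header.rsplit(":", 1)[-1]


def demultiplex(records, index_to_sample):
    """Group records by sample; unknown indices go to 'undetermined'."""
    bins = {}
    for record in records:
        index = index_from_header(record[0])
        sample = index_to_sample.get(index, "undetermined")
        bins.setdefault(sample, []).append(record)
    return bins
```

Dedicated tools additionally tolerate sequencing errors in the index and stream the output to files instead of holding everything in memory, so treat this only as an illustration.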

ADD REPLYlink written 12 weeks ago by genomax55k

What do you mean by barcode, the primer? Most of the time the adapters are already trimmed; this is done during basecalling/demultiplexing of the raw data. The terminology is always confusing, and I am not sure I am using the right words myself.

ADD REPLYlink modified 12 weeks ago • written 12 weeks ago by gb320

I mean the sequences that identify the different samples.

ADD REPLYlink written 12 weeks ago by salamandra170
1

Ah, OK, clear. If you got them back as separate files, you can open a file and check whether all the sequences start with the same bases.
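That check can be scripted. A rough sketch, assuming you already have the read sequences as strings (for example the second line of each four-line FASTQ record); `dominant_prefix` and the default `k=8` are just a name and a value picked for illustration:

```python
from collections import Counter


def prefix_counts(sequences, k=8):
    """Count the first k bases of every read that is at least k bases long."""
    return Counter(seq[:k] for seq in sequences if len(seq) >= k)


def dominant_prefix(sequences, k=8):
    """Return the most common k-base prefix and the fraction of reads carrying it."""
    counts = prefix_counts(sequences, k)
    if not counts:
        return None, 0.0
    prefix, n = counts.most_common(1)[0]
    return prefix, n / sum(counts.values())
```

If one prefix accounts for nearly all reads in a file, an inline barcode (or some other constant sequence) may still be present at the read start; if the prefixes are diverse, there is probably nothing sample-specific left to trim there.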

ADD REPLYlink written 12 weeks ago by gb320

I would say yes. The manual says "ILLUMINACLIP: Cut adapter and other illumina-specific sequences from the read", so I assume that also covers the Nextera labels etc. Manual: http://www.usadellab.org/cms/uploads/supplementary/Trimmomatic/TrimmomaticManual_V0.32.pdf

But it is easy to check for yourself. Just run Trimmomatic on a subsample and see whether everything you wanted trimmed off has been trimmed off.

ADD REPLYlink written 12 weeks ago by gb320
1

Index sequences (or barcodes) are not the same thing as adapters. In Illumina technology, index sequences are always read independently and are never part of the main reads. ILLUMINACLIP cuts adapter sequences.

ADD REPLYlink modified 12 weeks ago • written 12 weeks ago by genomax55k
12 weeks ago by
swbarnes24.0k
United States
swbarnes24.0k wrote:

No one can answer this without knowing if you did anything custom.

In general, Illumina sample indices are sequenced as a totally separate read. They do not need to be trimmed from the main reads. If the index appears anywhere, it will be in the read name.
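For standard Illumina headers (CASAVA 1.8 and later), the part after the space has the layout `read:is_filtered:control_number:index`, so the index can be pulled out and tallied across a file. A small Python sketch follows; the function names are invented for this example, and some pipelines may write a sample number in that field instead of the index sequence.

```python
from collections import Counter


def header_index(header):
    """Return the index field of a CASAVA 1.8+ style header, or None if absent."""
    parts = header.strip().split(" ", 1)
    if len(parts) < 2:
        return None  # no comment field, e.g. reformatted or older headers
    fields = parts[1].split(":")
    # Expected comment layout: read:is_filtered:control_number:index
    if len(fields) < 4 or not fields[3]:
        return None
    return fields[3]


def index_tally(headers):
    """Count how often each index value occurs in an iterable of header lines."""
    return Counter(header_index(h) for h in headers)
```

In a demultiplexed file you would expect one index (or a few near-identical ones, if mismatches were allowed) to dominate the tally.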

ADD COMMENTlink written 12 weeks ago by swbarnes24.0k