Question: DNA barcode sequences for Qiime2 and what they mean
13 months ago, pple2202 wrote:

Hi, all

I'm a newbie to metagenome analysis and Qiime2, and I have been studying them for days. I've searched a lot about DNA barcode sequences, but I still have several unanswered questions.

  1. What exactly does a DNA barcode sequence mean? As far as I know, DNA barcode sequences are for species identification, so when a species' DNA is sequenced, a specific barcode sequence is attached right after the adapter (5'-adapter-barcode sequence-fragment-adapter-3'). Does this mean that all sequence files (fastq) from E. coli, for example, have the same barcode sequence? If so, what does the DNA barcode sequence mean in metagenome sequencing, where we don't know which species are in the sample?

  2. How do I know whether the barcode sequence is in the fastq sequence file? For example, when I try to follow the metagenome analysis process from a dissertation and get the data from NCBI, there is no barcode sequence file, only fastq sequence files, even though the project is a metagenome analysis.

  3. This may be a simple question. When I search for metagenome analyses in NCBI, there are a number of files with similar names (e.g. Bulk_soil.1, Bulk_soil.2, Bulk_soil.3 ...). What does this mean? Was the soil collected at different times or locations? There are no written details about them.


modified 13 months ago by toralmanvar • written 13 months ago by pple2202
  1. Actually, barcodes are needed when you are running multiple samples in one sequencing run. These index sequences (barcode sequences) are used to identify which sample each read comes from.

  2. Generally, the barcode sequence will be in the header of each read in your .fastq file. The last few nucleotides appended to the sequence header are the barcode for that read. Read more in the QIIME documentation.

  3. It depends on the experimental design they used. The samples could have been collected from 3 different places, or from the same place at 3 different time points, or they could be the same soil under 3 different treatments.
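
To check point 2 on your own files, here is a minimal Python sketch. It assumes Illumina-style headers in which the index sequence is the last colon-separated field after the space (for example `1:N:0:ATCACG`); the function names are made up for illustration and are not part of QIIME. Files downloaded from NCBI often have rewritten headers, in which case no barcode will be found.

```python
from collections import Counter

NUCLEOTIDES = set("ACGTN+")  # "+" separates the two halves of a dual index


def header_barcode(header):
    """Return the index sequence at the end of a FASTQ header, or None.

    Assumes an Illumina-style header such as
    '@M01:1:000-A:1:1101:15:13 1:N:0:ATCACG'.
    """
    parts = header.strip().split()
    if len(parts) < 2:
        return None  # no comment field after the read name
    last_field = parts[-1].split(":")[-1]
    if last_field and set(last_field) <= NUCLEOTIDES:
        return last_field
    return None


def count_header_barcodes(fastq_lines):
    """Count barcodes found in the header line of every 4-line FASTQ record."""
    counts = Counter()
    for i, line in enumerate(fastq_lines):
        if i % 4 == 0:  # header lines are lines 0, 4, 8, ...
            barcode = header_barcode(line)
            if barcode is not None:
                counts[barcode] += 1
    return counts
```

If the counter shows many different barcodes, the file is probably still multiplexed; a single barcode, or none at all, suggests it holds one sample.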

written 13 months ago by Nitin Narwade
13 months ago, toralmanvar wrote:

In metagenome sequencing, the DNA barcode is not used for species identification; it distinguishes different samples that are run together in a single sequencing run. During library preparation, a different barcode (a unique 6-8 bp sequence) is added to each sample, and the samples are then pooled and sequenced together (this process is known as multiplexing). This allows many samples to be sequenced in one go, which greatly increases the number of samples analysed in a single run without drastically increasing sequencing cost and time. After sequencing is completed, the reads from all of these samples need to be separated. This separation is done by means of the barcode sequence added to each DNA fragment during library preparation, so that each read is assigned to its sample before data analysis (this process is known as demultiplexing).
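
To make demultiplexing concrete, here is a minimal Python sketch. It assumes the barcode sits at the very start of each read and must match exactly; the barcodes, sample names, and function name are hypothetical. Real demultiplexers also tolerate sequencing errors and may read the barcode from a separate index read or from the header.

```python
def demultiplex(reads, barcode_to_sample, barcode_length):
    """Sort reads into per-sample bins by the barcode at the start of each read.

    reads: iterable of read sequences (strings)
    barcode_to_sample: dict mapping barcode sequence -> sample name
    barcode_length: number of leading bases that form the barcode
    """
    bins = {sample: [] for sample in barcode_to_sample.values()}
    bins["unassigned"] = []
    for seq in reads:
        barcode, insert = seq[:barcode_length], seq[barcode_length:]
        sample = barcode_to_sample.get(barcode)
        if sample is None:
            # Unknown barcode: keep the whole read for inspection
            bins["unassigned"].append(seq)
        else:
            # Known barcode: store the read with the barcode trimmed off
            bins[sample].append(insert)
    return bins


# Hypothetical 6 bp barcodes for two pooled soil samples
barcodes = {"ACGTAC": "soil_1", "TGCATG": "soil_2"}
pooled_reads = ["ACGTACGGGTTT", "TGCATGAAACCC", "NNNNNNGGGCCC"]
per_sample = demultiplex(pooled_reads, barcodes, barcode_length=6)
```

Each sample ends up with only its own reads, which is the state the data is already in when a repository provides one file per sample.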

The data you downloaded from NCBI had already been demultiplexed, based on the barcode information the reads carried in their sequence headers.

Thus, when running a QIIME script that expects barcodes (such as split_libraries_fastq.py in QIIME 1), you can use the parameter --barcode_type 'not-barcoded'

written 13 months ago by toralmanvar