Question

Questions on demultiplexing using Qiime2

5 weeks ago

Minh • 0

Hello, I'm new to bioinformatics and this is the first time I'm using QIIME 2. I'm trying to demultiplex a paired-end raw data set. Here is an example record from my barcodes.fastq.gz:

@M01522:221:000000000-BRWK5:1:1101:20002:1874 1:N:0:CCTTGA
CAGTTCAT
+
8-8C@<@F
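
To make the two candidate barcodes in this record concrete, here is a minimal Python sketch that separates the index field in the header from the sequence line. It assumes the usual Illumina header layout, where the comment after the space is `<read>:<filtered>:<control>:<index>`. The helper name `split_fastq_header` is made up for illustration and is not part of QIIME 2.

```python
def split_fastq_header(header):
    """Split an Illumina-style FASTQ header into (read ID, index field).

    Assumption: the header looks like
    "@<read id> <read>:<filtered>:<control>:<index>".
    """
    body = header[1:] if header.startswith("@") else header
    read_id, _, comment = body.partition(" ")
    fields = comment.split(":")
    # The index is assumed to be the 4th colon-separated field of the comment.
    index = fields[3] if len(fields) >= 4 else ""
    return read_id, index


# The example record from barcodes.fastq.gz shown above.
record = [
    "@M01522:221:000000000-BRWK5:1:1101:20002:1874 1:N:0:CCTTGA",
    "CAGTTCAT",
    "+",
    "8-8C@<@F",
]

read_id, header_index = split_fastq_header(record[0])
barcode_read = record[1]

print("index in header:", header_index, len(header_index))
print("sequence line:  ", barcode_read, len(barcode_read))
```

In this record the header field is 6 nucleotides long and the sequence line is 8 nucleotides long, so they are two different values.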

As I understand the FASTQ format, CCTTGA (6 nucleotides) is where the barcode should be. But in my sample-metadata.tsv file, the barcode is CAGTTCAT (8 nucleotides). Also, when I exported the demultiplex-seqs.qza file after demux to see what my sequences look like, I found that the file names had this form (the barcode went directly into the name):

10_CATGTTGT_L001_R1_001.fastq.gz

And inside the file, it looked like this:

@M01522:221:000000000-BRWK5:1:1101:15205:1891 1:N:0:CCTTGA
ACACCCCTTTCAGTTGGGACTCTTTTGTCGTTACCCCCTTAAGAAGCCCCTCCCAACTACGTTCCAGCAGCCGCTGTTACACGTTGTTGTCCCTCTTTTTCCTTATTTATTGTTCGTAAAGTGCTCGTCGTCGGTTCGTTAATTCGTGTTTTAAACCTCCAGGCTCTTCCTTCATTCTCCCCTCCTTTCTTCTGTGACTTGTTT

The visualization of per-sample sequence counts looks very strange: some samples have a very large number of reads (S17 has 600000+), while others have very few (S6 has 5000+).
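
One way to check whether the uneven counts are already present in the barcode reads themselves would be to tally the sequence lines of barcodes.fastq.gz and compare the totals with the barcodes in sample-metadata.tsv. The sketch below is only an illustration. The function names are hypothetical, and the demo records are made up from sequences quoted in this post, not taken from the real data.

```python
import gzip
from collections import Counter
from itertools import islice


def count_barcodes(lines):
    """Tally the sequence line (2nd of every 4 lines) of a FASTQ stream."""
    return Counter(line.strip() for line in islice(lines, 1, None, 4))


def count_barcodes_in_file(path):
    """Same tally, reading a gzip-compressed FASTQ file such as barcodes.fastq.gz."""
    with gzip.open(path, "rt") as handle:
        return count_barcodes(handle)


# Made-up demo input: three 4-line records with placeholder IDs and qualities.
demo_lines = [
    "@read1 1:N:0:CCTTGA", "CAGTTCAT", "+", "8-8C@<@F",
    "@read2 1:N:0:CCTTGA", "CAGTTCAT", "+", "8-8C@<@F",
    "@read3 1:N:0:CCTTGA", "CATGTTGT", "+", "8-8C@<@F",
]

counts = count_barcodes(demo_lines)
for barcode, n in counts.most_common():
    print(barcode, n)
```

On the real file, `count_barcodes_in_file("barcodes.fastq.gz")` would return the same kind of tally. Barcodes with high counts that are missing from the metadata would point to a mismatch between the two files.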

I don't understand exactly what this CCTTGA is, and I don't know whether something is wrong with my barcodes or sample-metadata files. Please enlighten me! Thank you!

demultiplexing barcode • 196 views
