Hello, I'm new to bioinformatics and this is the first time I'm using Qiime2. I'm working on demultiplexing a paired-end raw data set. Here is an example record from my barcodes.fastq.gz:
@M01522:221:000000000-BRWK5:1:1101:20002:1874 1:N:0:CCTTGA
CAGTTCAT
+
8-8C@<@F
As I understand the FASTQ format, the CCTTGA (6 nucleotides) at the end of the header line is where the barcode should be. But in my sample-metadata.tsv file, the barcode is CAGTTCAT (8 nucleotides). Also, when I exported the demultiplex-seqs.qza file after demux to see what my sequences look like, I found that the files are named like this (the barcode went directly into the name):
10_CATGTTGT_L001_R1_001.fastq.gz
And inside the file, it looked like this:
@M01522:221:000000000-BRWK5:1:1101:15205:1891 1:N:0:CCTTGA
ACACCCCTTTCAGTTGGGACTCTTTTGTCGTTACCCCCTTAAGAAGCCCCTCCCAACTACGTTCCAGCAGCCGCTGTTACACGTTGTTGTCCCTCTTTTTCCTTATTTATTGTTCGTAAAGTGCTCGTCGTCGGTTCGTTAATTCGTGTTTTAAACCTCCAGGCTCTTCCTTCATTCTCCCCTCCTTTCTTCTGTGACTTGTTT
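To make the two candidate barcodes concrete, here is a minimal Python sketch that splits the barcodes.fastq.gz record shown earlier into the index tag at the end of the header and the sequence line. It is not part of Qiime2, and the function name `split_fastq_record` is made up for illustration. It assumes an Illumina-style header whose last colon-separated field is the index.

```python
def split_fastq_record(lines):
    """Return (header_index, sequence) for one 4-line FASTQ record.

    Assumes an Illumina-style header ending in
    "<read>:<filtered>:<control>:<index>"; other header layouts
    would need different parsing.
    """
    header = lines[0].strip()
    sequence = lines[1].strip()
    # The index tag is whatever follows the last colon in the header.
    header_index = header.rsplit(":", 1)[-1]
    return header_index, sequence


# The record from barcodes.fastq.gz quoted at the top of this post.
record = [
    "@M01522:221:000000000-BRWK5:1:1101:20002:1874 1:N:0:CCTTGA",
    "CAGTTCAT",
    "+",
    "8-8C@<@F",
]

header_index, sequence = split_fastq_record(record)
print(header_index, len(header_index))  # CCTTGA 6
print(sequence, len(sequence))          # CAGTTCAT 8
```

For this record, the 8-nucleotide sequence line is the value that appears in sample-metadata.tsv, while the 6-nucleotide tag appears only in the header.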
The visualization of per-sample sequence counts also looks very odd: some samples have a very large number of reads (S17 has 600,000+) while others have very few (S6 has 5,000+).
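One way to check these counts independently of Qiime2 might be to tally the sequence lines in barcodes.fastq.gz directly and compare the totals with the barcodes listed in sample-metadata.tsv. The sketch below uses only the Python standard library. The function name `count_barcodes` and its `limit` parameter are hypothetical, and it assumes each record is exactly four lines.

```python
import gzip
from collections import Counter
from itertools import islice


def count_barcodes(path, limit=None):
    """Count how often each barcode sequence occurs in a gzipped FASTQ file.

    Assumes strict 4-line records, so the sequence is the 2nd line of
    each record. `limit` optionally caps the number of records read.
    """
    counts = Counter()
    with gzip.open(path, "rt") as handle:
        # Take lines 2, 6, 10, ... (0-based indices 1, 5, 9, ...).
        sequences = islice(handle, 1, None, 4)
        if limit is not None:
            sequences = islice(sequences, limit)
        for line in sequences:
            counts[line.strip()] += 1
    return counts


# Example usage (uncomment where the file exists):
# for barcode, n in count_barcodes("barcodes.fastq.gz").most_common(20):
#     print(barcode, n)
```

If the most frequent sequences match the metadata barcodes and their counts are as uneven as the Qiime2 summary, the imbalance is probably already present in the raw data rather than introduced during demultiplexing.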
I don't understand exactly what this CCTTGA is, and I don't know if there is something wrong with my barcodes or sample-metadata files. Please enlighten me! Thank you!