Qiime2 Demux Error
4.4 years ago
zach ▴ 10

Hello there,

I hope there is someone experienced in Qiime2 who could help me please. I am demultiplexing sequence data with the command below:

qiime demux emp-paired \
  --m-barcodes-file 54386_mapping_file.txt \
  --m-barcodes-column BarcodeSequence \
  --p-rev-comp-mapping-barcodes \
  --i-seqs emp-paired-end-sequences.qza \
  --o-per-sample-sequences demux.qza \
  --o-error-correction-details demux-details.qza

But I received the error: Plugin error from demux: Barcode header lines do not contain description fields but sequence header lines do.

Although barcodes.fastq.gz is in the proper 4-line FASTQ format, I noticed that the 4th line (the quality line) is all '@' characters for the first few records, which I thought might be the issue:

@DGZN8DQ1:549:H7C23BCXX:2:1101:1087:1870
CGTCGTATGAAT
+
@@@@@@@@@@@@

I then used 'seqtk' to change the 4th line to '####' to produce these 1st few lines:

@DGZN8DQ1:549:H7C23BCXX:2:1101:1087:1870
CGTCGTATGAAT
+
############

However, I again received the same error about description fields in the sequence but not barcode header lines. How do I identify this problem and rectify it to demux successfully? Thank you in advance for any help!

qiime2 demux • 2.4k views

"Although the barcodes.fastq.gz is in proper 4 line format, I noticed the 4th line is '@@@@' for the 1st few lines which could have been the potential issue:"

'@' is a valid quality character (Q31) in Sanger-format (Phred+33) FASTQ files, so the quality line is not the problem.
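
To check this yourself, here is a minimal Python sketch that decodes a quality line into numeric scores. It assumes the standard Sanger (Phred+33) encoding, and the function name `phred33_scores` is only illustrative.

```python
def phred33_scores(quality_line):
    """Decode a FASTQ quality string, assuming Sanger (Phred+33) encoding."""
    return [ord(ch) - 33 for ch in quality_line]


print(phred33_scores("@@@@"))  # '@' is ASCII 64, so each score is 64 - 33 = 31
print(phred33_scores("####"))  # '#' is ASCII 35, so each score is 35 - 33 = 2
```

Replacing '@' with '#' therefore only lowers the recorded quality from Q31 to Q2. It does not change the header lines that the error message refers to.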


I also noticed another potential and more likely problem. The sequence header is this:

@DGZN8DQ1:549:H7C23BCXX:2:1101:1087:1870 1:N:0:CGTCGTATGAAT

Whereas the barcode header is this:

@DGZN8DQ1:549:H7C23BCXX:2:1101:1087:1870

I think the error probably occurs because the barcode header lines lack the 'description field' (1:N:0:CGTCGTATGAAT) that the sequence header lines contain.

If this is the problem, is there a way I could add the corresponding description fields to all my barcode headers? Thanks and apologies for the long question.
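
One way to confirm the mismatch is to compare the first header line of each file. The Python sketch below is only an illustration: the helper names are hypothetical, and it treats anything after the first whitespace in a header as the "description field", which is a reading of the error message rather than something checked against the q2-demux source.

```python
import gzip


def has_description(header_line):
    """True if a FASTQ header has text after the read ID (split on whitespace)."""
    return len(header_line.strip().split(None, 1)) > 1


def first_header(fastq_gz_path):
    """Return the first line (the first record's header) of a gzipped FASTQ file."""
    with gzip.open(fastq_gz_path, "rt") as handle:
        return handle.readline().rstrip("\n")


# Headers copied from the post above.
seq_header = "@DGZN8DQ1:549:H7C23BCXX:2:1101:1087:1870 1:N:0:CGTCGTATGAAT"
barcode_header = "@DGZN8DQ1:549:H7C23BCXX:2:1101:1087:1870"
print(has_description(seq_header))      # True
print(has_description(barcode_header))  # False
```

On real data, calling `has_description(first_header(path))` on the forward reads and on the barcode file should show whether the two files disagree.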


That is not a description field. It is the index sequence, which is needed to demultiplex the data.

So you have a separate file for the index sequences. You can use this solution to demultiplex the data: A: Demultiplexing Illumina data


Thanks for all the answers, genomax. From the link you provided, I am trying to demultiplex with deML, as it seems like a promising solution. If I successfully get the output in FASTQ format, is there a way to convert the demultiplexed files to .qza format? I need them as .qza for downstream Qiime2 analyses.

I have another, slightly related question. If demultiplexing with Qiime2, would it work if I simply added the corresponding index sequence to every barcode header? If yes, how could I do this using the headers of a separate sequence file? I guess it's similar to what you did in your reply in this forum discussion - http://seqanswers.com/forums/showthread.php?t=74570

E.g., add '1:N:0:CGTCGTATGAAT' to the barcode header, and do the same for the other samples.

I hope you understand my questions. Thank you again!


You need to actually run Qiime2 analysis to get the .qza files.

As for your second question, it may work. Qiime2 has evolved over the years to accept different inputs. Take a look at the input section in the tutorial for more.
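
For FASTQ files that were demultiplexed outside Qiime2, the importing tutorial describes a manifest-based import. As a rough Python sketch, a manifest could be generated as below. The sample IDs and file paths are hypothetical, and the three tab-separated columns assume the paired-end `PairedEndFastqManifestPhred33V2` format.

```python
import os

HEADER = ("sample-id", "forward-absolute-filepath", "reverse-absolute-filepath")


def build_manifest(samples):
    """Return manifest text for {sample_id: (forward_fastq, reverse_fastq)}."""
    rows = ["\t".join(HEADER)]
    for sample_id, (fwd, rev) in sorted(samples.items()):
        # The manifest expects absolute paths to each FASTQ file.
        rows.append("\t".join([sample_id, os.path.abspath(fwd), os.path.abspath(rev)]))
    return "\n".join(rows) + "\n"


# Hypothetical output names; use whatever your demultiplexer actually wrote.
samples = {"sample1": ("demux/sample1_R1.fastq.gz", "demux/sample1_R2.fastq.gz")}
print(build_manifest(samples), end="")
```

The saved manifest would then be passed to `qiime tools import` with `--type 'SampleData[PairedEndSequencesWithQuality]'` and `--input-format PairedEndFastqManifestPhred33V2`. Check the importing tutorial for your Qiime2 version before relying on this.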


Regarding my 2nd question, is there a way I could add the sequence of the index (from forward sequence file) to the barcode file headers for each sample?

I am quite new to the Linux system. I am wondering how I could possibly utilise the 'paste' command (?) to solve this problem? I would be grateful if you could help me with this!
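
A `paste`-based one-liner is possible, but a short Python script may be easier to follow and to check. The sketch below copies the text after the read ID from each forward-read header onto the matching barcode header. It assumes both files hold 4-line FASTQ records in the same order; the function names and file paths are hypothetical, and whether Qiime2 then accepts the rewritten barcode file is not confirmed here.

```python
import gzip
from itertools import zip_longest


def add_descriptions(seq_lines, barcode_lines):
    """Yield barcode FASTQ lines, copying each description from the matching read.

    Assumes both inputs are 4-line FASTQ records in the same order.
    """
    pairs = zip_longest(seq_lines, barcode_lines)
    for i, (seq_line, bc_line) in enumerate(pairs):
        if seq_line is None or bc_line is None:
            raise ValueError("files have different numbers of lines")
        if i % 4 != 0:
            # Sequence, '+', and quality lines pass through unchanged.
            yield bc_line
            continue
        seq_id, _, description = seq_line.rstrip("\n").partition(" ")
        bc_id = bc_line.split()[0]
        if seq_id != bc_id:
            raise ValueError(f"record {i // 4}: {seq_id} != {bc_id}")
        yield f"{bc_id} {description}\n" if description else f"{bc_id}\n"


def rewrite_barcodes(seq_path, barcode_path, out_path):
    """Write a copy of the barcode file whose headers carry the reads' descriptions."""
    with gzip.open(seq_path, "rt") as seqs, \
         gzip.open(barcode_path, "rt") as barcodes, \
         gzip.open(out_path, "wt") as out:
        out.writelines(add_descriptions(seqs, barcodes))
```

A call such as `rewrite_barcodes("forward.fastq.gz", "barcodes.fastq.gz", "barcodes_with_desc.fastq.gz")` writes a new file and leaves the originals untouched. The opposite approach, removing the description from the sequence headers so that both files match, may be equally valid.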
