Question: bcl2fastq creates only Undetermined fastq
23 months ago by
Germany, Heidelberg
e.rempel1000 wrote:

Hi everyone,

I have a question about using bcl2fastq with an Agilent panel. In this guide, the manufacturer suggests using the parameter "--use-bases-mask" with the specification Y*,I8,Y10,Y*.

In this case bcl2fastq throws the error:

            UseBasesMask formatting error. A mask must be specified for each read. Number of reads: 3.

Indeed, according to RunInfo.xml there are 3 reads (with lengths 151, 8 (barcode), and 151, respectively). If we specify three lengths in the bcl2fastq call, e.g.

            …  --use-bases-mask Y*,I8,Y*

then there is no error message and the program creates fastq files. Unfortunately, all of them are “Undetermined” (we get the same output if we don’t specify the --use-bases-mask parameter at all). IMHO this means that bcl2fastq had problems extracting the barcode indices from SampleSheet.csv. We edited SampleSheet.csv according to the suggestion in the above-mentioned guide: “ … [clear] the content in the “I5_index_ID” and “index2” columns”.
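
The formatting error quoted above comes down to a count mismatch: the mask needs one comma-separated entry per read listed in RunInfo.xml. Below is a minimal Python sketch of that check. It is not part of bcl2fastq; the function names are made up for illustration, and the XML fragment is a hypothetical stand-in that assumes the usual `<Read Number=… NumCycles=… IsIndexedRead=…>` layout and mirrors the read lengths described here.

```python
import xml.etree.ElementTree as ET

# Hypothetical RunInfo.xml fragment (assumed layout) mirroring the run
# described in the question: 151 cycles, 8-cycle index, 151 cycles.
EXAMPLE_RUN_INFO = """<RunInfo>
  <Run>
    <Reads>
      <Read Number="1" NumCycles="151" IsIndexedRead="N" />
      <Read Number="2" NumCycles="8" IsIndexedRead="Y" />
      <Read Number="3" NumCycles="151" IsIndexedRead="N" />
    </Reads>
  </Run>
</RunInfo>"""


def reads_in_run(run_info_xml):
    """Return (num_cycles, is_index_read) for each <Read>, in run order."""
    root = ET.fromstring(run_info_xml)
    reads = sorted(root.iter("Read"), key=lambda r: int(r.get("Number")))
    return [(int(r.get("NumCycles")), r.get("IsIndexedRead") == "Y") for r in reads]


def mask_matches_run(mask, run_info_xml):
    """True if the mask has exactly one comma-separated entry per read."""
    return len(mask.split(",")) == len(reads_in_run(run_info_xml))


if __name__ == "__main__":
    for mask in ("Y*,I8,Y10,Y*", "Y*,I8,Y*"):
        print(mask, mask_matches_run(mask, EXAMPLE_RUN_INFO))
```

With a real run, the content of RunInfo.xml would be read from the run folder and passed in as a string. The check only compares counts; it does not validate cycle lengths or mask syntax.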

Any suggestions?

sequencing software error • 2.0k views

Hi e.rempel, you can't demultiplex data you do not have. If you did not sequence a 10 bp molecular barcode, the Y10 cannot work.

Maybe an example from your SampleSheet.csv would help in understanding your problem. I don't understand what you mean by "ALL fastq files Undetermined". Usually you should get 2 Undetermined....fastq.gz files in /Data/Intensities/BaseCalls, one for the forward and one for the reverse reads; all reads whose indexes are not found in your SampleSheet.csv go there. The fastqs you want should be in /Data/Intensities/BaseCalls/<sample_project>/<sample_id>. If you have problems with demultiplexing, having a look at /Data/Intensities/BaseCalls/index.html usually helps with debugging. In /Data/Intensities/BaseCalls/Stats/DemuxSummary...txt (after "Most Popular Unknown Index Sequences") you can see more of the indexes found in your data that could not be mapped to a sample.
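
To pull that list out of a DemuxSummary file programmatically, a small Python sketch like the one below may help. It assumes the section is introduced by a line containing "Most Popular Unknown Index Sequences" and is followed by rows of an index sequence and a hit count separated by whitespace; that layout, the function name, and the sample text are assumptions, so check them against your own file.

```python
def unknown_indexes(demux_summary_text):
    """Return (index_sequence, hit_count) rows found after the section header."""
    rows = []
    in_section = False
    for line in demux_summary_text.splitlines():
        if "Most Popular Unknown Index Sequences" in line:
            in_section = True
            continue
        # Skip everything before the section, comment lines, and blank lines.
        if not in_section or line.startswith("#") or not line.strip():
            continue
        fields = line.split()
        if len(fields) == 2 and fields[1].isdigit():
            rows.append((fields[0], int(fields[1])))
    return rows


# Hypothetical file content, for illustration only.
EXAMPLE_SUMMARY = """### Most Popular Unknown Index Sequences
### Columns: Index_Sequence Hit_Count
GGGGGGGG\t120000
ACGTACGT\t95000
"""

if __name__ == "__main__":
    for sequence, hits in unknown_indexes(EXAMPLE_SUMMARY):
        print(sequence, hits)
```

If the most frequent unknown sequences look like real barcodes rather than poly-G or poly-N, comparing them (and their reverse complements) with the sample sheet is a reasonable next step.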

written 23 months ago by crisime
23 months ago by
United States
GenoMax96k wrote:

I think an error was made in the way these samples were run.

If these are indeed HaloPlex libraries, then you should have run them with 2-D indexes. The HaloPlex method needs index 2 recovered as a separate file. The NNNNNNN shown in place of index 2 in SampleSheet.csv just means that the sequence there is variable (but it still has to be sequenced as index 2, which seems to be completely missing from this run).

Demultiplexing is done using index 1 alone, and a separate file for index 2 is created as part of bcl2fastq demultiplexing. You end up with an odd-looking set of files: R1 --> Read 1, R2 --> Index 2, R3 --> Read 2.

If these are NOT HaloPlex samples, then I would start by using the code I have in this answer (C: Demultiplexing reads with index present in the labels ) to see what the sequencer actually read as index 1 (as opposed to what you provided in SampleSheet.csv), and we can go from there.
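
The linked code is not reproduced in this thread. As a rough stand-in for the same idea, here is a minimal Python sketch that tallies the index sequence recorded at the end of each FASTQ header. It assumes Illumina-style headers ending in `…:<index>`; the function names and the file name in the usage note are hypothetical.

```python
import gzip
from collections import Counter
from itertools import islice


def index_counts(fastq_lines, limit=None):
    """Count the index field (text after the last ':') of each FASTQ header."""
    counts = Counter()
    headers = islice(fastq_lines, 0, None, 4)  # every 4th line is a header
    for header in islice(headers, limit):  # limit=None means all records
        counts[header.rstrip().rsplit(":", 1)[-1]] += 1
    return counts


def top_indexes_in_file(path, n=20, limit=1_000_000):
    """Most common index sequences among the first `limit` reads of a gzipped FASTQ."""
    with gzip.open(path, "rt") as handle:
        return index_counts(handle, limit).most_common(n)
```

For example, `top_indexes_in_file("Undetermined_S0_L001_R1_001.fastq.gz")` (an assumed file name) would list the most frequent index sequences among the first million undetermined reads, which can then be compared with the sample sheet entries.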


