A colleague in the lab asked me to demultiplex a recent NextSeq run. She loaded it with samples prepared from two libraries. One library had single indices and one had dual indices. She also prepared two sample sheets for me to use.
I first ran bcl2fastq as follows with the sample sheet for the single-index samples:
bcl2fastq --no-lane-splitting -R $INPUT_DIR -o $OUTPUT_DIR --sample-sheet $SAMPLE_SHEET
This resulted in paired fastq files (R1 and R2) and two large Undetermined files (I assume these contain the sequences that belong to the dual-index experiment).
I then ran bcl2fastq the same way, but with the sample sheet for the dual-index samples. This time, however, no per-sample fastq files were produced, and all reads ended up in the Undetermined files.
My questions are as follows:
- Is running bcl2fastq twice the best approach to demultiplex this run? Is there a way to combine the sample sheets?
- I believe the second bcl2fastq run should have worked. Or is there a different way to indicate dual-index samples to bcl2fastq? I didn't get any error messages, but maybe the sample sheet was malformed since all the reads ended up in the Undetermined file.
- I took a look at the headers in the Undetermined file and noticed that the barcodes in the headers are almost the same as the forward indices in the sample sheet. Is it easiest to just manually pull these sequences apart?
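Before pulling anything apart by hand, it may help to count which barcodes actually show up in the Undetermined headers. The following is a minimal Python sketch, assuming standard Illumina FASTQ headers in which the index sequence is the last colon-separated field; the function name `tally_header_barcodes` and the `max_reads` cap are illustrative choices, not part of bcl2fastq.

```python
import gzip
import sys
from collections import Counter
from itertools import islice


def tally_header_barcodes(fastq_path, max_reads=1_000_000):
    """Count the index sequences recorded in Illumina FASTQ headers.

    Assumes headers end with "<read>:<filtered>:<control>:<index>",
    e.g. "... 1:N:0:ACGTACGT+TTGCAATG" for a dual-index read.
    """
    opener = gzip.open if str(fastq_path).endswith(".gz") else open
    counts = Counter()
    with opener(fastq_path, "rt") as handle:
        # Every fourth line, starting with the first, is a header line.
        for header in islice(handle, 0, 4 * max_reads, 4):
            barcode = header.rstrip().rsplit(":", 1)[-1]
            counts[barcode] += 1
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # Print the 20 most frequent barcodes, most common first.
    for barcode, n in tally_header_barcodes(sys.argv[1]).most_common(20):
        print(f"{n}\t{barcode}")
```

Comparing the most frequent barcodes with the index columns of the sample sheet should show whether the mismatch is a shifted base, a reverse complement, or something else.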
Thanks for any advice!
Thank you for the feedback!
I tried your suggestion but it did not work. bcl2fastq complained about the asterisk symbol after the 6's.
I took a look at my RunInfo.xml file:
Based on this information, and the fact that the indices my colleague used are 8 bases long, I adjusted the command line option you suggested as follows:
Unfortunately, this still put all the reads into the Undetermined output files.
I'd appreciate any further information you might have!
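For anyone making the same comparison, the read layout in RunInfo.xml can be checked against the index lengths with a short script. This is a minimal Python sketch, assuming the usual `<Read Number=... NumCycles=... IsIndexedRead=... />` elements; the XML string below is a made-up example (not the file from this run), and `read_layout` and `check_index_reads` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

# Made-up RunInfo.xml fragment for illustration only.
EXAMPLE_RUN_INFO = """
<RunInfo>
  <Run>
    <Reads>
      <Read Number="1" NumCycles="151" IsIndexedRead="N" />
      <Read Number="2" NumCycles="8" IsIndexedRead="Y" />
      <Read Number="3" NumCycles="8" IsIndexedRead="Y" />
      <Read Number="4" NumCycles="151" IsIndexedRead="N" />
    </Reads>
  </Run>
</RunInfo>
"""


def read_layout(run_info_xml):
    """Return sorted (read number, cycles, is_index_read) tuples."""
    root = ET.fromstring(run_info_xml)
    return sorted(
        (int(r.get("Number")), int(r.get("NumCycles")), r.get("IsIndexedRead") == "Y")
        for r in root.iter("Read")
    )


def check_index_reads(layout, index_lengths):
    """List mismatches between sequenced index reads and sample sheet indices."""
    index_cycles = [cycles for _, cycles, is_index in layout if is_index]
    problems = []
    if len(index_cycles) != len(index_lengths):
        problems.append(
            f"run has {len(index_cycles)} index read(s), "
            f"sample sheet expects {len(index_lengths)}"
        )
    for i, (cycles, length) in enumerate(zip(index_cycles, index_lengths), 1):
        if cycles < length:
            problems.append(f"index read {i}: {cycles} cycles < {length}-base index")
    return problems


if __name__ == "__main__":
    layout = read_layout(EXAMPLE_RUN_INFO)
    print(layout)
    print(check_index_reads(layout, [8, 8]) or "index reads fit the sample sheet")
```

An empty result only means the read structure is compatible with the index lengths; it says nothing about whether the index sequences themselves are correct.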
Could you post your sample sheet? Seeing it might help to answer. Thanks,
I put it on Dropbox here: https://www.dropbox.com/s/x2kiuy70ht39u3z/NEBvsKAPA-NEB.csv?dl=0
I haven't used the latest version of the bcl2fastq software (2.19), so I haven't tested this myself. According to the manual, though, the data section of the sample sheet has the following columns: Lane,Sample_ID,Sample_Name,Sample_Project,Index,Index2.
In your sample sheet the columns are not given in this order. My guess is that when bcl2fastq looks for the indexes in columns 5 and 6, it can't find them in your sample sheet, and hence everything gets placed in the Undetermined files. Try modifying the sample sheet as per the manual and rerunning.
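A quick way to see what the data section actually contains is to parse it and print the index columns per sample. This is a minimal Python sketch, assuming a comma-separated sample sheet with a `[Data]` section followed by a header row; the column names are the ones listed above, matched case-insensitively, and `data_section` and `summarize_indices` are hypothetical helper names.

```python
import csv


def data_section(sample_sheet_text):
    """Return the rows under [Data] as dicts keyed by the header row."""
    lines = sample_sheet_text.splitlines()
    start = None
    for i, line in enumerate(lines):
        # Section markers may carry trailing commas, e.g. "[Data],,,,".
        if line.strip().rstrip(",").strip().lower() == "[data]":
            start = i + 1
            break
    if start is None:
        raise ValueError("no [Data] section found")
    rows = []
    for line in lines[start:]:
        if line.startswith("["):  # next section begins
            break
        if line.strip(", \t"):  # skip blank or comma-only lines
            rows.append(line)
    return list(csv.DictReader(rows))


def summarize_indices(rows):
    """Return (Sample_ID, Index, Index2) per row, with '' for missing values."""
    summary = []
    for row in rows:
        lowered = {k.strip().lower(): (v or "").strip() for k, v in row.items() if k}
        summary.append(
            (lowered.get("sample_id", ""), lowered.get("index", ""), lowered.get("index2", ""))
        )
    return summary


if __name__ == "__main__":
    import sys

    if len(sys.argv) > 1:
        with open(sys.argv[1], newline="") as handle:
            for sample, i7, i5 in summarize_indices(data_section(handle.read())):
                print(f"{sample}\t{i7} ({len(i7)} nt)\t{i5} ({len(i5)} nt)")
```

Empty index fields, or lengths that differ from the number of index cycles in the run, would be the first things to look for in the output.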
Thank you for taking a look!
I went back to the lab tech and asked her to take a critical look at the indices. She regenerated the file using the Illumina Experiment Manager and now it worked. I believe she messed up the file she gave me the first time around...