bcl2fastq: Could not parse the CSV stream text
3.5 years ago

Also posted on bioinformatics stackexchange.

I am trying to run bcl2fastq to generate FASTQ files from the BCL files of a 10x single-cell experiment run. bcl2fastq fails with the following exception:

https://ibb.co/dOuSxH

I am using the following bash script, generate_fastq.sh, which I wrote myself:

#!/bin/bash

FLOWCELL_DIR="/scratch/nv4e/kipnis/180403_NB501830_0158_AHN3LLBGX5"
OUTPUT_DIR="/scratch/nv4e/kipnis/fastq"
INTEROP_DIR="/scratch/nv4e/kipnis/180403_NB501830_0158_AHN3LLBGX5/InterOp"
SAMPLE_SHEET_PATH="/scratch/nv4e/kipnis/sample_sheet.csv"

bcl2fastq --use-bases-mask=Y26,I8,Y98 --create-fastq-for-index-reads \
  --minimum-trimmed-read-length=8 --mask-short-adapter-reads=8 \
  --ignore-missing-positions --ignore-missing-controls \
  --ignore-missing-filter --ignore-missing-bcls \
  -r 6 -w 6 -R "${FLOWCELL_DIR}" --output-dir="${OUTPUT_DIR}" \
  --interop-dir="${INTEROP_DIR}" --sample-sheet="${SAMPLE_SHEET_PATH}"


So, apparently something is wrong with my sample sheet. I looked into RunInfo.xml and there I see 3 reads:

https://ibb.co/h2WqHH

I used the sample sheet generator: https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/using/bcl2fastq-direct

and got the following file, sample_sheet.csv:

 [Header]
EMFileVersion,4


[Reads]
26
8
98


  [Data]
Lane,Sample_ID,Sample_Name,index,Sample_Project
1,SI-GA-B4_1,17R,ACTTCATA,Chromium_20180406
1,SI-GA-B4_2,17R,GAGATGAC,Chromium_20180406
1,SI-GA-B4_3,17R,TGCCGTGG,Chromium_20180406
1,SI-GA-B4_4,17R,CTAGACCT,Chromium_20180406
1,SI-GA-B5_1,19RL,AATAATGG,Chromium_20180406
1,SI-GA-B5_2,19RL,CCAGGGCA,Chromium_20180406
1,SI-GA-B5_3,19RL,TGCCTCAT,Chromium_20180406
1,SI-GA-B5_4,19RL,GTGTCATC,Chromium_20180406
2,SI-GA-B4_1,17R,ACTTCATA,Chromium_20180406
2,SI-GA-B4_2,17R,GAGATGAC,Chromium_20180406
2,SI-GA-B4_3,17R,TGCCGTGG,Chromium_20180406
2,SI-GA-B4_4,17R,CTAGACCT,Chromium_20180406
2,SI-GA-B5_1,19RL,AATAATGG,Chromium_20180406
2,SI-GA-B5_2,19RL,CCAGGGCA,Chromium_20180406
2,SI-GA-B5_3,19RL,TGCCTCAT,Chromium_20180406
2,SI-GA-B5_4,19RL,GTGTCATC,Chromium_20180406
3,SI-GA-B4_1,17R,ACTTCATA,Chromium_20180406
3,SI-GA-B4_2,17R,GAGATGAC,Chromium_20180406
3,SI-GA-B4_3,17R,TGCCGTGG,Chromium_20180406
3,SI-GA-B4_4,17R,CTAGACCT,Chromium_20180406
3,SI-GA-B5_1,19RL,AATAATGG,Chromium_20180406
3,SI-GA-B5_2,19RL,CCAGGGCA,Chromium_20180406
3,SI-GA-B5_3,19RL,TGCCTCAT,Chromium_20180406
3,SI-GA-B5_4,19RL,GTGTCATC,Chromium_20180406
4,SI-GA-B4_1,17R,ACTTCATA,Chromium_20180406
4,SI-GA-B4_2,17R,GAGATGAC,Chromium_20180406
4,SI-GA-B4_3,17R,TGCCGTGG,Chromium_20180406
4,SI-GA-B4_4,17R,CTAGACCT,Chromium_20180406
4,SI-GA-B5_1,19RL,AATAATGG,Chromium_20180406
4,SI-GA-B5_2,19RL,CCAGGGCA,Chromium_20180406
4,SI-GA-B5_3,19RL,TGCCTCAT,Chromium_20180406
5,SI-GA-B5_4,19RL,GTGTCATC,Chromium_20180406


What is wrong with my .csv? What am I doing wrong?

sequencing fastq bcl • 3.3k views
3.5 years ago
GenoMax 107k

Use the cellranger mkfastq method shown in my previous post to demultiplex the data: C: scRNA-seq data processing from 10X device

cellranger mkfastq --id=my_id \
--run=/path/to/illumina_data_folder \
--csv=samplesheet.csv


This samplesheet is not exactly in the format that bcl2fastq uses but will work with cellranger.


cellranger generated the same error: Could not parse the CSV stream text:

Here is the more detailed error from the generated _stderr file: https://ibb.co/bxiJ4x


Looks like it is a carriage return/line feed difference. You can use dos2unix file.csv to convert CRLF to LF. If dos2unix is not on your system, then you will know what to do.
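
If dos2unix is unavailable, something like the following might do as a rough sketch. It assumes a POSIX shell with grep and tr; the function names count_cr_lines and strip_cr are made up for illustration and are not part of bcl2fastq or cellranger.

```shell
# Hypothetical helpers (names are illustrative only).

# Print how many lines of a file contain a carriage return (CR).
count_cr_lines() {
    cr=$(printf '\r')              # a literal CR; command substitution only strips newlines
    # grep -c exits non-zero when the count is 0, so mask that status.
    grep -c "$cr" "$1" || true
}

# Write a copy of the input file with every CR removed, leaving the original untouched.
strip_cr() {
    tr -d '\r' < "$1" > "$2"
}
```

For example, count_cr_lines sample_sheet.csv reports how many lines are affected, and strip_cr sample_sheet.csv sample_sheet.unix.csv writes a cleaned copy. Running count_cr_lines on the copy is a quick way to confirm the conversion actually took effect before pointing bcl2fastq at it.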


I just got the same error with bcl2fastq on my own project... someone thought it was clever to spell 'naive' with a diaeresis, which put a non-ASCII character in the sample sheet. Since the visible characters look okay in what you posted, it must be a whitespace character, as GenoMax suggested.
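
A stray accented letter like that can be hunted down with grep. This is only a sketch: the function name find_non_ascii_lines is hypothetical, and it assumes a grep that accepts -a (GNU and BSD grep both do).

```shell
# Hypothetical helper (name is illustrative only).

# Print "lineno:line" for every line containing a byte that is neither
# printable ASCII nor whitespace, e.g. the UTF-8 bytes of an accented letter.
find_non_ascii_lines() {
    # LC_ALL=C makes the character classes ASCII-only;
    # -a forces text mode so grep does not answer "Binary file matches".
    LC_ALL=C grep -a -n '[^[:print:][:space:]]' "$1" || true
}
```

Calling find_non_ascii_lines sample_sheet.csv prints nothing when the file is plain ASCII. Note that carriage returns count as whitespace here, so this check does not flag CRLF line endings.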


dos2unix is not installed, so I tried tr -d '\r' < input > output and perl -pi -e 's/\r\n/\n/g' input, both taken from the following thread:

https://unix.stackexchange.com/questions/277217/how-to-install-dos2unix-on-linux-without-root-access

But the error stays the same.


26
98
98


That should be

26
98


correct?


No, it should be

26
8
98


I have 3 reads. I do not understand why the _stderr file shows that, because I am feeding it the correct file. That seems very weird to me.


Why are you modifying the output from the official sample sheet generator?

It should be:

[Reads]
26
98



I changed it to two reads; it still does not work. Removing everything above [Data] gives a sample sheet formatting error. Everything looks fine in the sample sheet, so either I need to find out what is actually wrong with it, perhaps with some kind of parsing or validation program that would tell me which line is bad, or something else is going wrong.
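
A very rough validator along those lines could be sketched with awk. It only checks one thing, namely that every row under [Data] has as many comma-separated fields as the column header row; it is not the check bcl2fastq itself performs, and the function name check_data_field_counts is hypothetical.

```shell
# Hypothetical helper (name is illustrative only).

# Report [Data] rows whose field count differs from the column header row.
# Exit status is non-zero if any mismatch is found.
check_data_field_counts() {
    awk -F',' '
        { sub(/\r$/, "") }                         # ignore a trailing CR for this check
        /^[ \t]*\[Data\]/ { in_data = 1; next }    # section marker, leading blanks tolerated
        /^[ \t]*\[/       { in_data = 0; next }    # any other section ends the data block
        in_data && NF == 0 { next }                # skip empty lines
        in_data && !expected { expected = NF; next }   # first row is the column header
        in_data && NF != expected {
            printf "line %d: expected %d fields, found %d\n", NR, expected, NF
            bad = 1
        }
        END { exit bad + 0 }
    ' "$1"
}
```

Running check_data_field_counts sample_sheet.csv prints one message per suspicious row, with the line number in the file. A clean result here only rules out ragged rows; invisible characters and encoding problems need a separate check.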


Are you able to run the test included in the software (look for the tinyBCL dataset)?

Let's make sure your installation works properly.


Sure, their sample test works perfectly.


Now I am close to getting stumped. So the problem is clearly your samplesheet file. Is there a SampleSheet.csv file in your raw data folder? If so, can you rename it to something else to ensure that cellranger is reading the file you made using their tool?
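
One cautious way to do the rename, as a sketch only: the function name set_aside_sample_sheet and the .bak suffix are arbitrary choices for illustration, and the run folder path is passed in as an argument.

```shell
# Hypothetical helper (name and .bak suffix are illustrative only).

# Rename <run_dir>/SampleSheet.csv to SampleSheet.csv.bak if it exists,
# refusing to clobber an earlier backup.
set_aside_sample_sheet() {
    run_dir=$1
    sheet="$run_dir/SampleSheet.csv"
    if [ ! -f "$sheet" ]; then
        echo "no SampleSheet.csv in $run_dir"
    elif [ -e "$sheet.bak" ]; then
        echo "refusing to overwrite $sheet.bak" >&2
        return 1
    else
        mv "$sheet" "$sheet.bak"
        echo "renamed $sheet to $sheet.bak"
    fi
}
```

It would be called with the run folder, e.g. set_aside_sample_sheet /path/to/illumina_data_folder, and the rename can be undone with a plain mv once the test is finished.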

Can you also verify what @swbarnes2 commented on: C: bcl2fastq: Could not parse the CSV stream text

You can also contact 10x tech support to see if they have a solution.


Oh my god, thank you so much! I cannot tell you how much time this saved me; I actually made this account just in case you see this at some point. This was a spreadsheet that never touched a Windows machine, save for a Microsoft file sharing server, which apparently was enough to corrupt it. None of the Unix-based software I used saw an issue until bcl2fastq.

3.5 years ago

Those ^M characters at the end of the lines in your error output... those are carriage returns, which are whitespace characters. That's probably what's messing up the parser.
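
To see whether the sample sheet itself still carries them, the invisible characters can be made visible. A small sketch, assuming a cat that supports -v (GNU and BSD cat do); the function name show_invisible is hypothetical.

```shell
# Hypothetical helper (name is illustrative only).

# Print the first N lines of a file (default 10) with non-printing
# characters made visible: cat -v shows a carriage return as ^M.
show_invisible() {
    cat -v "$1" | head -n "${2:-10}"
}
```

For instance, show_invisible sample_sheet.csv 5 prints the first five lines; any line ending in ^M still has a Windows-style line ending.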


He said he fixed that in one of the comments.