Bcl2fastq conversion according to samples (demultiplexing)
2.5 years ago

Hello,

I am running bcl2fastq on my RNA-Seq data to demultiplex it, but I am getting the results per lane rather than per sample.

My code:

sudo  bcl2fastq --input-dir ./Data/Intensities/BaseCalls -R ./ --no-lane-splitting --sample-sheet ./SampleSheet.csv


My sample sheet:

[Header],,,,
Date,2019-04-12,,,
Workflow,GenerateFASTQ,,,
Application,FASTQOnly,,,
Assay,TruSeq,,,
Description,,,,
Chemistry,,,,
,,,,
,,,,
72,,,,
72,,,,
,,,,
,,,,
[Data],,,,
Lane,Sample_ID,Sample_Name,index,Sample_project
1,F0,0_DPI_MOCK,AGTCAA,Anjali
1,F2,2_DPI_MOCK,AGTTCC,Anjali
1,F5,5_DPI_MOCK,ATGTCA,Anjali
1,F7,7_DPI_MOCK,CCGTCC,Anjali
2,VF0,0_DPI_INFECTED,CGATGT,Anjali
2,VF2,2_DPI_INFECTED,TGACCA,Anjali
2,VF5,5_DPI_INFECTED,ACAGTG,Anjali
2,VF7,7_DPI_INFECTED,GCCAAT,Anjali
2,VF9,9_DPI_INFECTED,CAGATC,Anjali
2,VF10,10_DPI_INFECTED,CTTGTA,Anjali
3,F9,9_DPI_MOCK,GTCCGC,Anjali
3,F10,10_DPI_MOCK,GTGAAA,Anjali
3,F0,0_DPI_MOCK,AGTCAA,Anjali
3,F2,2_DPI_MOCK,AGTTCC,Anjali
4,CphiX,CONTROL,,Anjali
5,F5,5_DPI_MOCK,ATGTCA,Anjali
5,F7,7_DPI_MOCK,CCGTCC,Anjali
5,F9,9_DPI_MOCK,GTCCGC,Anjali
5,F10,10_DPI_MOCK,GTGAAA,Anjali
6,VF0,0_DPI_INFECTED,CGATGT,Anjali
6,VF2,2_DPI_INFECTED,TGACCA,Anjali
6,VF5,5_DPI_INFECTED,ACAGTG,Anjali
6,VF7,7_DPI_INFECTED,GCCAAT,Anjali
7,VF9,9_DPI_INFECTED,CAGATC,Anjali
7,VF10,10_DPI_INFECTED,CTTGTA,Anjali
7,VF0,0_DPI_INFECTED,CGATGT,Anjali
7,VF2,2_DPI_INFECTED,TGACCA,Anjali
8,VF5,5_DPI_INFECTED,ACAGTG,Anjali
8,VF7,7_DPI_INFECTED,GCCAAT,Anjali
8,VF9,9_DPI_INFECTED,CAGATC,Anjali
8,VF10,10_DPI_INFECTED,CTTGTA,Anjali


Thank you.

RNA-Seq next-gen bcl2fastq demultiplexing • 1.6k views

It doesn't help that your columns are in a non-standard order. It's quite likely that that broke things. They should be:

Lane,Sample_ID,Sample_Name,index,Sample_project


Thank you, Devon.

I have updated my SampleSheet.csv.


What did you get? What did you expect? There is a [Data] section header missing. Did you try to find out what's going on from the stdout output?
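One way to check whether the sheet has a [Data] header that a parser can find is to read it yourself. Below is a minimal Python sketch, assuming the section header is spelled exactly [Data] (trailing commas allowed) and that the next non-empty line holds the column names. read_data_section is a hypothetical helper written for this thread, not part of bcl2fastq, and it does not reproduce bcl2fastq's own validation.

```python
import csv
import io


def read_data_section(sheet_text):
    """Return the rows under the [Data] header as a list of dicts.

    Hypothetical helper: raises ValueError if no [Data] line is found.
    """
    lines = sheet_text.splitlines()
    start = None
    for i, line in enumerate(lines):
        # The header may carry trailing commas, e.g. "[Data],,,,"
        if line.strip().rstrip(",") == "[Data]":
            start = i + 1
            break
    if start is None:
        raise ValueError("no [Data] section header found")

    body = []
    for line in lines[start:]:
        stripped = line.strip()
        if stripped.startswith("["):
            break  # another section begins
        if stripped.strip(",") == "":
            continue  # skip blank or comma-only lines
        body.append(line)
    # First kept line is treated as the column header
    return list(csv.DictReader(io.StringIO("\n".join(body))))


sheet = """[Header],,,,
Date,2019-04-12,,,
[Data],,,,
Lane,Sample_ID,Sample_Name,index,Sample_project
1,F0,0_DPI_MOCK,AGTCAA,Anjali
4,CphiX,CONTROL,,Anjali
"""
for row in read_data_section(sheet):
    print(row["Lane"], row["Sample_ID"], row["index"])
```

If this raises ValueError on your real file, the [Data] line is missing or misspelled, which would be worth fixing before rerunning the conversion.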


Dear sklages, I am trying to convert the raw data (BCL files) from Illumina GAIIx into fastq files. I want the fastq files split by sample, but I am getting them split by lane (i.e. 16 fastq files, R1 and R2 for each lane). I have also updated my SampleSheet.csv and rerun the program.

Thank you.

2.5 years ago
GenoMax 108k

I am trying to convert the raw data (BCL files) from Illumina GAIIx into fastq files.

That does not sound right. GAIIx was one of the older Illumina sequencers and it never had BCL files. You must surely be using data from a new machine.

If you want sample-level files across all lanes, use identical names in both Sample_ID and Sample_Name, e.g. 0_DPI_INFECTED if that sample is in a pool run on all lanes.

6,0_DPI_INFECTED,0_DPI_INFECTED,CGATGT,Anjali
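To see which samples this applies to, it can help to list every Sample_ID that shows up in more than one lane. A small Python sketch, assuming the [Data] rows have already been read into dicts with Lane and Sample_ID keys (for example with csv.DictReader); lanes_per_sample is a hypothetical helper name:

```python
from collections import defaultdict


def lanes_per_sample(rows):
    """Map each Sample_ID to the list of lanes it appears in, in sheet order."""
    lanes = defaultdict(list)
    for row in rows:
        lanes[row["Sample_ID"]].append(row["Lane"])
    return dict(lanes)


# A few rows taken from the sample sheet in the question
rows = [
    {"Lane": "1", "Sample_ID": "F0"},
    {"Lane": "2", "Sample_ID": "VF0"},
    {"Lane": "3", "Sample_ID": "F0"},
    {"Lane": "6", "Sample_ID": "VF0"},
    {"Lane": "7", "Sample_ID": "VF0"},
]
multi_lane = {s: l for s, l in lanes_per_sample(rows).items() if len(l) > 1}
print(multi_lane)  # {'F0': ['1', '3'], 'VF0': ['2', '6', '7']}
```

Whether those repeated IDs should share one name or get distinct names depends on whether they are the same library or separate replicates.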


Sorry if it doesn't sound right.

Thank you for the advice, I will try this.

But our samples were not pooled; we used biological triplicates for each sample, e.g. 0_DPI_INFECTED has 3 biological replicates in 3 different lanes (lanes 2, 6 and 7).

Sorry, I am not good with technical terms.


If your samples weren't pooled you couldn't have more than one on a lane.


We have each sample in 3 different lanes, e.g. the 0_DPI_INFECTED samples are in lanes 2, 6 and 7.


If those are biological replicates then you would want individual files for them from the three lanes.

Think of it this way. If Sample_1 ran in three lanes as part of a pool (technical replicates), then you can generate a single sample-level file for that sample by setting the identical name Sample_1 for those three lanes. If Sample_1_Rep1, Sample_1_Rep2 and Sample_1_Rep3 ran in three separate lanes, you would want separate files for those biological replicates.


Yes, you are correct. This is what I want to do for each sample. Sorry, I was not very clear in my question.


Then edit your sample sheet so it looks something like this (just showing one sample below):

2,0_DPI_INFECTED_Rep1,0_DPI_INFECTED_Rep1,CGATGT,Anjali
6,0_DPI_INFECTED_Rep2,0_DPI_INFECTED_Rep2,CGATGT,Anjali
7,0_DPI_INFECTED_Rep3,0_DPI_INFECTED_Rep3,CGATGT,Anjali


--no-lane-splitting is not useful if you don't have technical reps or if your sample did not run in more than one lane.
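The per-replicate renaming shown above can be scripted rather than typed by hand. A minimal Python sketch, assuming every repeated Sample_ID in the sheet is a separate biological replicate and that rows are listed in lane order; add_replicate_suffix and the _RepN naming are illustrative choices, not anything bcl2fastq requires:

```python
from collections import Counter


def add_replicate_suffix(rows):
    """Return copies of rows with _RepN appended to Sample_ID and Sample_Name.

    N counts how many times that Sample_ID has been seen so far, so the
    first occurrence becomes _Rep1, the second _Rep2, and so on.
    """
    seen = Counter()
    renamed = []
    for row in rows:
        base = row["Sample_ID"]
        seen[base] += 1
        new_name = f"{base}_Rep{seen[base]}"
        new_row = dict(row)  # leave the input row untouched
        new_row["Sample_ID"] = new_name
        new_row["Sample_Name"] = new_name  # keep both columns identical
        renamed.append(new_row)
    return renamed


rows = [
    {"Lane": "2", "Sample_ID": "0_DPI_INFECTED", "Sample_Name": "0_DPI_INFECTED", "index": "CGATGT"},
    {"Lane": "6", "Sample_ID": "0_DPI_INFECTED", "Sample_Name": "0_DPI_INFECTED", "index": "CGATGT"},
    {"Lane": "7", "Sample_ID": "0_DPI_INFECTED", "Sample_Name": "0_DPI_INFECTED", "index": "CGATGT"},
]
print([r["Sample_ID"] for r in add_replicate_suffix(rows)])
# ['0_DPI_INFECTED_Rep1', '0_DPI_INFECTED_Rep2', '0_DPI_INFECTED_Rep3']
```

Samples that appear only once also get a _Rep1 suffix here; drop that if you would rather leave single-lane samples unchanged.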


I have updated my SampleSheet.csv file:

[Header],,,,
Date,2019-04-12,,,
Workflow,GenerateFASTQ,,,
Application,FASTQOnly,,,
Assay,TruSeq,,,
Description,,,,
Chemistry,,,,
,,,,
,,,,
72,,,,
72,,,,
,,,,
,,,,
[Data],,,,
Lane,Sample_ID,Sample_Name,index,Sample_project
1,F0_rep_1,F0_rep_1,AGTCAA,Anjali
1,F2_rep_1,F2_rep_1,AGTTCC,Anjali
1,F5_rep_1,F5_rep_1,ATGTCA,Anjali
1,F7_rep_1,F7_rep_1,CCGTCC,Anjali
2,VF0_rep_1,VF0_rep_1,CGATGT,Anjali
2,VF2_rep_1,VF2_rep_1,TGACCA,Anjali
2,VF5_rep_1,VF5_rep_1,ACAGTG,Anjali
2,VF7_rep_1,VF7_rep_1,GCCAAT,Anjali
2,VF9_rep_1,VF9_rep_1,CAGATC,Anjali
2,VF10_rep_1,VF10_rep_1,CTTGTA,Anjali
3,F9_rep_1,F9_rep_1,GTCCGC,Anjali
3,F10_rep_1,F10_rep_1,GTGAAA,Anjali
3,F0_rep_2,F0_rep_2,AGTCAA,Anjali
3,F2_rep_2,F2_rep_2,AGTTCC,Anjali
4,CphiX,CphiX,,Anjali
5,F5_rep_2,F5_rep_2,ATGTCA,Anjali
5,F7_rep_2,F7_rep_2,CCGTCC,Anjali
5,F9_rep_2,F9_rep_2,GTCCGC,Anjali
5,F10_rep_2,F10_rep_2,GTGAAA,Anjali
6,VF0_rep_2,VF0_rep_2,CGATGT,Anjali
6,VF2_rep_2,VF2_rep_2,TGACCA,Anjali
6,VF5_rep_2,VF5_rep_2,ACAGTG,Anjali
6,VF7_rep_2,VF7_rep_2,GCCAAT,Anjali
7,VF9_rep_2,VF9_rep_2,CAGATC,Anjali
7,VF10_rep_2,VF10_rep_2,CTTGTA,Anjali
7,VF0_rep_3,VF0_rep_3,CGATGT,Anjali
7,VF2_rep_3,VF2_rep_3,TGACCA,Anjali
8,VF5_rep_3,VF5_rep_3,ACAGTG,Anjali
8,VF7_rep_3,VF7_rep_3,GCCAAT,Anjali
8,VF9_rep_3,VF9_rep_3,CAGATC,Anjali
8,VF10_rep_3,VF10_rep_3,CTTGTA,Anjali


But still not getting the desired result.

My code:

bcl2fastq --input-dir ./Data/Intensities/BaseCalls -R ./ --sample-sheet ./SampleSheet.csv --create-fastq-for-index-reads


But still not getting the desired result.

We can't read your mind or see what files you are getting. Can you provide a listing of the fastq files (ls -1 *.fastq.gz) and explain why that is not the result you want?


Output of

ls -1 *.fastq.gz

Undetermined_S0_L001_I1_001.fastq.gz
Undetermined_S0_L001_R1_001.fastq.gz
Undetermined_S0_L001_R2_001.fastq.gz
Undetermined_S0_L002_I1_001.fastq.gz
Undetermined_S0_L002_R1_001.fastq.gz
Undetermined_S0_L002_R2_001.fastq.gz
Undetermined_S0_L003_I1_001.fastq.gz
Undetermined_S0_L003_R1_001.fastq.gz
Undetermined_S0_L003_R2_001.fastq.gz
Undetermined_S0_L005_I1_001.fastq.gz
Undetermined_S0_L005_R1_001.fastq.gz
Undetermined_S0_L005_R2_001.fastq.gz
Undetermined_S0_L006_I1_001.fastq.gz
Undetermined_S0_L006_R1_001.fastq.gz
Undetermined_S0_L006_R2_001.fastq.gz
Undetermined_S0_L007_I1_001.fastq.gz
Undetermined_S0_L007_R1_001.fastq.gz
Undetermined_S0_L007_R2_001.fastq.gz
Undetermined_S0_L008_I1_001.fastq.gz
Undetermined_S0_L008_R1_001.fastq.gz
Undetermined_S0_L008_R2_001.fastq.gz


These are split by lane, but I want separate files for my samples.


Are you using Illumina Experiment Manager to make the sample sheet? If not, you may want to give that a try (note: it is a Windows-only application).

A properly formatted SampleSheet.csv has more fields than you have specified. It should have the following in the Data part (this is a random example from Illumina Experiment Manager):

[Data]
Lane,Sample_ID,Sample_Name,Sample_Plate,Sample_Well,I7_Index_ID,index,Sample_Project


Sorry, but I have confirmed that my data is from an Illumina GAIIx system. IEM is not compatible with the GAIIx. Thank you.


It's highly likely that whoever told you that was wrong. But in the unlikely event that they're correct you'll have to use an old version of bcl2fastq, since no recent versions are compatible with a GAIIx.


Are you getting the project directories as well? There should be an Anjali directory with a subdirectory for each library.
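Since ls -1 *.fastq.gz only lists the top level, a recursive listing shows whether per-sample files are sitting in a project subdirectory. A short Python sketch using only the standard library; fastq_by_directory is a hypothetical helper, and the directory you pass in is whatever location your bcl2fastq run wrote to:

```python
from collections import defaultdict
from pathlib import Path


def fastq_by_directory(output_dir):
    """Group every *.fastq.gz under output_dir by its directory.

    Keys are directory paths relative to output_dir ('.' for the top
    level); values are lists of file names found in that directory.
    """
    root = Path(output_dir)
    groups = defaultdict(list)
    for path in sorted(root.rglob("*.fastq.gz")):
        rel_dir = path.parent.relative_to(root).as_posix()
        groups[rel_dir].append(path.name)
    return dict(groups)
```

Calling fastq_by_directory(".") from the output directory and printing the result should show the Undetermined files under '.' and, if demultiplexing worked, sample files under a key starting with the project name (Anjali in this sheet).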


Yes, you are right, I will look into them. Thank you.