Question: Bcl to fastq Conversion Problem
BioRyder160 wrote:

Hello,

Below is the summary of a BCL-to-FASTQ conversion run with bcl2fastq v2.16.0.10. In the output folder, only lanes 4 and 5 have R1 and R2 reads (18 GB); the R1 and R2 files for all remaining lanes (1, 2, 3, 6, 7, 8) have zero size. However, the last four columns of the lane summary show that data were generated for all lanes. Can anyone tell me why the data are missing while the report shows that data were generated?

 

Flowcell Summary

Clusters (Raw) | Clusters (PF) | Yield (MBases)
3,861,446,400 | 2,333,759,444 | 704,795

Lane Summary (columns 2-5: raw data; columns 6-10: filtered data)

Lane | # Clusters | % of the lane | % Perfect barcode | % One mismatch barcode | Clusters | Yield (Mbases) | % PF Clusters | % >= Q30 bases | Mean Quality Score
1 | 482,680,800 | 100.00 | 0.00 | 0.00 | 315,981,885 | 95,427 | 65.46 | 87.52 | 37.35
2 | 482,680,800 | 100.00 | 0.00 | 0.00 | 294,018,812 | 88,794 | 60.91 | 87.41 | 37.31
3 | 482,680,800 | 100.00 | 0.00 | 0.00 | 306,362,193 | 92,521 | 63.47 | 86.42 | 37.07
4 | 482,680,800 | 100.00 | 0.00 | 0.00 | 312,679,038 | 94,429 | 64.78 | 87.17 | 37.22
5 | 482,680,800 | 100.00 | 0.00 | 0.00 | 269,671,083 | 81,441 | 55.87 | 84.35 | 36.42
6 | 482,680,800 | 100.00 | 0.00 | 0.00 | 302,801,644 | 91,446 | 62.73 | 84.90 | 36.60
7 | 482,680,800 | 100.00 | 0.00 | 0.00 | 218,432,818 | 65,967 | 45.25 | 77.27 | 34.33
8 | 482,680,800 | 100.00 | 0.00 | 0.00 | 313,811,971 | 94,771 | 65.01 | 86.58 | 37.12
software error • 2.2k views
ADD COMMENTlink modified 3.5 years ago • written 3.6 years ago by BioRyder160

Is everything ending up in the Undetermined_*.fastq.gz files? That would explain the results. This would indicate that your sample sheet is incorrect.
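
A minimal sketch of that check, assuming the usual bcl2fastq v2 output naming (<sample>_S<n>_L<lane>_R<read>_001.fastq.gz). The function names and the regular expression are illustrative, not part of bcl2fastq. It lists FASTQ sizes per lane, Undetermined files included, so zero-byte files stand out:

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumed bcl2fastq v2 naming: <sample>_S<n>_L<lane>_R<read>_001.fastq.gz
LANE_RE = re.compile(r"_L(\d{3})_[RI]\d_\d{3}\.fastq\.gz$")


def fastq_sizes_by_lane(output_dir):
    """Map lane number -> list of (file name, size in bytes) under output_dir."""
    sizes = defaultdict(list)
    for path in sorted(Path(output_dir).rglob("*.fastq.gz")):
        match = LANE_RE.search(path.name)
        lane = int(match.group(1)) if match else None  # None: no lane in the name
        sizes[lane].append((path.name, path.stat().st_size))
    return dict(sizes)


def empty_fastqs(output_dir):
    """Return the names of FASTQ files that are exactly zero bytes."""
    return [name
            for files in fastq_sizes_by_lane(output_dir).values()
            for name, size in files
            if size == 0]
```

Running empty_fastqs() on the conversion output folder would show at a glance which lanes produced empty files and whether any Undetermined files exist at all.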

ADD REPLYlink written 3.6 years ago by Devon Ryan90k

Hello Devon Ryan,

There are no Undetermined_*.fastq.gz files in the output directory, because I did not specify any index in SampleSheet.csv for the BCL-to-FASTQ conversion.

Below is the sample sheet.

[Header],,,,,,,,
IEMFileVersion,4,,,,,,,
Date,1/12/15,,,,,,,
Workflow,GenerateFASTQ,,,,,,,
Application,HISeq FASTQ Only,,,,,,,
Assay,,,,,,,,
Description,,,,,,,,
Chemistry,Default,,,,,,,
,,,,,,,,
[Reads],,,,,,,,
151,,,,,,,,
151,,,,,,,,
,,,,,,,,
[Settings],,,,,,,,
ReverseComplement,0,,,,,,,
Adapter,AGATCGGAAGAGCACACGTCTGAACTCCAGTCA,,,,,,,
AdapterRead2,AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT,,,,,,,
,,,,,,,,
[Data],,,,,,,,
Lane,Sample_ID,Sample_Name,Sample_Plate,Sample_Well,I7_Index_ID,index,Sample_Project,Description
1,PhiX_Sample_Stock,PhiX_Sample_Stock,,,,,PhiX,
2,Spix_44G,Spix_44G,,,,,Spix_Macaw,
3,Spix_45G,Spix_45G,,,,,Spix_Macaw,
4,PhiX_Sample_Stock,PhiX_Sample_Stock,,,,,PhiX,
5,Spix_73G,Spix_73G,,,,,Spix_Macaw,
6,Spix_74G,Spix_74G,,,,,Spix_Macaw,
7,Spix_95G,Spix_95G,,,,,Spix_Macaw,
8,Spix_109G,Spix_109G,,,,,Spix_Macaw,

 

ADD REPLYlink written 3.6 years ago by BioRyder160

Presumably it's hitting an error; check the log file.

ADD REPLYlink written 3.6 years ago by Devon Ryan90k

We haven't switched to v2, but v1 would exhibit this behavior if there were missing .bcl or .stats files. You can add the relevant flags (in v1, I believe it's '--ignore-missing-stats --ignore-missing-bcl') and try again.
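
Before reaching for those flags, it may help to see whether any BCL files are actually missing. The sketch below assumes a HiSeq-style run folder in which base calls live under Data/Intensities/BaseCalls/L00<lane>/C<cycle>.1/ as .bcl or .bcl.gz files; that layout and the function names are assumptions, so adjust them to the instrument at hand:

```python
from pathlib import Path


def bcl_counts(basecalls_dir):
    """Map (lane directory, cycle directory) -> number of .bcl/.bcl.gz files."""
    counts = {}
    for lane_dir in sorted(Path(basecalls_dir).glob("L[0-9][0-9][0-9]")):
        for cycle_dir in sorted(lane_dir.glob("C*.1")):
            n = sum(1 for f in cycle_dir.iterdir()
                    if f.name.endswith((".bcl", ".bcl.gz")))
            counts[(lane_dir.name, cycle_dir.name)] = n
    return counts


def suspicious_cycles(basecalls_dir):
    """Cycles holding fewer BCL files than the fullest cycle of the same lane."""
    counts = bcl_counts(basecalls_dir)
    best = {}
    for (lane, _), n in counts.items():
        best[lane] = max(best.get(lane, 0), n)
    return sorted(key for key, n in counts.items() if n < best[key[0]])
```

An empty result from suspicious_cycles() suggests every cycle has the same number of tiles, in which case missing input files are an unlikely explanation.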

ADD REPLYlink written 3.6 years ago by harold.smith.tarheel4.3k
BioRyder160 wrote:

Hi All,

We have identified the problem. It was caused by the file system on the Linux server. bcl2fastq v2 works properly on an XFS file system, but on a GPFS file system it generates partial output, with R1 or both R1 and R2 missing. We have contacted Illumina and reported this. They are checking the bcl2fastq issue with GPFS internally.
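
For anyone who wants to check which file system an output directory sits on, here is a small Linux-only sketch that reads a mount table in /proc/mounts format. The function name is made up for illustration, the reported type string for GPFS ("gpfs") is an assumption, and mount points containing spaces are not handled:

```python
import os


def fs_type_for_path(path, mounts_text):
    """Return the file system type of the mount that contains `path`.

    `mounts_text` is the content of /proc/mounts: one mount per line,
    with fields "device mount_point fs_type options dump pass".
    """
    path = os.path.abspath(path)
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        # Keep the longest mount point that is a path prefix of `path`.
        if (path + "/").startswith(prefix) and len(mount_point) >= len(best_mount):
            best_mount, best_type = mount_point, fs_type
    return best_type
```

Typical use would be fs_type_for_path(output_dir, open("/proc/mounts").read()), run on the compute node that executes the conversion rather than on a login node.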

ADD COMMENTlink modified 3.4 years ago • written 3.5 years ago by BioRyder160

Wow, that's kind of crazy. Thanks for reporting back!

ADD REPLYlink written 3.5 years ago by Devon Ryan90k

What kind of hardware are you using gpfs on? Is this on a cluster that uses a job scheduler?

ADD REPLYlink written 3.5 years ago by genomax67k

It is on a cluster that uses the SLURM scheduler.

ADD REPLYlink written 3.5 years ago by BioRyder160

GPFS is not our favorite either but we have not had gross issues like missing files in bulk. It sounds like bcl2fastq/SLURM both think that the jobs are completing properly but the files are missing on storage. Is the storage hardware fully patched/has latest firmware? Have you noticed this problem with other software/processes?
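
Since the job can report success while files are incomplete, one possible post-run check is to count the records in each FASTQ file and compare them with the PF cluster count from the report. This is only a sketch: the function names are hypothetical, and it assumes four lines per record and that, with no index in the sample sheet, each lane's R1 file should hold one read per PF cluster:

```python
import gzip


def count_reads(fastq_gz_path):
    """Count records in a gzipped FASTQ file, assuming four lines per record.

    A truncated gzip stream may raise EOFError, which is itself a sign
    that the file was not written completely.
    """
    lines = 0
    with gzip.open(fastq_gz_path, "rt") as handle:
        for _ in handle:
            lines += 1
    return lines // 4


def matches_report(fastq_gz_path, expected_pf_clusters):
    """True if the file holds exactly the number of reads the report claims."""
    return count_reads(fastq_gz_path) == expected_pf_clusters
```

For lane 1 in the report above, the expected value would be 315,981,885. Counting that many reads takes a while, so this is better run as its own job after the conversion finishes.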

ADD REPLYlink written 3.5 years ago by genomax67k
BioRyder160 wrote:

Hello All,

Below is the reply from Illumina regarding the GPFS file system problem described above. I hope it will be helpful for others.

"As far as I understand GPFS, while supported by CentOS, is not a default FS and the OS needs to be reconfigured to use it. Illumina development teams use CentOS for the development and validation of our Linux based software, though only with standard settings. I will however make some internal enquiries to see whether the GPFS file system has been tested. Having made these enquiries I can confirm that we have had reports that GPFS does not handle BCL2FASTQ processes very well. This is due to the sheer number of small files that need to be loaded and processed."

ADD COMMENTlink written 3.5 years ago by BioRyder160