Question: Is there a standard YAML file to describe Illumina runs (sample_name, barcode, lane, flowcell)?
2.3 years ago by 14134125465346445 (3.4k), United Kingdom, wrote:

Is there a format to describe sample names and their associated flowcell(s), lane(s) and barcode(s) from Illumina sequencing experiments?

The Illumina documentation describes the following notation for multiplex and non-multiplexed runs:

Naming
Illumina FASTQ files use the following naming scheme:
<sample name>_<barcode sequence>_L<lane (0-padded to 3 digits)>_R<read number>_<set number (0-padded to 3 digits)>.fastq.gz
For example, the following is a valid FASTQ file name:
NA10831_ATCACG_L002_R1_001.fastq.gz
In the case of non-multiplexed runs, <sample name> will be replaced with the lane numbers (lane1, lane2, ..., lane8) and <barcode sequence> will be replaced with "NoIndex".
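To make the naming scheme concrete, here is a rough sketch of how such file names could be split back into their fields. This is my own illustration, not something from the Illumina documentation: the names `FastqName` and `parse_fastq_name` are made up, and accepting a hyphen-separated pair of barcodes for dual-indexed runs is an assumption.

```python
import re
from typing import NamedTuple, Optional


class FastqName(NamedTuple):
    """Fields encoded in an Illumina FASTQ file name (hypothetical container)."""
    sample: str
    barcode: str
    lane: int
    read: int
    set_number: int


# <sample name>_<barcode sequence>_L<lane>_R<read number>_<set number>.fastq.gz
# Assumption: a barcode is either "NoIndex" or one or two A/C/G/T/N strings
# joined by a hyphen (the hyphenated form is a guess for dual indexing).
_PATTERN = re.compile(
    r"^(?P<sample>.+)_(?P<barcode>NoIndex|[ACGTN]+(?:-[ACGTN]+)?)"
    r"_L(?P<lane>\d{3})_R(?P<read>\d+)_(?P<set>\d{3})\.fastq\.gz$"
)


def parse_fastq_name(filename: str) -> Optional[FastqName]:
    """Return the parsed fields, or None if the name does not fit the scheme."""
    match = _PATTERN.match(filename)
    if match is None:
        return None
    return FastqName(
        sample=match["sample"],
        barcode=match["barcode"],
        lane=int(match["lane"]),
        read=int(match["read"]),
        set_number=int(match["set"]),
    )


if __name__ == "__main__":
    print(parse_fastq_name("NA10831_ATCACG_L002_R1_001.fastq.gz"))
    print(parse_fastq_name("lane1_NoIndex_L001_R1_001.fastq.gz"))
```

Note that the flowcell ID is not part of the file name at all, which is one reason a separate description file is needed.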

I have also seen that bcbio has some code and example YAML files that describe part of this, and SciLifeLab seems to have adopted it:

http://bcbio-nextgen.readthedocs.io/en/latest/contents/configuration.html
https://github.com/SciLifeLab/scilifelab/blob/e5f4be45e2e9ff6c0756be46ad34dfb7d20a4b4a/scilifelab/bcbio/flowcell.py

What I am looking for is a standard, or something close to one, that people have adopted for this.

Does anything like this exist? Does the Common Workflow Language (CWL) deal with this? Galaxy? Genologics?

barcode lane flowcell illumina run • 974 views
2.3 years ago by Pierre Lindenbaum (112k), France/Nantes/Institut du Thorax - INSERM UMR1087, wrote:

Illumina CASAVA generates some XML files after demultiplexing:

<DemultiplexConfig>
  <Software Version="CASAVA-1.8.2" CmdAndArgs="...">
  <FlowcellInfo ID="C3FGGACXX" Operator="x" Recipe="" Desc="">
    <Lane Number="1">
      <Sample ProjectId="P1" Control="N" Index="CTTGTA" SampleId="S1"  />
      <Sample ProjectId="P2" Control="N" Index="GCCAAT" SampleId="S1" />
(...)
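As a minimal sketch of how that XML could be flattened into the sample_name / barcode / lane / flowcell rows asked about in the question, something like the following should work with the Python standard library. The function name `iter_samples` and the output keys are hypothetical, and the example document is my own completion of the truncated snippet above (the closing tags and the self-closed `Software` element are assumptions), so a real file may be laid out differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed completion of the truncated snippet above.
# Assumption: how the elements are closed; the real file may nest differently.
EXAMPLE_XML = """\
<DemultiplexConfig>
  <Software Version="CASAVA-1.8.2" CmdAndArgs="..."/>
  <FlowcellInfo ID="C3FGGACXX" Operator="x" Recipe="" Desc="">
    <Lane Number="1">
      <Sample ProjectId="P1" Control="N" Index="CTTGTA" SampleId="S1" />
      <Sample ProjectId="P2" Control="N" Index="GCCAAT" SampleId="S1" />
    </Lane>
  </FlowcellInfo>
</DemultiplexConfig>
"""


def iter_samples(xml_text):
    """Yield one dict per <Sample> with its flowcell, lane and barcode."""
    root = ET.fromstring(xml_text)
    # iter() searches at any depth, so the exact nesting matters less.
    for flowcell in root.iter("FlowcellInfo"):
        for lane in flowcell.iter("Lane"):
            for sample in lane.iter("Sample"):
                yield {
                    "flowcell": flowcell.get("ID"),
                    "lane": int(lane.get("Number")),
                    "project": sample.get("ProjectId"),
                    "sample_name": sample.get("SampleId"),
                    "barcode": sample.get("Index"),
                }


if __name__ == "__main__":
    for row in iter_samples(EXAMPLE_XML):
        print(row)
```

The resulting list of dicts could then be written out as YAML, JSON or a CSV sample sheet, whichever your pipeline expects.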