Question: CWL script to run FastQC, then use the folder of resulting quality control files as input for MultiQC
8 months ago by
aimin.at.work wrote:

I am new to CWL and am looking for a CWL script that runs FastQC to produce quality control files, then uses the folder containing these files as input for MultiQC to produce the summary files. Can anyone help me with this?

Thank you


You should attempt this yourself and then show us what you tried.

The purpose of the forum is not to simply do people's work for them, or to give them code. If we just gave you a script, you wouldn't learn anything for yourself.

written 8 months ago by Joe

I tried the following code and got some errors, so I need a working example to figure out how to put two CommandLineTool tools together.

Thank you,

cwlVersion: v1.0
class: Workflow

requirements:
  SubworkflowFeatureRequirement: {}
  ScatterFeatureRequirement: {}
  StepInputExpressionRequirement: {}
  InlineJavascriptRequirement: {}

inputs:
  fastq1:
    type: File
  fastq2:
    type: File?

outputs:
  type: array
  items: File
  outputSource: multiqc/outs

steps:
  qc_raw:
    doc: fastqc - quality control for trimmed fastq
    run: "fastqc.cwl"
    in:
      fastq1:
        source: fastq1
      fastq2:
        source: fastq2
    out:
      - fastqc_zip
      - fastqc_html
  multiqc:
    doc: run multiqc
    run: "multiqc.cwl"
    in:
      source: qc_raw/{fastqc_zip,fastqc_html}
    out: [outs]

# Here is fastqc.cwl
cwlVersion: v1.0
class: CommandLineTool
requirements:
  InlineJavascriptRequirement: {}
  StepInputExpressionRequirement: {}
hints:
  ResourceRequirement:
    coresMin: 1
    ramMin: 5000
  DockerRequirement:
    dockerPull: kerstenbreuer/trim_galore:0.4.4_1.14_0.11.7

baseCommand: "fastqc"
arguments: 
  - valueFrom: $(runtime.outdir)
    prefix: "-o"
    # specifies output directory
  - valueFrom: "--noextract"
    # reported file will be zipped

inputs:
  fastq1:
    type: File?
    inputBinding:
      position: 1
  fastq2:
    type: File?
    inputBinding:
      position: 2
  bam:
    type: File?
    inputBinding:
      position: 1

outputs:
  fastqc_zip:
    doc: all data e.g. figures
    type:
      type: array
      items: File
    outputBinding:
      glob: "*_fastqc.zip"
  fastqc_html:
    doc: html report showing results from zip
    type:
      type: array
      items: File
    outputBinding:
      glob: "*_fastqc.html"

# Here is multiqc.cwl
cwlVersion: v1.0
class: CommandLineTool
requirements:
  InlineJavascriptRequirement: {}
  StepInputExpressionRequirement: {}
  InitialWorkDirRequirement:
    # This step is necessary since the input files
    # must be loaded into the working directory as there
    # is no way to specify the input file directly on the
    # command line.
    listing: |
      ${
        var qc_files_array = inputs.qc_files_array;
        var qc_files_array_of_array = inputs.qc_files_array_of_array;
        var output_array = [];

        if ( qc_files_array != null ){
          for (var i=0; i<qc_files_array.length; i++){
            output_array.push(qc_files_array[i])
          }
        }

        if ( qc_files_array_of_array != null ){
          for (var i=0; i<qc_files_array_of_array.length; i++){
            for (var ii=0; ii<qc_files_array_of_array[i].length; ii++){
              output_array.push(qc_files_array_of_array[i][ii])
            }
          }
        }

        return output_array
      }

hints:
  ResourceRequirement:
    coresMin: 1
    ramMin: 10000
  DockerRequirement:
    dockerPull: kerstenbreuer/multiqc:1.7

baseCommand: ["multiqc"]
arguments:
  - valueFrom: --zip-data-dir
    position: 1
  - valueFrom: "'log_filesize_limit: 100000000'"
    position: 1
    prefix: --cl_config
  - valueFrom: $(runtime.outdir)
    position: 2
    prefix: --outdir
  - valueFrom: $(runtime.outdir)
    position: 4

inputs:
  qc_files_array:
    doc: |
      qc files which shall be part of the multiqc summary;
      optional, only one of qc_files_array or qc_files_array_of_array
      must be provided
    type:
      - "null"
      - type: array
        items: File
  qc_files_array_of_array:
    doc: |
      qc files which shall be part of the multiqc summary;
      optional, only one of qc_files_array or qc_files_array_of_array
      must be provided
    type:
      - "null"
      - type: array
        items:
          type: array
          items: File
  report_name:
    doc: name used for the html report and the corresponding zip file
    type: string
    default: multiqc
    inputBinding:
      valueFrom: $(self + "_report")
      prefix: --filename
      position: 3

outputs:
  multiqc_zip:
    type: File
    outputBinding:
      glob: $(inputs.report_name + "_report_data.zip")
  multiqc_html:
    type: File
    outputBinding:
      glob: $(inputs.report_name + "_report.html")

written 8 months ago by aimin.at.work
8 months ago by
Phil Ewels (Sweden / Stockholm / SciLifeLab) wrote:
# This step is necessary since the input files
# must be loaded into the working directory as there
# is no way to specify the input file directly on the
# command line.

Note that you can give specific file paths to MultiQC if you want to; it doesn't _have_ to be a directory. The following should work fine:

multiqc /path/to/sample_1_fastqc.zip /path/to/sample_2_fastqc.zip
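
In CWL terms, this means the InitialWorkDirRequirement staging step is optional: an array input with an inputBinding places each file path directly on the command line. Below is a minimal sketch, not a tested tool. It assumes MultiQC's default output names (multiqc_report.html and the multiqc_data directory) and reuses the Docker image from the post above; the file name multiqc_files.cwl and the input name qc_files are made up for illustration.

```yaml
# multiqc_files.cwl (hypothetical file name)
cwlVersion: v1.0
class: CommandLineTool

hints:
  DockerRequirement:
    dockerPull: kerstenbreuer/multiqc:1.7  # image taken from the post above

baseCommand: multiqc

arguments:
  # Write the report into the tool's output directory.
  - prefix: --outdir
    valueFrom: $(runtime.outdir)

inputs:
  qc_files:                # hypothetical input name
    doc: FastQC zip files to summarise
    type: File[]
    inputBinding:
      position: 1          # each array element becomes its own argument

outputs:
  multiqc_html:
    type: File
    outputBinding:
      glob: multiqc_report.html   # assumes MultiQC's default report name
  multiqc_data:
    type: Directory
    outputBinding:
      glob: multiqc_data          # assumes MultiQC's default data directory
```

The resulting command line has the same shape as the one above: multiqc --outdir <outdir> followed by one path per file.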

If there are a large number of files, you can also use the --file-list option. See https://multiqc.info/docs/#choosing-where-to-scan
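
On the original question of putting the two tools together: a step input cannot use shell-style brace expansion such as qc_raw/{fastqc_zip,fastqc_html}, and every workflow output needs its own name. CWL v1.0 instead accepts a list of sources that can be merged with linkMerge. The following is a sketch only. It assumes the fastqc.cwl and multiqc.cwl files posted above each run on their own, and it is wired to the input and output names those files declare.

```yaml
cwlVersion: v1.0
class: Workflow

requirements:
  # Needed because the multiqc step input lists more than one source.
  MultipleInputFeatureRequirement: {}

inputs:
  fastq1: File
  fastq2: File?

outputs:
  # Each output is a named entry that points at a step output.
  multiqc_html:
    type: File
    outputSource: multiqc/multiqc_html
  multiqc_zip:
    type: File
    outputSource: multiqc/multiqc_zip

steps:
  qc_raw:
    run: fastqc.cwl
    in:
      fastq1: fastq1
      fastq2: fastq2
    out: [fastqc_zip, fastqc_html]

  multiqc:
    run: multiqc.cwl
    in:
      qc_files_array:
        # Concatenate the two File[] outputs into a single File[].
        source: [qc_raw/fastqc_zip, qc_raw/fastqc_html]
        linkMerge: merge_flattened
    out: [multiqc_zip, multiqc_html]
```

MultiQC reads the FastQC results from the zip files, so passing only qc_raw/fastqc_zip as a single source should also be enough; in that case linkMerge and MultipleInputFeatureRequirement can be dropped.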
