Looking for a CWL script that runs FastQC to produce quality control files, then uses the folder containing those files as input for MultiQC
4.5 years ago

I am new to CWL. I am looking for a CWL script that runs FastQC to produce quality control files, then uses the folder containing those files as input for MultiQC to get the summary files. Can anyone help me with this?

Thank you

CWL • 1.4k views

You should attempt this yourself and then show us what you tried.

The purpose of the forum is not simply to do people's work for them, or to give them code. If we just gave you a script, you wouldn't learn anything for yourself.


I tried the following code and got some errors, so I need a working example to figure out how to put two CommandLineTool tools together.

Thank you,

cwlVersion: v1.0
class: Workflow

requirements:
  SubworkflowFeatureRequirement: {}
  ScatterFeatureRequirement: {}
  StepInputExpressionRequirement: {}
  InlineJavascriptRequirement: {}

inputs:
  fastq1:
    type: File
  fastq2:
    type: File?

outputs:
  type: array
  items: File
  outputSource: multiqc/outs

steps:
  qc_raw:
    doc: fastqc - quality control for trimmed fastq
    run: "fastqc.cwl"
    in:
      fastq1:
        source: fastq1
      fastq2:
        source: fastq2
    out:
      - fastqc_zip
      - fastqc_html
  multiqc:
    doc: run multiqc
    run: "multiqc.cwl"
    in:
      source: qc_raw/{fastqc_zip,fastqc_html}
    out: [outs]

Here is fastqc.cwl:
cwlVersion: v1.0
class: CommandLineTool
requirements:
  InlineJavascriptRequirement: {}
  StepInputExpressionRequirement: {}
hints:
  ResourceRequirement:
    coresMin: 1
    ramMin: 5000
  DockerRequirement:
    dockerPull: kerstenbreuer/trim_galore:0.4.4_1.14_0.11.7

baseCommand: "fastqc"
arguments: 
  - valueFrom: $(runtime.outdir)
    prefix: "-o"
    # specifies output directory
  - valueFrom: "--noextract"
    # reported file will be zipped

inputs:
  fastq1:
    type: File?
    inputBinding:
      position: 1
  fastq2:
    type: File?
    inputBinding:
      position: 2
  bam:
    type: File?
    inputBinding:
      position: 1

outputs:
  fastqc_zip:
    doc: all data e.g. figures
    type:
      type: array
      items: File
    outputBinding:
      glob: "*_fastqc.zip"
  fastqc_html:
    doc: html report showing results from zip
    type:
      type: array
      items: File
    outputBinding:
      glob: "*_fastqc.html"

Here is multiqc.cwl:
cwlVersion: v1.0
class: CommandLineTool
requirements:
  InlineJavascriptRequirement: {}
  StepInputExpressionRequirement: {}
  InitialWorkDirRequirement:
    # This step is necessary since the input files
    # must be loaded into the working directory as there
    # is no way to specify the input file directly on the
    # command line.
    listing: |
      ${
        var qc_files_array = inputs.qc_files_array;
        var qc_files_array_of_array = inputs.qc_files_array_of_array;
        var output_array = [];

        if ( qc_files_array != null ){
          for (var i=0; i<qc_files_array.length; i++){
            output_array.push(qc_files_array[i])
          }
        }

        if ( qc_files_array_of_array != null ){
          for (var i=0; i<qc_files_array_of_array.length; i++){
            for (var ii=0; ii<qc_files_array_of_array[i].length; ii++){
              output_array.push(qc_files_array_of_array[i][ii])
            }
          }
        }

        return output_array
      }

hints:
  ResourceRequirement:
    coresMin: 1
    ramMin: 10000
  DockerRequirement:
    dockerPull: kerstenbreuer/multiqc:1.7

baseCommand: ["multiqc"]
arguments:
  - valueFrom: --zip-data-dir
    position: 1
  - valueFrom: "'log_filesize_limit: 100000000'"
    position: 1
    prefix: --cl_config
  - valueFrom: $(runtime.outdir)
    position: 2
    prefix: --outdir
  - valueFrom: $(runtime.outdir)
    position: 4

inputs:
  qc_files_array:
    doc: |
      qc files which shall be part of the multiqc summary;
      optional, only one of qc_files_array or qc_files_array_of_array
      must be provided
    type:
      - "null"
      - type: array
        items: File
  qc_files_array_of_array:
    doc: |
      qc files which shall be part of the multiqc summary;
      optional, only one of qc_files_array or qc_files_array_of_array
      must be provided
    type:
      - "null"
      - type: array
        items:
          type: array
          items: File
  report_name:
    doc: name used for the html report and the corresponding zip file
    type: string
    default: multiqc
    inputBinding:
      valueFrom: $(self + "_report")
      prefix: --filename
      position: 3

outputs:
  multiqc_zip:
    type: File
    outputBinding:
      glob: $(inputs.report_name + "_report_data.zip")
  multiqc_html:
    type: File
    outputBinding:
      glob: $(inputs.report_name + "_report.html")
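
For comparison, here is a minimal sketch of how the two tools might be wired together. It assumes the files are saved as fastqc.cwl and multiqc.cwl next to the workflow and keeps the input and output names those tools declare. The workflow-level output ids are illustrative choices. The two FastQC output arrays are merged into `qc_files_array` with `linkMerge: merge_flattened`, which needs `MultipleInputFeatureRequirement`. Treat it as an untested starting point rather than a verified pipeline.

```yaml
cwlVersion: v1.0
class: Workflow

requirements:
  # Needed because the multiqc step input lists more than one source.
  MultipleInputFeatureRequirement: {}

inputs:
  fastq1:
    type: File
  fastq2:
    type: File?

outputs:
  # Each workflow output needs its own id, and outputSource must
  # point at an output that the step actually declares.
  multiqc_zip:
    type: File
    outputSource: multiqc/multiqc_zip
  multiqc_html:
    type: File
    outputSource: multiqc/multiqc_html

steps:
  qc_raw:
    run: fastqc.cwl
    in:
      fastq1: fastq1
      fastq2: fastq2
    out: [fastqc_zip, fastqc_html]

  multiqc:
    run: multiqc.cwl
    in:
      # Flatten the two File[] outputs of FastQC into a single File[].
      qc_files_array:
        source: [qc_raw/fastqc_zip, qc_raw/fastqc_html]
        linkMerge: merge_flattened
    out: [multiqc_zip, multiqc_html]
```

Compared with the attempt above, the main differences are that every step input is keyed by the tool's input name, and the workflow outputs refer to `multiqc_zip` and `multiqc_html` instead of an `outs` output that multiqc.cwl does not define.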

4.5 years ago
Phil Ewels ★ 1.4k
# This step is necessary since the input files
# must be loaded into the working directory as there
# is no way to specify the input file directly on the
# command line.

Note that you can give specific file paths to MultiQC if you want to; it doesn't _have_ to be a directory. The following should work fine:

multiqc /path/to/sample_1_fastqc.zip /path/to/sample_2_fastqc.zip

If there are a large number of files, you can also use the --file-list option. See https://multiqc.info/docs/#choosing-where-to-scan
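
In CWL terms, passing the files directly could look roughly like the sketch below, which drops the InitialWorkDirRequirement staging step. The input name `qc_files` is hypothetical, the Docker image is the one from the post above, and the output glob assumes MultiQC's default report name because no `--filename` is given.

```yaml
cwlVersion: v1.0
class: CommandLineTool

hints:
  DockerRequirement:
    dockerPull: kerstenbreuer/multiqc:1.7  # image taken from the post above

baseCommand: multiqc
arguments:
  # Write the report into the job's output directory.
  - prefix: --outdir
    valueFrom: $(runtime.outdir)

inputs:
  qc_files:  # hypothetical input name
    type: File[]
    inputBinding:
      position: 1  # each file becomes its own positional argument

outputs:
  multiqc_html:
    type: File
    outputBinding:
      glob: multiqc_report.html  # MultiQC's default report name
```

With two FastQC archives as input, this should build a command line of the same shape as the `multiqc /path/to/sample_1_fastqc.zip /path/to/sample_2_fastqc.zip` example, with the paths supplied by the CWL runner.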
