Question: Perform one baseCommand for multiple files and get output for each file in CWL
5 weeks ago, mariafirulevabio wrote:

Dear all,

I have a Docker image, built from the following Dockerfile, that provides the gzip utility:

FROM alpine

RUN apk update && apk add gzip

ENTRYPOINT ["unzip"]

Suppose I have two fastq.gz files and I want to decompress them using my image.

cwlVersion: v1.0
class: CommandLineTool
label: "gzip wrapper"

baseCommand: [unzip, -p]

requirements:
  - class: InlineJavascriptRequirement
  - class: DockerRequirement
    dockerImageId: gzip_wrapper
    dockerFile: 
      $import: Dockerfile
inputs:
  forward:
    type: File
    inputBinding:
      position: 0
  reverse:
    type: File
    inputBinding:
      position: 1
  output_file_name: string?

outputs:
  extracted_fastq: stdout

How can I define stdout for this purpose? Can I run a given baseCommand on an array of files and receive an array of result files?

Many thanks.

Tags: pipeline, workflow, cwl
5 weeks ago, Tom (Bielefeld University, CeBiTec, Germany) wrote:

Hello fellow pipeline enthusiast!

To my knowledge, it is not possible to do what you ask using only a command line tool and stdout. Two alternative solutions come to mind:

  1. Change your command line tool so that it unzips and returns only a single file (or use this one). Embed that command line tool in a workflow and use scatter on the workflow step (WorkflowStepScatter) to scatter over an array of zipped files and receive an array of unzipped files.

  2. Let unzip output decompressed files instead of passing everything to stdout. Then return those files using glob.
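
Solution 2 could be sketched roughly as follows. This is an untested sketch under assumptions that are not from the thread: gunzip (GNU gzip 1.6 or newer, for the -k flag) is on the PATH, the inputs are named *.fastq.gz, and the names infiles and extracted are made up for illustration. The input files are staged into the working directory so that gunzip writes the decompressed copies there, and glob then collects them.

```yaml
cwlVersion: v1.0
class: CommandLineTool
label: "gunzip several files, collect the results with glob"

# -k keeps the staged inputs, -f lets gunzip read staged symlinks.
# Assumption: GNU gzip >= 1.6 is installed in the execution environment.
baseCommand: [gunzip, -k, -f]

requirements:
  # Stage the inputs into the working (output) directory, so the
  # decompressed files are written where glob can find them.
  - class: InitialWorkDirRequirement
    listing: $(inputs.infiles)

inputs:
  infiles:            # hypothetical input name
    type: File[]
    inputBinding:
      position: 1

outputs:
  extracted:          # hypothetical output name
    type: File[]
    outputBinding:
      # Assumption: every input is named <something>.fastq.gz
      glob: "*.fastq"
```

Note that this tool no longer uses stdout at all, which is why it can return more than one file.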

Personally, I would prefer solution number one. It seems more in line with the CWL ideal of keeping everything simple and modular.

Don't hesitate to ask further questions if there are any problems with implementing this.


Dear Tom,

Thanks for the advice.

I have decided to implement the first strategy from your list.

As a result, I have: 1) Unzip wrapper:

cwlVersion: v1.0
class: CommandLineTool
label: "unzip wrapper"

baseCommand: [gunzip, -c]
stdout: $(inputs.infile.nameroot)

requirements:
  - class: DockerRequirement
    dockerImageId: gzip_wrapper
    dockerFile: 
      $import: Dockerfile

inputs:
  infile:
    type: File
    inputBinding:
      prefix: --file

outputs:
  outfile:
    type: File
    outputBinding:
        glob: $(inputs.infile.nameroot)

2) Workflow:

#!/usr/bin/env cwl-runner

cwlVersion: v1.0
class: Workflow

requirements:
  -class: ScatterFeatureRequirement

inputs:
  in_files: File[]

outputs:
  compiled_class:
    type: File
    outputSource: unzip/outfile

steps:
  unzip:
    scatter: infile
    run: unzip.cwl
    in:
      infile: in_files
    out: [outfile]

3) YAML file:

in_files: 
  - input/test_R1.fastq.gz
  - input/test_R2.fastq.gz

Unfortunately, I received this error message:

INFO /home/maria/miniconda3/bin/cwltool 1.0.20190607183319
INFO Resolved 'workflow.cwl' to 'file:///home/maria/Bioinformatics/CWL/workflow.cwl'
ERROR Tool definition failed validation:
mapSubject '-class' value 'None' is not a dict and does not have a mapPredicate.

What's wrong with my implementation? Hope you can help me.

Best wishes, Maria.

Written 5 weeks ago by mariafirulevabio

Hi Maria,

The workflow is missing a space character between "-" and "class" in the requirements section. Also, since the unzip step uses scatter, it will return an array as output. For this reason the compiled_class entry in the outputs section of the workflow needs to be of type: File[].
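
With both changes applied, the workflow from the post above would look roughly like this. It is a sketch rather than a tested file: only the two commented lines differ from the original.

```yaml
#!/usr/bin/env cwl-runner

cwlVersion: v1.0
class: Workflow

requirements:
  - class: ScatterFeatureRequirement   # space added between "-" and "class"

inputs:
  in_files: File[]

outputs:
  compiled_class:
    type: File[]                       # scatter returns one File per input
    outputSource: unzip/outfile

steps:
  unzip:
    scatter: infile
    run: unzip.cwl
    in:
      infile: in_files
    out: [outfile]
```

If the runner then rejects the job file, it may be worth checking how the inputs are written: as far as I know, File inputs are normally given as objects with class: File and a path: entry rather than as bare strings, but that is not something reported in this thread.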

Cheers, Tom

Written 5 weeks ago by Tom

I'm grateful for your assistance, Tom!

I've edited my workflow following your remarks, and now it works. :) After that, I changed my final workflow implementation: the steps are unzip_forward, unzip_reverse and target_tool_for_unzipped_reads, and the inputs are gzipped files.
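
A layout along those lines might look like the sketch below. Only the three step names and unzip.cwl (with its infile and outfile ports) come from this thread. The file target_tool.cwl, its ports forward_reads, reverse_reads and result, and the workflow inputs forward_gz and reverse_gz are placeholder names chosen for illustration.

```yaml
cwlVersion: v1.0
class: Workflow

inputs:
  forward_gz: File        # hypothetical name: gzipped forward reads
  reverse_gz: File        # hypothetical name: gzipped reverse reads

outputs:
  result:
    type: File
    outputSource: target_tool_for_unzipped_reads/result

steps:
  unzip_forward:
    run: unzip.cwl
    in:
      infile: forward_gz
    out: [outfile]

  unzip_reverse:
    run: unzip.cwl
    in:
      infile: reverse_gz
    out: [outfile]

  target_tool_for_unzipped_reads:
    run: target_tool.cwl  # hypothetical downstream tool
    in:
      forward_reads: unzip_forward/outfile
      reverse_reads: unzip_reverse/outfile
    out: [result]
```

Because each read file gets its own step here, no scatter (and no ScatterFeatureRequirement) is needed, and every output stays a single File.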

Written 4 weeks ago by mariafirulevabio