CWL: How to parse output of steps in workflow
7.0 years ago

I have a workflow that starts with untarring a file into a directory. In the second step, I need one file from that directory as parameter --genomeDir for my tool.

Currently this is solved like so:

Workflow snippet:

  star:
    run: ../tools/STAR.cwl
    in:
      index: tar/output
      fastq: [TUMOR_FASTQ_1, TUMOR_FASTQ_2]
    out: [output]

Tool snippet:

inputs:
  index:
    type: Directory

(...)

arguments:
  - valueFrom: $(inputs.index.path + "/ref_genome.fa.star.idx")
    position: 0
    prefix: --genomeDir

This works, but seems unnecessarily complex, and it also puts a prefix in the arguments. How do I pass the file directly from the workflow? I'm thinking something along the lines of

index: $((tar/output).path + "/ref_genome.fa.star.idx")

And while we're at it, what if I did want to parse this inside the tool cwl, how do I do it in the inputs?

cwl workflow • 2.9k views
7.0 years ago

Hello jeltje.van.baren,

Thank you for your question. This is a valid need that isn't well met in CWL 1.0.

There are at least two other options which allow you to keep your STAR CWL description ignorant of the structure of your TAR archive (which I agree is a good idea):

  1. Have your untar step output specific files, not just a whole directory

  2. Use an expression tool to pull out the file you need, as demonstrated by Michael Kotliar in chat: https://github.com/SciDAP/workflows/blob/master/expressiontools/get-file-by-name.cwl
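
For illustration, here is a rough sketch of what option 2 could look like. This is not the linked get-file-by-name.cwl, just a minimal ExpressionTool written under a few assumptions: the input names `dir` and `name` and the output name `file` are made up, and the runner is assumed to populate `listing` on the input directory (cwltool does this for v1.0 documents; on v1.1 and later you would likely need `loadListing: shallow_listing` on the input).

```yaml
cwlVersion: v1.0
class: ExpressionTool

requirements:
  InlineJavascriptRequirement: {}

inputs:
  dir:
    type: Directory   # hypothetical name: the untarred directory
  name:
    type: string      # hypothetical name: basename of the file to select

outputs:
  file:
    type: File        # hypothetical name: the selected file

expression: |
  ${
    // Assumes the runner has filled in inputs.dir.listing.
    var matches = inputs.dir.listing.filter(function (entry) {
      return entry.class === "File" && entry.basename === inputs.name;
    });
    if (matches.length !== 1) {
      throw "Expected exactly one file named " + inputs.name +
            ", found " + matches.length;
    }
    // An ExpressionTool returns an object keyed by output name.
    return {"file": matches[0]};
  }
```

In the workflow, a step running this tool would take tar/output as `dir`, and its `file` output would then feed the STAR step, so STAR.cwl never needs to know the layout of the archive.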

I've made a proposal for some enhanced syntax to make this easier in future versions of CWL -- likely after the v1.1 release: https://github.com/common-workflow-language/common-workflow-language/issues/430

To answer your other question about not mixing arguments and inputs, put the prefix and the path expression in the input's own inputBinding:

inputs:
  genomeDir:
    type: Directory
    inputBinding:
      valueFrom: $(self.path)/ref_genome.fa.star.idx
      position: 0  # FYI: if there is a prefix, a position is often unnecessary
      prefix: --genomeDir
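
Note that the input is called genomeDir here rather than index, so the workflow step would need to use the same name. A sketch of the matching step, assuming the rest of the step from the question stays as it was:

```yaml
  star:
    run: ../tools/STAR.cwl
    in:
      # Must match the input name declared in the tool (genomeDir above).
      genomeDir: tar/output
      fastq: [TUMOR_FASTQ_1, TUMOR_FASTQ_2]
    out: [output]
```

With this in place the arguments entry for --genomeDir can be dropped from the tool.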
7.0 years ago

This last solution is perfect for my particular problem, thanks!

I fully agree on not untarring whole directories if you need a single file, but this is for a DREAM challenge and I'm only allowed one index.tar.gz for the full workflow. Other steps need other files.

Looking forward to the v1.1 release.


Great to hear!

FYI: You can have multiple outputs in your untarring step that give names to each of your input files:

[…]

outputs:
  A:
    type: File
    outputBinding:
      glob: fileA.txt
  B:
    type: File
    outputBinding:
      glob: otherfile.csv
  C:
    type: File
    outputBinding:
      glob: important.txt
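
Downstream steps can then pick up each named output separately. A sketch of how that might look in the workflow; the step names, tool paths, and the input names `archive`, `index_tar`, `reference`, and `table` are all hypothetical:

```yaml
steps:
  untar:
    run: ../tools/untar.cwl      # hypothetical path to the untar tool
    in:
      archive: index_tar         # hypothetical workflow-level input
    out: [A, B, C]               # the named outputs declared above

  first_tool:
    run: ../tools/first.cwl      # hypothetical tool
    in:
      reference: untar/A         # only fileA.txt is passed to this step
    out: [output]

  second_tool:
    run: ../tools/second.cwl     # hypothetical tool
    in:
      table: untar/B             # only otherfile.csv is passed to this step
    out: [output]
```

This keeps the single index.tar.gz as the only workflow input while each step sees just the file it needs.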