Question: CWL join two dimensional file array output
silverspanch wrote (7 months ago):

I have a CWL workflow that scatters a list of variables over a nested workflow which returns a file array. The output is then an array of file arrays, but it needs to be flattened for the next step. How do I gather the inputs from one step, and join the multidimensional file array into a single flat array?

Tags: scatter, cwl, workflow

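Did you consider using StepInputExpressionRequirement (https://www.commonwl.org/v1.0/Workflow.html#StepInputExpressionRequirement)? You can set a valueFrom on the step input and transform it with a JavaScript expression. For example, $(self[0]) turns [ [ 1, 2 ] ] into [ 1, 2 ]. Note that self[0] only picks the first inner array, so it flattens the input only when the outer array has a single element.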

(comment by bogdan.gavrilovic, 6 months ago)
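
To make the difference concrete, here is a minimal sketch of the two expressions as plain JavaScript (ES5, the dialect CWL expressions use), runnable under Node.js. The function names unwrapFirst and flattenOnce are hypothetical and only for illustration; `self` stands in for the value CWL hands to valueFrom, assumed here to be an array of arrays.

```javascript
// `self` stands in for the step input value that CWL passes to valueFrom.
// Both function names are hypothetical, for illustration only.

// Equivalent of $(self[0]): returns only the first inner array.
function unwrapFirst(self) {
  return self[0];
}

// Flattens one level of nesting by concatenating all inner arrays.
function flattenOnce(self) {
  return [].concat.apply([], self);
}

console.log(JSON.stringify(unwrapFirst([[1, 2]])));       // [1,2]
console.log(JSON.stringify(unwrapFirst([[1, 2], [3]])));  // [1,2]  (the [3] is dropped)
console.log(JSON.stringify(flattenOnce([[1, 2], [3]])));  // [1,2,3]
```

In a workflow, the body of flattenOnce would presumably be written as valueFrom: $([].concat.apply([], self)), with StepInputExpressionRequirement and InlineJavascriptRequirement declared. Whether cwltool's type checker accepts a nested source on a File[] input in that setup is not confirmed in this thread.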

Thanks, I didn't know about this functionality.

(reply by silverspanch, 6 months ago)

Tom (Bielefeld University, CeBiTec, Germany) wrote (6 months ago):

I believe there is no built-in functionality for what you are asking.

I would suggest adding an ExpressionTool between the two workflow steps. I just tested the following and it seems to work:

 [in the steps section of your workflow]
   arrayBusiness:
    run:
      class: ExpressionTool
      # The ${...} expression needs this unless the enclosing workflow already declares it.
      requirements:
        InlineJavascriptRequirement: {}
      inputs:
        # Array of File arrays, i.e. the gathered output of the scattered step.
        arrayTwoDim:
          type:
            type: array
            items:
              type: array
              items: File
      outputs:
        array1d:
          type: File[]
      # Copy every File from each inner array into one flat array.
      expression: |
        ${
          var newArray = [];
          for (var i = 0; i < inputs.arrayTwoDim.length; i++) {
            for (var k = 0; k < inputs.arrayTwoDim[i].length; k++) {
              newArray.push(inputs.arrayTwoDim[i][k]);
            }
          }
          return { 'array1d': newArray };
        }
    in:
      arrayTwoDim: make2dArray/array2d
    out: [array1d]
   [...]
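
The expression body can also be checked outside of cwltool. The following is a minimal sketch, assuming Node.js is available: the wrapper name runExpression and the sample file paths are hypothetical, and the File objects are reduced to the two fields needed for the demonstration.

```javascript
// Hypothetical harness: wraps the ExpressionTool body in a function
// so that it can be called with a hand-built `inputs` object.
function runExpression(inputs) {
  var newArray = [];
  for (var i = 0; i < inputs.arrayTwoDim.length; i++) {
    for (var k = 0; k < inputs.arrayTwoDim[i].length; k++) {
      newArray.push(inputs.arrayTwoDim[i][k]);
    }
  }
  return { 'array1d': newArray };
}

// Made-up File objects; the paths are placeholders, not taken from the thread.
var sampleInputs = {
  arrayTwoDim: [
    [{ 'class': 'File', path: 'a.txt' }, { 'class': 'File', path: 'b.txt' }],
    [{ 'class': 'File', path: 'c.txt' }]
  ]
};

var result = runExpression(sampleInputs);
console.log(result.array1d.map(function (f) { return f.path; }).join(','));
// prints a.txt,b.txt,c.txt
```

The order of the files is preserved: inner arrays are appended one after another in the order the scatter produced them.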

It took me a while to figure out how CWL wants its nested arrays described. Make sure to set the type of the output of the previous step to:

type:
  type: array
  items:
    type: array
    items: File

Otherwise cwltool will reject the workflow because the output type is not compatible with the input of the ExpressionTool above. If you declare type: File[] on the connection between the workflow steps instead, cwltool will notice at runtime that it is actually passing a nested array to a step whose input is supposed to be a flat array of files, and abort.


Thanks Tom, this worked great! I haven't used the ExpressionTool class before, but this is exactly what I was looking for.

(reply by silverspanch, 6 months ago)