Staging subdirectories in CWL
5.9 years ago
biokcb ▴ 170

Hi,

Is it possible to create/stage directories via CWL? Say I have a tool that requires its output directory to exist before it can run, but the directory names should be inferred automatically from the samples being processed. What is the best way to go about this in CWL?

  1. Something built into CWL such as InitialWorkDirRequirement? I tried this but couldn't get it to create the directory; I was just told that it does not exist.
  2. Running mkdir as a separate step in the workflow to create the directories?

If it is possible to do within CWL (1), then does someone have a good example they can point me to? Thanks!


Yes, this is possible. Here's the tutorial for InitialWorkDirRequirement: http://www.commonwl.org/user_guide/15-staging/

I think you are asking to create one or more directories to enclose each sample? If you could be more specific about what you need to do, that would help us help you :-)

Here is a very complex example where many input Files are collected, renamed, and placed into a directory hierarchy:

https://github.com/EBI-Metagenomics/ebi-metagenomics-cwl/blob/master/workflows/convert-to-v3-layout.cwl

Permalink from today: https://github.com/EBI-Metagenomics/ebi-metagenomics-cwl/blob/886df9de6713e06228d2560c40f451155a196383/workflows/convert-to-v3-layout.cwl

But there is a good chance you won't need something that complex.


Yes, your interpretation is correct! Sorry, I was trying not to overcomplicate my question. I tried using InitialWorkDirRequirement like this:

#!/usr/bin/env cwl-runner

cwlVersion: v1.0
class: CommandLineTool

baseCommand: sample_tool.sh

requirements:
 - class: InitialWorkDirRequirement
   listing:
     - entry: $(inputs.out_dir)
       writable: true

inputs:
  in_file:
    type: File
    inputBinding:
      position: 1
  out_dir:
    type: Directory
    inputBinding:
      position: 2

outputs:
  out_file:
    type: File
    outputBinding:
      glob: $(inputs.out_dir.basename)/$(inputs.in_file.nameroot).txt

[Errno 2] No such file or directory: /home/path/to/output

It just told me that the directories did not exist, which made me wonder whether InitialWorkDirRequirement can actually create directories for you or whether they must already exist in order to be "staged" alongside your inputs. The script itself requires the directories to exist before it runs, but my CWL tool, at least as written, requires that too. Is there a different way I need to set up InitialWorkDirRequirement so that the directories are created?


As an additional note, I've also tried this variation, but I think I am just misunderstanding how this works:

#!/usr/bin/env cwl-runner

cwlVersion: v1.0
class: CommandLineTool

requirements:
 - class: InlineJavascriptRequirement
 - class: InitialWorkDirRequirement
   listing:
    - entry: "$({class: 'Directory', listing: []})"
      entryname: $(inputs.out_dir) #doesn't work as just a string either, would like to determine from input though
      writable: true

baseCommand: sample_tool.sh

inputs:
  in_file:
    type: File
    inputBinding:
      position: 1

  out_dir:
    type: string
    inputBinding:
      position: 2

outputs:
  out_file:
    type: File
    outputBinding:
      glob: $(inputs.out_dir)/$(inputs.in_file.nameroot).txt

since I still get a cwltool.errors.WorkflowException: Expression evaluation error: [Errno 13] Permission denied

Where sample_tool.sh is just

#!/usr/bin/env bash

SAMPLE=$1
DIR=$2

touch $DIR/$SAMPLE
5.8 years ago
biokcb ▴ 170

For future readers of this post: I figured out where I went wrong, and I am posting it here in case it helps someone who wants to do this (#1 in my original post). My node module simply wasn't loaded, so the JS expression could not be evaluated, but the failure surfaced as a Permission denied error. The portion of my reply above, repeated below, does in fact work to create a new directory from a string input, $(inputs.out_dir).

Sorry for the confusion and thanks Michael Crusoe for the input. :)

requirements:
 - class: InlineJavascriptRequirement
 - class: InitialWorkDirRequirement
   listing:
    - entry: "$({class: 'Directory', listing: []})"
      entryname: $(inputs.out_dir) 
      writable: true
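
Since the failure above came from a missing JavaScript runtime rather than from the CWL itself, a quick pre-flight check can save some confusion. The sketch below assumes the runner looks for an executable named nodejs or node on PATH; that is an assumption about the runner, not something stated in this thread, and find_js_runtime is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the first of the given commands that is
# found on PATH, or return non-zero if none of them is available.
find_js_runtime() {
  for candidate in "$@"; do
    if command -v "$candidate" >/dev/null 2>&1; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# Assumption: the CWL runner needs one of these to evaluate $({...}) expressions.
if runtime=$(find_js_runtime nodejs node); then
  echo "JavaScript runtime available: $runtime"
else
  echo "No node/nodejs on PATH; CWL JavaScript expressions may fail to evaluate" >&2
fi
```

If the check reports nothing on PATH, loading the node module (or installing Node.js) before running the tool is the first thing to try.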

Are you able to post a working script in its entirety? I tried putting the above together, but it does not work for me.

I still get errors:

$ env PATH=`pwd`:$PATH cwl-runner io-redirection-cmd.cwl --in_file nicholas.txt --out_dir abc

/Users/u0079711/anaconda3/envs/rnaseq/bin/cwl-runner 1.0.20181217162649
Resolved 'io-redirection-cmd.cwl' to 'file:///Users/u0079711/projects/workflow-languages_git/CWL/io-redirection/take2/io-redirection-cmd.cwl'
[job io-redirection-cmd.cwl] /private/tmp/docker_tmp19X9Uk$ sample_tool.sh \
    /private/var/folders/qt/j80s9z1d4zxgfzty5kvc0hp0z_5035/T/tmpcQqP_M/stge4aa6191-78a1-4cc1-9bb4-fc52d124e4ba/nicholas.txt \
    abc
touch: abc//private/var/folders/qt/j80s9z1d4zxgfzty5kvc0hp0z_5035/T/tmpcQqP_M/stge4aa6191-78a1-4cc1-9bb4-fc52d124e4ba/nicholas.txt: No such file or directory
Could not collect memory usage, job ended before monitoring began.
[job io-redirection-cmd.cwl] Job error:
Error collecting output for parameter 'out_file':
io-redirection-cmd.cwl:31:7: Did not find output file with glob pattern: '['abc/nicholas.txt']'
[job io-redirection-cmd.cwl] completed permanentFail
{}
Final process status is permanentFail

env PATH=`pwd`:$PATH

will have no effect for most cwl-runners. If you're using cwltool try the following:

env PATH=`pwd`:$PATH cwltool --preserve-environment PATH io-redirection-cmd.cwl --in_file nicholas.txt --out_dir abc

or

env PATH=`pwd`:$PATH cwltool --preserve-entire-environment io-redirection-cmd.cwl --in_file nicholas.txt --out_dir abc
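
Separately, the touch error in the log above looks consistent with sample_tool.sh receiving the input as an absolute staged path, so that $DIR/$SAMPLE expands to abc//private/var/..., which does not exist. Here is a minimal sketch of the script that keeps only the file name and writes the .txt file the glob expects; the function name sample_tool and the exact naming rule are assumptions made for illustration, not the original author's script.

```shell
#!/usr/bin/env bash
# Sketch of sample_tool.sh: create <out_dir>/<input name without extension>.txt
# (assumed naming rule, chosen to match the glob in the CWL tool above).
sample_tool() {
  local sample=$1 dir=$2
  local name
  name=$(basename "$sample")   # drop the staged directory prefix
  name=${name%.*}              # drop the last extension, similar to CWL's nameroot
  touch "$dir/$name.txt"
}

# When invoked as: sample_tool.sh <in_file> <out_dir>
if [ "$#" -eq 2 ]; then
  sample_tool "$1" "$2"
fi
```

With the input nicholas.txt and out_dir abc, this would create abc/nicholas.txt, which is the path the glob pattern in the log is looking for.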