Create writable directory inside container before executing baseCommand. How to properly use InitialWorkDirRequirement
4.5 years ago

Hello all

I'm using CWL v1.0 and trying to understand how to create a writable directory inside the container's output directory and then return it to a specific place on my computer. The question came up while trying to run STAR in a container from a CWL file.

The general structure of the command that I want to implement:

STAR --runMode genomeGenerate --genomeDir /some/path/to/output/folder [other parameters]


When I run my CWL file with genomeDir set to ./, it works fine, because cwltool mounts the output directory as writable and STAR can put its results there. I then use ./ as the glob in the output parameter and get all the data I need back from the container to my computer.

The part of the CWL file that describes how I set inputs and outputs:

<skipped lines>

inputs:
  genomeDir:
    type: string
    inputBinding:
      position: 1
      prefix: --genomeDir

<skipped lines>

outputs:
  indices:
    type: Directory
    outputBinding:
      glob: $(inputs.genomeDir)

But when I set genomeDir to any other folder, for example ./dm3, STAR gives me an error saying that I first need to create the dm3 folder. To explore the issue, I created a simple CWL file to understand how to solve my problem.

My CWL file (I followed the example at http://www.commonwl.org/v1.0/UserGuide.html#Creating_files_at_runtime):

cwlVersion: v1.0
class: CommandLineTool
hints:
  DockerRequirement:
    dockerPull: ubuntu
baseCommand: ls
arguments: ["-p"]
stdout: output.txt

requirements:
  InitialWorkDirRequirement:
    listing:
      - entryname: $(inputs.fileName)
        entry: Some text inside the file
      - class: Directory
        basename: folderName
        listing: []

inputs:
  dirName:
    type: string
  fileName:
    type: string

outputs:
  output:
    type: stdout
  fileOut:
    type: File
    outputBinding:
      glob: $(inputs.fileName)
  dirOut:
    type: Directory
    outputBinding:
      glob: folderName

Job file (in this case I don't actually use dirName, because the folder name is set in the CWL file as a fixed string):

dirName: new_folder
fileName: textfile.txt

When I run it, I receive the following error:

cwl-runner --debug createfile.cwl job.yml
/usr/local/bin/cwl-runner 1.0.20160930152149
[job createfile.cwl] initializing from file:///Users/kot4or/workspaces/cwl_ws/sandbox/create_directory/createfile.cwl
[job createfile.cwl] {
"dirName": "new_folder",
"fileName": "textfile.txt"
}
[job createfile.cwl] path mappings is {}
[job createfile.cwl] command line bindings is [
{
"position": [
-1000000,
0
],
"datum": "ls"
},
{
"position": [
0,
0
],
"datum": "-p"
}
]
[job createfile.cwl] /var/folders/sd/41rg42_16q72_2yzl_vvgsbw0000gn/T/tmpFDSLKV$ docker \
run \
-i \
--volume=/private/var/folders/sd/41rg42_16q72_2yzl_vvgsbw0000gn/T/tmpFDSLKV:/var/spool/cwl:rw \
--volume=/private/var/folders/sd/41rg42_16q72_2yzl_vvgsbw0000gn/T/tmpbW3Lpe:/tmp:rw \
--workdir=/var/spool/cwl \
--log-driver=none \
--user=501 \
--rm \
--env=TMPDIR=/tmp \
--env=HOME=/var/spool/cwl \
ubuntu \
ls \
-p > /var/folders/sd/41rg42_16q72_2yzl_vvgsbw0000gn/T/tmpFDSLKV/output.txt
[job createfile.cwl] initial work dir {
"_:cf81ebd5-810c-40b0-bb68-040f0322ca40": [
"_:cf81ebd5-810c-40b0-bb68-040f0322ca40",
"/var/folders/sd/41rg42_16q72_2yzl_vvgsbw0000gn/T/tmpFDSLKV",
"Directory"
],
"_:bb364b67-6b01-4b9d-996a-0bc227a24489": [
"_:bb364b67-6b01-4b9d-996a-0bc227a24489",
"/var/folders/sd/41rg42_16q72_2yzl_vvgsbw0000gn/T/tmpFDSLKV/folderName",
"Directory"
],
"_:7be322d4-0e8f-4edd-b852-8a113fdeb5fe": [
"Some text inside the file",
"/var/folders/sd/41rg42_16q72_2yzl_vvgsbw0000gn/T/tmpFDSLKV/textfile.txt",
"CreateFile"
]
}
Error collecting output for parameter 'dirOut'
Traceback (most recent call last):
File "/usr/local/lib/python2.7/site-packages/cwltool/draft2tool.py", line 383, in collect_output_ports
ret[fragment] = self.collect_output(port, builder, outdir, fs_access, compute_checksum=compute_checksum)
File "/usr/local/lib/python2.7/site-packages/cwltool/draft2tool.py", line 474, in collect_output
raise WorkflowException("Did not find output file with glob pattern: '{}'".format(globpatterns))
WorkflowException: Did not find output file with glob pattern: '['folderName']'
Error while running job: Error collecting output for parameter 'dirOut': Did not find output file with glob pattern: '['folderName']'
[job createfile.cwl] completed permanentFail
[job createfile.cwl] {}
Final process status is permanentFail
[job createfile.cwl] Removing input staging directory /var/folders/sd/41rg42_16q72_2yzl_vvgsbw0000gn/T/tmpLhLC5b
[job createfile.cwl] Removing temporary directory /var/folders/sd/41rg42_16q72_2yzl_vvgsbw0000gn/T/tmpbW3Lpe
Workflow error, try again with --debug for more information:
Process status is ['permanentFail']
Traceback (most recent call last):
File "/usr/local/lib/python2.7/site-packages/cwltool/main.py", line 677, in main
**vars(args))
File "/usr/local/lib/python2.7/site-packages/cwltool/main.py", line 233, in single_job_executor
raise WorkflowException(u"Process status is %s" % (final_status))
WorkflowException: Process status is ['permanentFail']


It looks like the folderName directory was never created.

If I comment out the lines that collect the dirOut output, I get no errors, but output.txt, where I save the result of the "ls -p" command, lists only "textfile.txt" and "output.txt".

The questions are:

1. Am I using the right way to create a directory inside the container's output directory?
2. Is there any way to return that newly created directory from the container to a specific directory on my computer?
3. It looks like "basename" doesn't support expressions and accepts only a plain string. If I use basename: $(inputs.dirName), it doesn't pick up the value from the input.

I would appreciate any links to working examples of CommandLineTools or workflows that use CWL v1.0 (not necessarily related to the Directory type).

cwl common workflow language • 3.4k views
4.4 years ago

Hello Misha,

My apologies for the delayed response.

You have several questions here; I will answer them in order (note that it would be best to split them up in the future).

1. According to my reading of the spec you are correct: the reference implementation is at fault. I've created the following issues to track this down: https://github.com/common-workflow-language/cwltool/issues/226 https://github.com/common-workflow-language/cwltool/issues/227
2. Management of the outputs varies per implementation; for the reference implementation you can use --outdir.
3. Correct, you can only use an expression where Expression is listed in the specification.
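
Regarding point 3, one pattern that may help is to use a Dirent instead of a Directory literal, since entryname and entry are fields that accept expressions. The sketch below is a variation on the createfile.cwl from the question and is untested here: it assumes the runner accepts a Directory object returned from the entry expression (this may depend on the cwltool version), and it needs InlineJavascriptRequirement for the JavaScript object literal.

```yaml
cwlVersion: v1.0
class: CommandLineTool

requirements:
  # Needed because "entry" below is a JavaScript expression, not a plain parameter reference
  InlineJavascriptRequirement: {}
  InitialWorkDirRequirement:
    listing:
      # entryname accepts an expression, unlike basename on a Directory literal
      - entryname: $(inputs.dirName)
        # Assumption: the runner stages an empty Directory returned by this expression
        entry: "$({class: 'Directory', listing: []})"
        writable: true

baseCommand: [ls, -p]
stdout: output.txt

inputs:
  dirName:
    type: string

outputs:
  output:
    type: stdout
  dirOut:
    type: Directory
    outputBinding:
      glob: $(inputs.dirName)
```

If the directory is staged as intended, output.txt should list it alongside output.txt itself, and dirOut should be collected by the glob.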

The user guide for v1.0 is at http://www.commonwl.org/v1.0/CommandLineTool.html#ShellCommandRequirement. The conformance tests may also give you inspiration: https://github.com/common-workflow-language/common-workflow-language/tree/master/v1.0/v1.0
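
As a possible workaround for the original STAR case while the issues above are open, ShellCommandRequirement can be used to create the directory in the same shell command, right before STAR runs. This is only a sketch under assumptions: the Docker image name is a placeholder and not a real image reference, and the remaining STAR parameters from the question ("[other parameters]") are left out.

```yaml
cwlVersion: v1.0
class: CommandLineTool

requirements:
  # Lets the arguments be joined into a single shell command line
  ShellCommandRequirement: {}

hints:
  DockerRequirement:
    dockerPull: example/star   # placeholder: substitute an image that contains STAR

inputs:
  genomeDir:
    type: string   # no inputBinding; the value is referenced from "arguments" instead

arguments:
  # mkdir -p <genomeDir> && STAR --runMode genomeGenerate --genomeDir <genomeDir>
  - mkdir
  - -p
  - $(inputs.genomeDir)
  - valueFrom: "&&"
    shellQuote: false   # keep && unquoted so the shell treats it as an operator
  - STAR
  - --runMode
  - genomeGenerate
  - --genomeDir
  - $(inputs.genomeDir)
  # other STAR parameters would follow here

outputs:
  indices:
    type: Directory
    outputBinding:
      glob: $(inputs.genomeDir)
```

With the reference implementation, the collected directory can then be placed in a chosen location by passing --outdir on the command line, as mentioned in point 2.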