Hello all
I'm using CWL v1.0 and trying to understand how to create a writable directory inside the container's output directory and then return it to a specific location on my computer. The question came up while I was trying to run STAR in a container from a CWL file.
The general structure of the command I want to implement is:
STAR --runMode genomeGenerate --genomeDir /some/path/to/output/folder [other parameters]
When I run my CWL file with --genomeDir set to ./ it works fine, because cwltool mounts the output directory as writable and STAR can put its results there. I then use ./ as the glob in the output parameter and return all the data I need from the container to my computer.
The part of the CWL file that describes the inputs and outputs:
<skipped lines>
inputs:
  genomeDir:
    type: string
    inputBinding:
      position: 1
      prefix: --genomeDir
<skipped lines>
outputs:
  indices:
    type: Directory
    outputBinding:
      glob: $(inputs.genomeDir)
But when I set genomeDir to any other folder, for example ./dm3, STAR fails with an error saying that I first need to create the dm3 folder.
To experiment with this issue, I created a simple CWL file.
My CWL file (I followed the example at http://www.commonwl.org/v1.0/UserGuide.html#Creating_files_at_runtime):
cwlVersion: v1.0
class: CommandLineTool
hints:
  DockerRequirement:
    dockerPull: ubuntu
baseCommand: ls
arguments: ["-p"]
stdout: output.txt
requirements:
  InitialWorkDirRequirement:
    listing:
      - entryname: $(inputs.fileName)
        entry: Some text inside the file
      - class: Directory
        basename: folderName
        listing: []
inputs:
  dirName:
    type: string
  fileName:
    type: string
outputs:
  output:
    type: stdout
  fileOut:
    type: File
    outputBinding:
      glob: $(inputs.fileName)
  dirOut:
    type: Directory
    outputBinding:
      glob: folderName
Job file (dirName is not actually used in this case, because the directory name is hard-coded as a string in the CWL file):
dirName: new_folder
fileName: textfile.txt
When I run it, I get the following error:
cwl-runner --debug createfile.cwl job.yml
/usr/local/bin/cwl-runner 1.0.20160930152149
[job createfile.cwl] initializing from file:///Users/kot4or/workspaces/cwl_ws/sandbox/create_directory/createfile.cwl
[job createfile.cwl] {
    "dirName": "new_folder",
    "fileName": "textfile.txt"
}
[job createfile.cwl] path mappings is {}
[job createfile.cwl] command line bindings is [
    {
        "position": [
            -1000000,
            0
        ],
        "datum": "ls"
    },
    {
        "position": [
            0,
            0
        ],
        "datum": "-p"
    }
]
[job createfile.cwl] /var/folders/sd/41rg42_16q72_2yzl_vvgsbw0000gn/T/tmpFDSLKV$ docker \
run \
-i \
--volume=/private/var/folders/sd/41rg42_16q72_2yzl_vvgsbw0000gn/T/tmpFDSLKV:/var/spool/cwl:rw \
--volume=/private/var/folders/sd/41rg42_16q72_2yzl_vvgsbw0000gn/T/tmpbW3Lpe:/tmp:rw \
--workdir=/var/spool/cwl \
--read-only=true \
--log-driver=none \
--user=501 \
--rm \
--env=TMPDIR=/tmp \
--env=HOME=/var/spool/cwl \
ubuntu \
ls \
-p > /var/folders/sd/41rg42_16q72_2yzl_vvgsbw0000gn/T/tmpFDSLKV/output.txt
[job createfile.cwl] initial work dir {
    "_:cf81ebd5-810c-40b0-bb68-040f0322ca40": [
        "_:cf81ebd5-810c-40b0-bb68-040f0322ca40",
        "/var/folders/sd/41rg42_16q72_2yzl_vvgsbw0000gn/T/tmpFDSLKV",
        "Directory"
    ],
    "_:bb364b67-6b01-4b9d-996a-0bc227a24489": [
        "_:bb364b67-6b01-4b9d-996a-0bc227a24489",
        "/var/folders/sd/41rg42_16q72_2yzl_vvgsbw0000gn/T/tmpFDSLKV/folderName",
        "Directory"
    ],
    "_:7be322d4-0e8f-4edd-b852-8a113fdeb5fe": [
        "Some text inside the file",
        "/var/folders/sd/41rg42_16q72_2yzl_vvgsbw0000gn/T/tmpFDSLKV/textfile.txt",
        "CreateFile"
    ]
}
Error collecting output for parameter 'dirOut'
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/site-packages/cwltool/draft2tool.py", line 383, in collect_output_ports
    ret[fragment] = self.collect_output(port, builder, outdir, fs_access, compute_checksum=compute_checksum)
  File "/usr/local/lib/python2.7/site-packages/cwltool/draft2tool.py", line 474, in collect_output
    raise WorkflowException("Did not find output file with glob pattern: '{}'".format(globpatterns))
WorkflowException: Did not find output file with glob pattern: '['folderName']'
Error while running job: Error collecting output for parameter 'dirOut': Did not find output file with glob pattern: '['folderName']'
[job createfile.cwl] completed permanentFail
[job createfile.cwl] {}
Final process status is permanentFail
[job createfile.cwl] Removing input staging directory /var/folders/sd/41rg42_16q72_2yzl_vvgsbw0000gn/T/tmpLhLC5b
[job createfile.cwl] Removing temporary directory /var/folders/sd/41rg42_16q72_2yzl_vvgsbw0000gn/T/tmpbW3Lpe
Workflow error, try again with --debug for more information:
Process status is ['permanentFail']
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/site-packages/cwltool/main.py", line 677, in main
    **vars(args))
  File "/usr/local/lib/python2.7/site-packages/cwltool/main.py", line 233, in single_job_executor
    raise WorkflowException(u"Process status is %s" % (final_status))
WorkflowException: Process status is ['permanentFail']
It looks like the folderName directory was not created at all.
If I comment out the lines that collect the dirOut output, I get no errors, but output.txt, where I save the result of ls -p, lists only textfile.txt and output.txt.
The questions are:
- Am I using the right way to create a directory inside the container's output directory?
- Is there any way to return that newly created directory from the container to a specific directory on my computer?
- It looks like "basename" doesn't support expressions and accepts only a literal string. If I use basename: $(inputs.dirName), it doesn't pick up the value from the input.
I would appreciate any links to working examples of CommandLineTools or workflows that use CWL v1.0 (not necessarily related to the Directory type).
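For reference, one pattern that might address the first and third questions is to declare the directory as a listing entry whose entry field is an expression returning an empty Directory object, so that entryname can take its value from an input. The following is an untested sketch, not a confirmed fix: it assumes a cwltool release that accepts Directory objects produced by expressions in InitialWorkDirRequirement (the 1.0.20160930152149 release shown in the log may not), and it reuses the ls -p test tool and the dirName input from above.

```yaml
cwlVersion: v1.0
class: CommandLineTool
hints:
  DockerRequirement:
    dockerPull: ubuntu
requirements:
  # Needed because the entry below is a JavaScript expression.
  InlineJavascriptRequirement: {}
  InitialWorkDirRequirement:
    listing:
      # Assumption: the runner accepts an expression that evaluates to an
      # empty Directory object and stages it under the name in entryname.
      - entryname: $(inputs.dirName)
        entry: "$({class: 'Directory', listing: []})"
        writable: true
baseCommand: ls
arguments: ["-p"]
stdout: output.txt
inputs:
  dirName:
    type: string
outputs:
  output:
    type: stdout
  dirOut:
    type: Directory
    outputBinding:
      # Collect the staged directory by the same name it was created with.
      glob: $(inputs.dirName)
```

If the directory is staged as intended, running this with a job file containing dirName: new_folder should make new_folder appear in the ls -p listing and be collected as dirOut. For the second question, cwltool has an --outdir option that sets where collected outputs are written, for example cwl-runner --outdir /path/to/results createfile.cwl job.yml (the path here is a placeholder).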