Question: cwlexec and cwltool not interpreting baseCommand the same way
Asked 6 months ago by drkennetz:

I wrote a workflow to see where a bam file intersects with multiple bed files. Each bed file corresponds to regions in a different chromosome. The output from step 2 (which uses bedtools intersect) is in the following format:

chr1    1    2    400    0
chr1    3    4    176    0
...
...
chr12    500    501    300    1
chr12    501    502    176    1

The first column is the chromosome name. The input comes from bedtools genomecov -ibam input.bam -a, which outputs all regions. I then have another tool that runs:

bedtools intersect -a outgenome.txt -b chr12.bed -c

This outputs every position in outgenome.txt, with column 4 being the number of reads at that position and column 5 being 0 if the position does not intersect with the bed file and 1 if it does. I then wrote a final tool to extract only the rows where column 5 is 1 (the positions that intersect with the bed file). The tool is:

#!/usr/bin/env cwl-runner
cwlVersion: v1.0
class: CommandLineTool
requirements:
 - class: ShellCommandRequirement
inputs:
  atoBComparison:
    type: File
    inputBinding:
      position: 1
outputs:
  bRegionsNonZero:
    type: stdout
stdout: $(inputs.atoBComparison.basename)_nonZero.txt
baseCommand: [awk, "$5!=0"]

With cwltool, the runner prints the command as it should (so awk works):

'awk' '$5!=0' '/users/dkennetz/tmp7l0ehcvy/stg2d08cfbe-1cd1-4ce2-a39a-f9c1ae65f504/PromChr12.bed_AtoB.txt' > /users/dkennetz/tmp9us6fg_b/PromChr12.bed_AtoB.txt_nonZero.txt

but when I run the workflow with cwlexec, it builds the awk command differently:

/bin/sh -c 'awk $5!=0 filename'

which the interpreter reads differently. It treats the filename as part of the awk program, so the command fails. Any ideas about how to change the tool to fix this, or does this look like a bug?

Tags: cwl, bed

Thanks for this question; it has been turned into a new conformance test for the CWL standard! https://github.com/common-workflow-language/common-workflow-language/pull/701

(comment written 6 months ago by Michael R. Crusoe)

Answer, written 6 months ago by Michael R. Crusoe (Common Workflow Language project; currently based out of Vilnius, Lithuania):

Hello @drkennetz,

Your CWL is valid, so this is likely a bug in cwlexec that should be reported at https://github.com/IBMSpectrumComputing/cwlexec/issues

I tried the following:

  1. Removing ShellCommandRequirement, as that isn't needed → No change.
  2. Moving the $5!=0 to the arguments section → The $5!=0 appears to be quoted, but I get the same "awk: line 1: syntax error at or near !=" error.
  3. Restoring ShellCommandRequirement with an explicit shellQuote: true (the default value) → Same as above.

This shows that cwlexec lets the shell evaluate arguments instead of passing them verbatim to the tool, which is contrary to the CWL standard, so please file a bug report with IBM. In the meantime, here is a workaround that works with cwlexec as well as with compliant CWL implementations:

#!/usr/bin/env cwl-runner
cwlVersion: v1.0
class: CommandLineTool
requirements:
 ShellCommandRequirement: {}
inputs:
  atoBComparison:
    type: File
    inputBinding:
      position: 1
outputs:
  bRegionsNonZero:
    type: stdout
stdout: $(inputs.atoBComparison.basename)_nonZero.txt
baseCommand: awk
arguments:
 - valueFrom: "'$5!=0'"
   shellQuote: false
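
To illustrate why the unquoted form fails and why the literal single quotes in valueFrom help, here is a small shell sketch. It only mimics the two command strings from the logs above. It is not what cwlexec does internally, and the sample file and its two rows are made up for the demonstration.

```shell
#!/bin/sh
# Made-up two-row sample in the same column layout as the question's data.
sample=$(mktemp)
printf 'chr1\t1\t2\t400\t0\nchr12\t500\t501\t300\t1\n' > "$sample"

# 1. Unquoted, as in the cwlexec log: the inner shell expands $5 (an unset
#    positional parameter) to nothing, so the awk program text is cut down.
seen_unquoted=$(/bin/sh -c 'printf "%s" $5!=0')

# 2. With literal single quotes around the program, which is what
#    valueFrom: "'$5!=0'" with shellQuote: false hands to the shell,
#    the inner shell leaves the program text intact.
seen_quoted=$(/bin/sh -c "printf '%s' '\$5!=0'")

# 3. awk given the intact program keeps only rows whose column 5 is non-zero.
kept=$(awk '$5!=0' "$sample")

echo "program seen when unquoted: $seen_unquoted"
echo "program seen when quoted:   $seen_quoted"
echo "rows kept by awk:           $kept"
rm -f "$sample"
```

In the first case awk would receive only the fragment starting at !=, which matches the "syntax error at or near !=" message. In the second case the program reaches awk unchanged.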

Thanks for your diligence with the CWL stuff, Michael! I think you are a part of something pretty awesome, and the dedication you guys put into making CWL better is noticeable. Keep up the good work, and thanks for the complete answer.

(comment written 6 months ago by drkennetz)