String concatenation for output
5.9 years ago
ttom ▴ 220

I am trying to name my output file based on the input file name.

cat test.cwl

cwlVersion: v1.0
class: CommandLineTool
baseCommand: [sh, fastqc_check.sh]

requirements:
  - class: InlineJavascriptRequirement

inputs:
 fq1_zips:
  type: File
  inputBinding:
   position: 1

outputs:
 fastqc_check_out:
  type: File
  outputBinding:
    glob: $(inputs.fq1_zips.path + '.results')

cat test.yml

fq1_zips:
        class: File
        path: R1_001_fastqc.zip

fastqc_check_script:
 class: File
 path: fastqc_check.sh

The error I am getting:

cwl-runner test.cwl test.yml 
/Softwares/conda/bin/cwl-runner 1.0.20180521150620
Resolved 'test.cwl' to '/standalone/test.cwl'
[job test.cwl] /tmp/tmpC3PfP4$ sh \
   /standalone/fastqc_check.sh \
    /tmp/tmpI_W5Z1/stg41acb21e-f0fe-4f8d-ae4c-cbde594f49c6/R1_001_fastqc.zip
R1_001_fastqc.zip
[job test.cwl] Job error:
Error collecting output for parameter 'fastqc_check_out':
test.cwl:20:4: glob patterns must not start with '/'
[job test.cwl] completed permanentFail
{}
Final process status is permanentFail

The shell script outputs a file named R1_001_fastqc.zip.results, which is what I want to capture in the CWL tool. I run it like this:

sh fastqc_check.sh R1_001_fastqc.zip

If I instead use glob: $(inputs.fq1_zips + '.results'), I get this error:

cwl-runner test.cwl test.yml 
/Softwares/conda/bin/cwl-runner 1.0.20180521150620
Resolved 'test.cwl' to '/standalone/test.cwl'
[job test.cwl] /tmp/tmpcBEW38$ sh \
    /standalone/fastqc_check.sh \
    /tmp/tmpvdRGjb/stgdee3f23e-0476-413a-ad96-eb1914f71b82/R1_001_fastqc.zip
R1_001_fastqc.zip
[job test.cwl] Job error:
Error collecting output for parameter 'fastqc_check_out':
test.cwl:20:4: Did not find output file with glob pattern: '[u'[object Object].results']'
[job test.cwl] completed permanentFail
{}
Final process status is permanentFail
5.9 years ago
biokcb ▴ 170

I think the first error comes from using inputs.fq1_zips.path instead of inputs.fq1_zips.basename. If you look at the path value, it is the absolute path to your staged file. I believe that is a problem because 1) glob does not accept absolute paths (hence "glob patterns must not start with '/'") and 2) glob searches the runtime working directory for your outputs. At least, that is my understanding of CWL.

In the second example, you are concatenating the File object itself, which holds all the information about the file, rather than a string. The object is stringified as [object Object], so the resulting pattern cannot match your output file.
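To make the two failures concrete, here is a small JavaScript sketch of what each expression evaluates to (CWL expressions are JavaScript). The `inputs` object is a hand-written stand-in for what the runner passes in, and the staging directory is shortened to a hypothetical `/tmp/stg/`. A real File object carries more fields.

```javascript
// Hand-written stand-in for the runner's `inputs` object (assumption: a real
// CWL File object has more fields, and the staged path will differ per run).
const inputs = {
  fq1_zips: {
    class: "File",
    path: "/tmp/stg/R1_001_fastqc.zip", // hypothetical staged absolute path
    basename: "R1_001_fastqc.zip"
  }
};

// 1) .path is absolute, so the pattern starts with '/' and glob rejects it.
const fromPath = inputs.fq1_zips.path + ".results";

// 2) Concatenating the object itself stringifies it as "[object Object]".
const fromObject = inputs.fq1_zips + ".results";

// 3) .basename is only the file name, so the pattern stays relative.
const fromBasename = inputs.fq1_zips.basename + ".results";

console.log(fromPath);     // /tmp/stg/R1_001_fastqc.zip.results
console.log(fromObject);   // [object Object].results
console.log(fromBasename); // R1_001_fastqc.zip.results
```

Only the third form gives a pattern that is relative to the output directory, which is where the script writes its results file.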

Try using

cwlVersion: v1.0
class: CommandLineTool
baseCommand: [sh, fastqc_check.sh]

inputs:
 fq1_zips:
  type: File
  inputBinding:
   position: 1

outputs:
 fastqc_check_out:
  type: File
  outputBinding:
    glob: $(inputs.fq1_zips.basename).results

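Notice how the glob is now written: a parameter reference, $(inputs.fq1_zips.basename), followed by the literal text .results. Because this is a plain parameter reference rather than a JavaScript expression, InlineJavascriptRequirement isn't actually needed in this case. This should collect R1_001_fastqc.zip.results.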


Yes, it worked. Also, as you said, InlineJavascriptRequirement is not required in this case. Thanks a lot!

Working code (cat test.cwl):

cwlVersion: v1.0
class: CommandLineTool
baseCommand: [sh, fastqc_check.sh]

inputs:
 fq1_zips:
  type: File
  inputBinding:
   position: 1

outputs:
 fastqc_check_out:
  type: File
  outputBinding:
    glob: $(inputs.fq1_zips.basename).results
5.9 years ago
drkennetz ▴ 560

Consider changing the tool so that it knows it is accepting a script:

cwlVersion: v1.0
class: CommandLineTool
baseCommand: [sh]
inputs:
  script:
    type: File
    inputBinding:
      position: 1
    default:
      class: File
      location: fastqc_check.sh
  fq1_zips:
    type: File
    inputBinding:
      position: 2
outputs:
  fastqc_check_out:
    type: File
    outputBinding:
      glob: "*.results"

This, in my opinion, is a bit more complete. You want CWL to know everything you are doing. The first input (script) tells CWL that a script is part of the inputs, and the second input (fq1_zips) is the actual file you want to process. The basename-based glob from the other answer is correct for collecting the results file, but you don't necessarily need it if you know that the output of the script will always match *.results. That is more of a preference thing.

Dennis


FYI: It would be more correct to use baseCommand: fastqc_check.sh together with a DockerRequirement and/or a SoftwareRequirement.


@drkennetz Yes, giving my script name as an input also works. I have tried this before.

Using glob: *.results also worked, but I needed the specific .results file that corresponds to my input file (fq1_zips), because the same directory would contain other files that also have the .results extension.

But thank you for your response!
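The difference between the wildcard and the basename-derived pattern can be sketched in JavaScript as follows. This is only an illustration: the directory listing is invented (the R2 and log files are hypothetical), and `endsWith` is a simplified stand-in for real glob matching.

```javascript
// Hypothetical output directory listing; the R2 and log entries are invented.
const outdir = [
  "R1_001_fastqc.zip.results",
  "R2_001_fastqc.zip.results",
  "fastqc_check.log"
];

// "*.results" behaves like a suffix match here (a simplification of glob).
const wildcard = outdir.filter((name) => name.endsWith(".results"));

// The basename-derived pattern contains no wildcard, so it is an exact match.
const basename = "R1_001_fastqc.zip";
const specific = outdir.filter((name) => name === basename + ".results");

console.log(wildcard.length); // 2
console.log(specific);        // [ 'R1_001_fastqc.zip.results' ]
```

With several .results files present, the wildcard returns more than one candidate, while an output declared as a single File has room for only one, so tying the pattern to the input's basename keeps the match unambiguous.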
