Question: CWL: Getting output file of a CommandLineTool using glob
skanwal20 wrote (28 days ago):

Hi,

I am trying to run the following CommandLineTool description:

#!/usr/bin/env cwl-runner
cwlVersion: v1.0
class: CommandLineTool

hints:
 DockerRequirement:
  dockerPull: quay.io/biocontainers/kallisto:0.44.0--h7d86c95_2

inputs:
 fastqs:
   type: File[]
   inputBinding: {}

 index:
   type: File
   inputBinding:
     prefix: "--index"

baseCommand: [ "kallisto", "quant" ]

arguments:
  - valueFrom: $(runtime.outdir)
    prefix: --output-dir

outputs:
 quantification:
  type: File
  outputBinding:
   glob: abundance.tsv

Using this command:

cwltool kallisto-quant.cwl kallisto-quant.json

It throws the following error:

Error collecting output for parameter 'quantification':
kallisto-quant.cwl:31:4: Did not find output file with glob pattern: '['abundance.tsv']'

The complete docker command looks fine to me and runs successfully without cwltool (with the same arguments).

The complete output of the cwltool run command is:

(cwl) 4180L-137952-M ~/Documents/UMCCR/Play/cwl-metrics/kallisto $ cwltool kallisto-quant.cwl kallisto-quant.json
/Users/kanwals/virtualenvironment/cwl/bin/cwltool 1.0.20180719140605
Resolved 'kallisto-quant.cwl' to 'file:///Users/kanwals/Documents/UMCCR/Play/cwl-metrics/kallisto/kallisto-quant.cwl'
[job kallisto-quant.cwl] /private/tmp/docker_tmp1tk42cjr$ docker \
    run \
    -i \
    --volume=/private/tmp/docker_tmp1tk42cjr:/var/spool/cwl:rw \
    --volume=/private/var/folders/_v/g24brqws13v3gtrvcz2j57ldbdw8jg/T/tmp8t7700g9:/tmp:rw \
    --volume=/Users/kanwals/Documents/UMCCR/Play/cwl-metrics/kallisto/input/fusion-1_1.fq:/var/lib/cwl/stg4024e429-2f7e-40f2-b701-5b12dd6c3cad/fusion-1_1.fq:ro \
    --volume=/Users/kanwals/Documents/UMCCR/Play/cwl-metrics/kallisto/input/fusion-1_2.fq:/var/lib/cwl/stg38cb4274-de7f-41ba-bd37-050cd3825377/fusion-1_2.fq:ro \
    --volume=/Users/kanwals/Documents/UMCCR/Play/cwl-metrics/kallisto/index/GRCh37.idx:/var/lib/cwl/stg3bc0fbee-9d95-480b-8cb6-23a2862b6f0d/GRCh37.idx:ro \
    --workdir=/var/spool/cwl \
    --read-only=true \
    --user=1457398319:2094513965 \
    --rm \
    --env=TMPDIR=/tmp \
    --env=HOME=/var/spool/cwl \
    --memory=1024m \
    quay.io/biocontainers/kallisto:0.44.0--h7d86c95_2 \
    kallisto \
    quant \
    --output-dir \
    out \
    /var/lib/cwl/stg4024e429-2f7e-40f2-b701-5b12dd6c3cad/fusion-1_1.fq \
    /var/lib/cwl/stg38cb4274-de7f-41ba-bd37-050cd3825377/fusion-1_2.fq \
    --index \
    /var/lib/cwl/stg3bc0fbee-9d95-480b-8cb6-23a2862b6f0d/GRCh37.idx

[quant] fragment length distribution will be estimated from the data
[index] k-mer length: 31
[index] number of targets: 196,501
[index] number of k-mers: 116,739,414
[job kallisto-quant.cwl] Job error:
Error collecting output for parameter 'quantification':
kallisto-quant.cwl:31:4: Did not find output file with glob pattern: '['abundance.tsv']'
[job kallisto-quant.cwl] completed permanentFail
{}
Final process status is permanentFail

Any help to resolve this issue will be highly appreciated. Thanks.

inutano10 wrote (27 days ago):

Recent versions of cwltool set a default memory limit of 1024m (note "--memory=1024m" in your log), and kallisto cannot run within that amount of memory. As far as I know, cwltool 1.0.20180403145700 did not set a default resource value, but 1.0.20180711112827 does.

So you have to set the minimum amount of memory in the CWL file, like this:

requirements:
  ResourceRequirement:
    ramMin: 4096

And it worked for me. Try it!
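
For anyone landing here later, this is roughly how the fix slots into the tool description from the question. It is a sketch rather than a tested file: the 4096 value is simply the one suggested above (CWL v1.0 interprets ramMin in mebibytes) and may need adjusting for a larger index, and everything else is copied from the question with only the indentation tidied.

```yaml
#!/usr/bin/env cwl-runner
cwlVersion: v1.0
class: CommandLineTool

# Added per the answer above: ask for at least 4096 MiB of RAM so the
# runner does not fall back to its default container memory limit.
# The exact value is an assumption; tune it to your index size.
requirements:
  ResourceRequirement:
    ramMin: 4096

hints:
  DockerRequirement:
    dockerPull: quay.io/biocontainers/kallisto:0.44.0--h7d86c95_2

inputs:
  fastqs:
    type: File[]
    inputBinding: {}

  index:
    type: File
    inputBinding:
      prefix: "--index"

baseCommand: [ "kallisto", "quant" ]

arguments:
  - valueFrom: $(runtime.outdir)
    prefix: --output-dir

outputs:
  quantification:
    type: File
    outputBinding:
      glob: abundance.tsv
```

After rerunning, the logged docker command should presumably show a larger --memory value than the 1024m seen in the question.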


Thanks @inutano. It did the trick :)

(written 26 days ago by skanwal20)

kaushik.ghose30 wrote (27 days ago):

Hi! On the face of it, the tool is not producing abundance.tsv. When you run the tool bare (by yourself), do you see that .tsv file produced?

skanwal20 wrote (27 days ago):

Hi Kaushik,

Thanks for replying.

Yes, I have run this tool a few times, using both a conda installation and its Docker image. The tool produces three files:

abundance.h5
abundance.tsv
run_info.json

The docker command produced by cwltool (provided in the original question) also looks fine to me.

I used the following command to run kallisto using its docker image:

docker run -v $PWD:/home/kallisto quay.io/biocontainers/kallisto:0.44.0--h7d86c95_2 kallisto quant --output-dir /home/kallisto/out --index /home/kallisto/index/GRCh37.idx /home/kallisto/input/fusion-1_1.fq /home/kallisto/input/fusion-1_2.fq

And this did produce the three output files. So I am not sure what I am missing in the cwltool definition, or what the difference is between this command and the one produced by the cwltool run.


Could you try "*.tsv" for the glob please?

(written 27 days ago by kaushik.ghose30)

Same error:

[job kallisto-quant.cwl] Job error:
Error collecting output for parameter 'quantification':
kallisto-quant.cwl:31:4: Did not find output file with glob pattern: '['*.tsv']'
[job kallisto-quant.cwl] completed permanentFail
{}
Final process status is permanentFail

(written 27 days ago by skanwal20)

Are the files being produced in the working directory? Do you have Rabix Executor (https://github.com/rabix/bunny) installed, by any chance? It might be worth trying it there too.

(written 27 days ago by kaushik.ghose30)
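
On the working-directory question: the docker command logged in the original post passes `--output-dir out`, and CWL matches glob patterns relative to the job's output directory, where `*` does not cross a `/`. If the results were landing in a subdirectory named out, neither `abundance.tsv` nor `*.tsv` would reach them. The fragment below is only a sketch under that assumption; the thread does not confirm where the files were written, and the fix that worked for the poster was the memory setting.

```yaml
# Hypothetical variant: write results into a subdirectory called "out"
# (as in the logged command) and point the glob at that subdirectory.
arguments:
  - prefix: --output-dir
    valueFrom: out

outputs:
  quantification:
    type: File
    outputBinding:
      # Pattern is resolved relative to the job's output directory.
      glob: out/abundance.tsv
```

If the tool description instead uses `$(runtime.outdir)` as in the question, the plain `abundance.tsv` glob should be the right one.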