cwl: pass array of elements with repeated input flags
3.7 years ago
cocchi.e89 ▴ 270

I am trying to write a .cwl file for the GATK FilterIntervals tool, which takes several input files, each specified with its own --input flag. I know about the CWL itemSeparator field, and here is how I tried to pass the argument in the .cwl:

  - id: input_read_counts
    type:
      - "null"
      - type: array
        items: File
    inputBinding:
      prefix: '--input'
      itemSeparator: ' --input '

but the command is apparently rendered as:

gatk \
FilterIntervals \
--output \
hs37d5.preprocessed_300bp.filtered.interval_list \
--input \
'/var/lib/cwl/stgb9ec701d-59f1-4e6e-81e4-a31eb2531eb9/1.WGS.M.KO0004.hdf5 --input /var/lib/cwl/stg31ff1f67-85c6-43f2-a56b-66330038efff/2.WGS.M.KO0005.hdf5'

which is misinterpreted as a single file and makes the workflow fail:

A USER ERROR has occurred: Couldn't read file /var/lib/cwl/stgb9ec701d-59f1-4e6e-81e4-a31eb2531eb9/1.WGS.M.KO0004.hdf5 --input /var/lib/cwl/stg31ff1f67-85c6-43f2-a56b-66330038efff/2.WGS.M.KO0005.hdf5

How can I resolve this? I think the problem is the quotes added around the final value.

Thank you very much in advance for any help!!!

--------- utilities ---------

Here is the full script:

#!/usr/bin/env cwl-runner

cwlVersion: v1.0
class: CommandLineTool
label: GATK FilterIntervals on docker images

hints:
  DockerRequirement:
    dockerPull: broadinstitute/gatk:latest

baseCommand: gatk
arguments: [ "FilterIntervals", "--output", "$(inputs.interval_list_file.nameroot).filtered.interval_list" ]

inputs:
  - id: annotated_intervals
    type: File?
    inputBinding:
      position: 1
      prefix: '--annotated-intervals'
  - id: blacklist_bed
    type: File
    inputBinding:
      position: 2
      prefix: '-XL'
  - id: interval_list_file
    type: File
    inputBinding:
      position: 3
      prefix: '-L'
  - id: interval_merging_rule
    type: string
    inputBinding:
      position: 4
      prefix: '--interval-merging-rule'
  - id: minimum_gc_content
    type: float?
    inputBinding:
      position: 5
      prefix: '--minimum-gc-content'
  - id: maximum_gc_content
    type: float?
    inputBinding:
      position: 6
      prefix: '--maximum-gc-content'
  - id: minimum_mappability
    type: float?
    inputBinding:
      position: 7
      prefix: '--minimum-mappability'
  - id: maximum_mappability
    type: float?
    inputBinding:
      position: 8
      prefix: '--maximum-mappability'
  - id: minimum_segmental_duplication_content
    type: float?
    inputBinding:
      position: 9
      prefix: '--minimum-segmental-duplication-content'
  - id: maximum_segmental_duplication_content
    type: float?
    inputBinding:
      position: 10
      prefix: '--maximum-segmental-duplication-content'
  - id: low_count_filter_count_threshold
    type: float?
    inputBinding:
      position: 11
      prefix: '--low-count-filter-count-threshold'
  - id: low_count_filter_percentage_of_samples
    type: float?
    inputBinding:
      position: 12
      prefix: '--low-count-filter-percentage-of-samples'
  - id: extreme_count_filter_minimum_percentile
    type: float?
    inputBinding:
      position: 13
      prefix: '--extreme-count-filter-minimum-percentile'
  - id: extreme_count_filter_maximum_percentile
    type: float?
    inputBinding:
      position: 14
      prefix: '--extreme-count-filter-maximum-percentile'
  - id: extreme_count_filter_percentage_of_samples
    type: float?
    inputBinding:
      position: 15
      prefix: '--extreme-count-filter-percentage-of-samples'
  - id: input_read_counts
    type:
      - "null"
      - type: array
        items: File
    inputBinding:
      prefix: '--input'
      itemSeparator: ' --input '

outputs:
  filtered_intervals:
    type: File
    outputBinding:
      glob: $(inputs.interval_list_file.nameroot).filtered.interval_list

and here are the inputs:

annotated_intervals:
  class: File
  path: /home/enrico/Dropbox/NY/app/GATK_CNV_germline/annotateIntervals/hs37d5.annotated_intervals.tsv
blacklist_bed:
  class: File
  path: /media/enrico/cells_WGS/gatk/gatk_SV/blacklists_GATK/CNV_and_centromere_blacklist.hg19.list
interval_list_file:
  class: File
  path: /home/enrico/Dropbox/NY/app/GATK_CNV_germline/preProcessIntervals/hs37d5.preprocessed_300bp.interval_list
interval_merging_rule: OVERLAPPING_ONLY
minimum_gc_content: 0.1
maximum_gc_content: 0.9
minimum_mappability: 0.9
maximum_mappability: 1.0
minimum_segmental_duplication_content: 0.0
maximum_segmental_duplication_content: 0.5
low_count_filter_count_threshold: 5
low_count_filter_percentage_of_samples: 90.0
extreme_count_filter_minimum_percentile: 1.0
extreme_count_filter_maximum_percentile: 99.0
extreme_count_filter_percentage_of_samples: 90.0
input_read_counts:
  - { class: File, path: /media/enrico/cells_WGS/columbia/pon_gatkSV/output_cureGN_QC_M/1.WGS.M.KO0004.hdf5 }
  - { class: File, path: /media/enrico/cells_WGS/columbia/pon_gatkSV/output_cureGN_QC_M/2.WGS.M.KO0005.hdf5 }

Currently, the CWL team recommends the CWL Discourse group as the appropriate venue for user support. Some CWL developers used to hang around Biostars, but I am not sure whether they are still active here.
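
On the question itself: itemSeparator joins the array elements into one string, and that string is passed to the tool as a single argument, which matches the quoted value in the rendered command in the question. A pattern shown in the CWL user guide for repeated flags is to put an inputBinding with the prefix on the array type itself, so that the prefix is applied per element. Below is a minimal sketch of how the input_read_counts entry might look under that approach. The position value of 16 is my assumption (it only continues the existing numbering), and I have not run this against GATK.

```yaml
  - id: input_read_counts
    type:
      - "null"
      - type: array
        items: File
        # Binding on the array type: the prefix is applied to every element,
        # so each file should become its own "--input <path>" pair.
        inputBinding:
          prefix: '--input'
    # Outer binding only places the whole group on the command line.
    # position: 16 is an assumption; adjust or drop it as needed.
    inputBinding:
      position: 16
```

With this form no itemSeparator is needed, and the inputs file from the question can stay as it is.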
