*Edit for future readers: I assume the problem here was that the tool I was running tried to contact the manufacturer's servers. It would not continue until the connection succeeded, but it gave no indication of what it was trying to do. The --preserve-entire-environment flag of cwltool fixed the problem because it enabled the software to communicate through the network.
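A minimal sketch of such a call, assuming the tool description is saved as albacore.cwl and the job file as albacore-job.yml (both file names are placeholders, not taken from my actual setup). The guard only keeps the snippet from failing where cwltool or the files are missing:

```shell
#!/bin/sh
# Hypothetical file names: replace with your own tool description and job file.
TOOL_CWL="albacore.cwl"
JOB_YML="albacore-job.yml"

# Only attempt the run when cwltool and both files are actually available.
if command -v cwltool >/dev/null 2>&1 && [ -f "$TOOL_CWL" ] && [ -f "$JOB_YML" ]; then
    # --preserve-entire-environment passes every environment variable of the
    # calling shell through to the tool instead of a reduced environment.
    cwltool --preserve-entire-environment "$TOOL_CWL" "$JOB_YML"
else
    echo "cwltool, $TOOL_CWL or $JOB_YML not found; nothing was run"
fi
```

Which environment variable the basecaller actually needed is not something I verified; proxy settings would be one plausible candidate.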
I'm new to CWL and not very experienced in programming in general, so please bear with me.
I have created a CommandLineTool that is supposed to use Oxford Nanopore's albacore basecaller to generate .fastq files from .fast5 current data.
When I use this CommandLineTool, it takes about five minutes to generate all output files in the output directory. However, the cwl-runner job does not end for several more hours. When I invoke the basecaller manually from the command line, with the same parameters I use in the CommandLineTool, it generates exactly the same data and finishes in about five minutes.
Could this be caused by a faulty outputs field? I am still having difficulty understanding how outputs in CWL work, and the fact that the tool takes multiple hours to terminate makes it hard for me to trace where my mistake lies.
The outputs I am expecting are some report files ("configuration.cfg", "pipeline.log", "sequencing_summary.txt", "sequencing_telemetry.js") as well as a directory called "workspace" containing several subdirectories filled with .fastq files.
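To double-check that a run really produced everything listed above, a small shell sketch like the following can be used. The function name check_albacore_outputs and its directory argument are made up for illustration; only the file and directory names come from the list above.

```shell
#!/bin/sh
# check_albacore_outputs DIR
# Prints every expected item that is missing from DIR and returns 0 only if
# all of them exist. The function name is illustrative, not part of albacore.
check_albacore_outputs() {
    dir="$1"
    missing=0
    # Report files expected directly inside the output directory.
    for name in configuration.cfg pipeline.log sequencing_summary.txt sequencing_telemetry.js; do
        if [ ! -f "$dir/$name" ]; then
            echo "missing file: $name"
            missing=1
        fi
    done
    # The workspace directory that holds the basecalled reads.
    if [ ! -d "$dir/workspace" ]; then
        echo "missing directory: workspace"
        missing=1
    fi
    return "$missing"
}
```

It would be called with the directory passed to --save_path, for example check_albacore_outputs ./albacore_out (a placeholder path).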
This is the code of the CommandLineTool:
cwlVersion: v1.0
class: CommandLineTool
baseCommand: read_fast5_basecaller.py
inputs:
  input_directory:
    label: |
      Folder of current data in .fast5 format.
    type: Directory
    inputBinding:
      prefix: --input
  worker_threads:
    label: |
      Number of CPU-Cores used for computation.
    type: int
    inputBinding:
      prefix: --worker_threads
  flowcell:
    label: |
      Type of flowcell used in experiment.
    type: string
    inputBinding:
      prefix: --flowcell
  kit:
    label: |
      Type of kit used in experiment.
    type: string
    inputBinding:
      prefix: --kit
  output_directory:
    label: |
      Folder where albacore saves results.
    type: string
    inputBinding:
      prefix: --save_path
outputs:
  sequences:
    type:
      type: array
      items: File
    outputBinding:
      glob: $(inputs.output_directory+"/workspace/pass/*.fasta")
  config:
    type: File
    outputBinding:
      glob: $(inputs.output_directory+"configuration.cfg")
  pipeline:
    type: File
    outputBinding:
      glob: $(inputs.output_directory+"pipeline.log")
  summary:
    type: File
    outputBinding:
      glob: $(inputs.output_directory+"sequencing_summary.txt")
  telemetry:
    type: File
    outputBinding:
      glob: $(inputs.output_directory+"sequencing_telemetry.txt")
Thanks in advance for any help/advice.
Edited the post to make it shorter and more comprehensible.