I'm trying to set up a workflow with CWL and am struggling to figure out how to place the output files generated by the different steps in the workflow into their own directories. As it stands, all output files are put in the same directory, creating a lot of clutter. I would very much prefer to have different sub-directories for each step.
Right now I have something like this:
working directory
  -- fastq
       -- read_1.fq
       -- read_2.fq
       ...
  -- output  # <-- all output files are dumped here
What I want is something like this:
working directory
  -- fastq
       -- sample1_read_1.fq
       -- sample1_read_2.fq
       -- sample2_read_1.fq
       -- sample2_read_2.fq
       ...
  -- trimmed
       -- sample1_read_1.trimmed.fq
       -- sample1_read_2.trimmed.fq
       -- sample2_read_1.trimmed.fq
       -- sample2_read_2.trimmed.fq
       ...
  -- bam
       -- sample1.bam
       -- sample2.bam
       ...
  -- qc
       -- QC reports
       ...
I can set up the individual command line tools to write their output to a directory, but the only way I have found to make that directory show up in the output, as indicated above, is to designate the whole directory as the output of that tool. While that would produce the directory structure I want, it introduces two problems. Firstly, it makes it harder to access individual files that are required for a subsequent step in the workflow. An obvious example is keeping the files for the first and second read properly separated. Secondly, I'm using Toil to run this workflow, and that doesn't support directories as inputs (strictly speaking this isn't a CWL issue, of course), which complicates things further.
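To make the directory-as-output approach concrete, a minimal sketch of such a tool description might look like the following. The tool name `trim_tool`, its `--outdir` flag, and the input and output names are hypothetical placeholders for illustration, not the actual tools in this workflow.

```yaml
cwlVersion: v1.0
class: CommandLineTool
# Hypothetical trimming tool; "trim_tool" and "--outdir" are placeholders, not a real CLI
baseCommand: trim_tool
arguments:
  # Ask the (hypothetical) tool to write everything into a "trimmed" sub-directory
  - prefix: --outdir
    valueFrom: trimmed
inputs:
  read_1:
    type: File
    inputBinding:
      position: 1
  read_2:
    type: File
    inputBinding:
      position: 2
outputs:
  # The whole directory is the only output, so downstream steps
  # cannot refer to the read 1 and read 2 files individually
  trimmed_dir:
    type: Directory
    outputBinding:
      glob: trimmed
```

With an output declared this way, the `trimmed` directory appears in the final results, but a later step would receive a single `Directory` rather than separate `File` objects for each read.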
This seems like something that should be easy. Am I missing something obvious here? Any advice on how to do this would be much appreciated.