Is there any way to limit the number of job instances created when scattering a workflow? Right now I have a highly parallelizable workflow that fans out into more than 1200 jobs when run in parallel, which is far too many for a single server to handle at once, and I end up getting a bunch of MAX_THREAD errors (each job takes 12 threads, and the server maxes out at 4096 threads per user). Unless I'm mistaken, the reference cwltool doesn't have the ability to span jobs over a cluster or submit to a job queue, but does it have any ability to limit the number of concurrent jobs? Ideally I would like it to behave like a FIFO job pool, but any resource management would be useful.
You are correct: cwltool is just a reference runner and not production-hardened code. However, recent versions should wait to launch jobs based upon basic resource accounting.
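Conceptually, that accounting behaves like a FIFO pool guarded by a fixed core budget: a scattered job is only launched once enough cores are free. A minimal sketch of the idea (this is not cwltool's API; all names and numbers here are hypothetical):

```python
# FIFO-ish job pool limited by a core budget, sketching the accounting
# idea. Numbers are illustrative, not taken from any real machine.
import threading
from concurrent.futures import ThreadPoolExecutor

TOTAL_CORES = 48       # assumed machine budget
CORES_PER_JOB = 12     # what each tool instance requests (its coresMin)
MAX_CONCURRENT = TOTAL_CORES // CORES_PER_JOB  # 4 jobs at a time

slots = threading.Semaphore(MAX_CONCURRENT)

def run_job(job_id):
    slots.acquire()    # block until 12 cores' worth of budget is free
    try:
        return f"job-{job_id} done"  # stand-in for the real tool invocation
    finally:
        slots.release()

# All 20 scattered jobs are submitted at once, but the semaphore keeps
# at most MAX_CONCURRENT of them actually running at any moment.
with ThreadPoolExecutor(max_workers=20) as pool:
    results = list(pool.map(run_job, range(20)))
```

The point of the sketch is that the limit falls out of the declared per-job core count, which is why the ResourceRequirement hint below matters.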
If each instance of one of your tools uses 12 cores, make sure to declare that with a ResourceRequirement hint:
    hints:
      ResourceRequirement:
        coresMin: 12
See https://www.commonwl.org/v1.0/CommandLineTool.html#ResourceRequirement for the reference specification.
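For context, the hint sits at the top level of the tool description. A minimal, hypothetical CommandLineTool showing where it goes (the command name and RAM value are placeholders, not defaults):

    # hypothetical tool description; baseCommand and ramMin are placeholders
    cwlVersion: v1.0
    class: CommandLineTool
    baseCommand: my_tool
    hints:
      ResourceRequirement:
        coresMin: 12        # each instance needs 12 cores
        ramMin: 8192        # MiB; assumed value for illustration
    inputs: []
    outputs: []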
You may also want to consider one of the systems that support CWL listed at https://www.commonwl.org/#Implementations.
For example, toil-cwl-runner can submit jobs to a cluster scheduler, or, for a full workflow-platform experience, you may want Arvados.
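As a rough illustration of the two routes (the workflow and input file names are placeholders, and exact flag spellings should be checked against each tool's --help for your installed version):

```shell
# cwltool: run scatter steps in parallel, bounded by its resource accounting
cwltool --parallel workflow.cwl inputs.yml

# toil-cwl-runner: hand jobs to a cluster scheduler (Slurm assumed here)
# and cap the total cores consumed at once
toil-cwl-runner --batchSystem slurm --maxCores 4096 workflow.cwl inputs.yml
```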