Hi everyone, I am attempting to run the HGAP4 pipeline from PacBio's SMRT Tools (v10.2) for genome assembly, which uses the Cromwell workflow management system (via the pbcromwell wrapper). However, I am very unfamiliar with workflow managers as a whole and would like some help understanding the errors I am getting, so that I can discuss them with my HPC administrator and try to fix this.
The hgap4 command takes an XML file describing PacBio continuous long-read data: pbcromwell run pb_hgap4 -e "${XML_INPUT}"
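For reference, this is roughly the Slurm batch script I am using to launch it (paths and resource numbers are placeholders for my actual values, and I believe --nproc caps the threads per task, if I am reading the pbcromwell help correctly):

    #!/bin/bash
    #SBATCH --job-name=hgap4
    #SBATCH --cpus-per-task=16
    #SBATCH --mem=64G

    # Placeholder path to my subreads XML
    XML_INPUT=/path/to/movie.subreadset.xml

    # Launch the workflow; as far as I can tell, every task currently
    # runs inside this single allocation via Cromwell's Local backend.
    pbcromwell run pb_hgap4 -e "${XML_INPUT}" --nproc 16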
Based on the logs, the run appears to start out fine, calling the workflow as intended and printing the info message below, but it is immediately followed by these warnings:
[2024-09-02 03:34:11,54] [info] MaterializeWorkflowDescriptorActor [7643a414]: Call-to-Backend assignments: falcon.task__0_rawreads__tan_combine -> Local, falcon.task__2_asm_falcon -> Local, consensus.gather_gff -> Local, falcon.task__1_preads_ovl__build -> Local, falcon.task__0_rawreads__report -> Local, falcon.task__1_preads_ovl__daligner_scatter -> Local, coverage_reports.target_coverage -> Local, falcon.task__0_rawreads__tan_split -> Local, coverage_reports.gc_coverage_plot -> Local, falcon.task__0_rawreads__tan_apply -> Local, mapping.cleanup_chunked_dataset_files -> Local, consensus.split_alignments -> Local, falcon.task__1_preads_ovl__daligner_split -> Local, falcon.task__0_rawreads__daligner_las_merge -> Local, coverage_reports.plot_target_coverage -> Local, falcon.task__0_rawreads__tan_scatter -> Local, pb_hgap4.task_gen_config -> Local, pb_hgap4.fasta_to_reference -> Local, falcon.task__0_rawreads__build -> Local, mapping.mapping_stats -> Local, consensus.guess_optimal_max_nchunks -> Local, mapping.pbmm2_align -> Local, falcon.task__0_rawreads__daligner_apply -> Local, coverage_reports.plot_target_coverage -> Local, coverage_reports.pbreports_coverage -> Local, consensus.genomic_consensus -> Local, coverage_reports.target_coverage -> Local, pb_hgap4.update_subreads -> Local, coverage_reports.pbreports_coverage -> Local, pb_hgap4.task_get_dextas -> Local, coverage_reports.summarize_coverage -> Local, falcon.task__0_rawreads__daligner_split -> Local, falcon.task__1_preads_ovl__daligner_las_merge -> Local, mapping.gather_alignments -> Local, consensus.gather_vcf -> Local, falcon.task__1_preads_ovl__daligner_apply -> Local, coverage_reports.summarize_coverage -> Local, pb_hgap4.dataset_filter -> Local, get_input_sizes.get_ref_size -> Local, falcon.task__1_preads_ovl__db2falcon -> Local, mapping.split_reads -> Local, mapping.auto_consolidate_alignments -> Local, pb_hgap4.polished_assembly -> Local, falcon.task__0_rawreads__daligner_scatter -> Local, consensus.gather_fasta -> Local, get_input_sizes.get_bam_size -> Local, coverage_reports.gc_coverage_plot -> Local, falcon.task__0_rawreads__cns_apply -> Local, consensus.gather_fastq -> Local
[2024-09-02 03:34:11,67] [warn] Local [7643a414]: Key/s [cpu] is/are not supported by backend. Unsupported attributes will not be part of job executions.
[2024-09-02 03:34:11,67] [warn] Local [7643a414]: Key/s [cpu, memory] is/are not supported by backend. Unsupported attributes will not be part of job executions.
The workflow then transitions into a failed state before Cromwell exits:
[2024-09-02 03:34:26,22] [info] WorkflowManagerActor WorkflowActor-7643a414-da35-4a34-ad27-4ee0cea26b85 is in a terminal state: WorkflowFailedState
(The log has a lot more messages, but I've picked out the pertinent ones that I believe point to the error. If anyone would like to see the full log, I'd be happy to share it.)
In any case, I presume that the problem lies with the cpu/memory warnings above. What does it mean for the keys cpu and memory to be "not supported by backend"? I have searched on Google and found a couple of posts, but I don't quite understand what is being explained in them.
Examples:
https://github.com/broadinstitute/cromwell/issues/4413
https://hpc-discourse.usc.edu/t/how-to-configure-cromwell-backends-to-run-on-hpc/555/3
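If I am reading those threads correctly, the warning means that the backend Cromwell picked (Local) does not declare cpu or memory in the runtime-attributes section of its configuration, so Cromwell drops those requests rather than applying them. My rough paraphrase of what the posts add to cromwell.conf, untested on my side:

    backend {
      providers {
        Local {
          actor-factory = "cromwell.backend.impl.sfs.config.ConfigBackendLifecycleActorFactory"
          config {
            run-in-background = true
            # Declaring the attributes here is (apparently) what makes the
            # backend accept cpu/memory instead of warning and ignoring them.
            runtime-attributes = """
              Int cpu = 1
              Float memory_mb = 2048.0
            """
          }
        }
      }
    }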
Am I supposed to run Docker with this? And do I need an internet connection? The SMRT Tools installation on our HPC is offline.
I also found this in the Cromwell docs:
https://cromwell.readthedocs.io/en/stable/RuntimeAttributes/
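From that page (and the backend documentation it links to), it looks like a backend has to both declare the runtime attributes and interpolate them into its submit command. The Cromwell docs give a SLURM backend example roughly like this (condensed from the documentation; I have not tried it on our cluster yet):

    backend {
      default = SLURM
      providers {
        SLURM {
          actor-factory = "cromwell.backend.impl.sfs.config.ConfigBackendLifecycleActorFactory"
          config {
            # Attributes declared here become available to the submit command.
            runtime-attributes = """
              Int runtime_minutes = 600
              Int cpus = 2
              Int requested_memory_mb_per_core = 8000
              String queue = "short"
            """
            # Each task is submitted to Slurm with the requested resources.
            submit = """
              sbatch -J ${job_name} -D ${cwd} -o ${out} -e ${err} \
                -t ${runtime_minutes} -p ${queue} \
                ${"-c " + cpus} \
                --mem-per-cpu=${requested_memory_mb_per_core} \
                --wrap "/bin/bash ${script}"
            """
            kill = "scancel ${job_id}"
            check-alive = "squeue -j ${job_id}"
            job-id-regex = "Submitted batch job (\\d+).*"
          }
        }
      }
    }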
Do I need to specify these attributes separately in the script I submit? Our HPC uses Slurm to manage jobs, and I am calling this workflow through a bash script (the sbatch wrapper shown near the top of this post).
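Relatedly, the SMRT Tools documentation seems to say that pbcromwell can generate and consume such a config file itself, which might be the cleaner route. My (possibly wrong) reading of the reference guide is something like:

    # Generate a cromwell.conf with a Slurm backend (flag names are from my
    # reading of the SMRT Tools reference guide and may not be exact):
    pbcromwell configure --default-backend SLURM --output-file cromwell.conf

    # Then point the run at that config:
    pbcromwell run pb_hgap4 -e "${XML_INPUT}" --config cromwell.conf

Is this the intended way to get the cpu/memory attributes honored on a Slurm cluster?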
I understand that it is difficult to troubleshoot this based on the information available, but I would really appreciate it if anyone could suggest what might have gone wrong and possibly point me in the right direction.
Thank you very much.