Snakemake - pipeline shut down without error
6 months ago
bhumm ▴ 200

I have been running a relatively simple Snakemake pipeline that processes BAM files and aggregates a variety of metrics. While running, it progresses as expected and then randomly shuts down. Here is what stdout and the log report:

...
16 of 39 steps (41%) done
[Wed Oct  9 13:48:45 2024]
Finished job 7.
17 of 39 steps (44%) done
[Wed Oct  9 13:48:45 2024]
Finished job 34.
18 of 39 steps (46%) done
[Wed Oct  9 13:48:56 2024]
Finished job 55.
19 of 39 steps (49%) done
[Wed Oct  9 13:50:06 2024]
Finished job 25.
20 of 39 steps (51%) done
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2024-10-09T134300.644136.snakemake.log

Searching through the log turns up no additional information, errors, or tracebacks. If I rerun the command (with the --rerun-incomplete flag), it picks up at the exact job where it quit and completes successfully. I have adequate CPUs (70+) and RAM (500+ GB), so I can't imagine it's a resource issue. I have not been able to find any other information about this online.
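
For context, the rerun invocation is essentially the following (simplified; the actual Snakefile and target paths are omitted):

    snakemake --cores all --rerun-incomplete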

Any advice or ideas are appreciated!

snakemake
Shred

Did the dry run end correctly? Could you show us the rule that fails?
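
For reference, a dry run only builds and checks the DAG without executing any jobs:

    snakemake -n  # short for --dry-run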

bhumm ▴ 200

I will give a dry run a go and report back. That is the strange part - no specific rule fails, the pipeline just quits (usually around 45-55% complete). If I rerun the command, it picks back up and finishes as expected.

3 days ago
bhumm ▴ 200

Circling back to answer my own question. After testing with --dry-run (as @Shred suggested), the pipeline completed as expected. The rule that consistently failed was an aggregation step that opened and processed a substantial number of large parquet files. Despite the compute resources I had available and designating the use of all cores (--cores all), I was not actually allocating appropriate resources to the more computationally burdensome rules: without a threads directive, Snakemake assumes each job needs only a single core, so with --cores all it schedules as many jobs concurrently as possible, memory-hungry aggregation step included.

After updating the threads and mem_mb directives within the rule, my pipeline no longer crashes.
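
The updated rule looks roughly like this (a simplified sketch, not my actual pipeline; the rule name, file paths, SAMPLES list, script, and numbers are all placeholders):

    rule aggregate_metrics:
        input:
            expand("metrics/{sample}.parquet", sample=SAMPLES)
        output:
            "aggregated/metrics.parquet"
        threads: 16  # counted against the --cores budget, so fewer jobs run alongside this one
        resources:
            mem_mb=64000  # enforced when a memory budget is given on the command line
        shell:
            "python aggregate_metrics.py --threads {threads} --out {output} {input}"  # placeholder command

With threads declared, the scheduler reserves those cores out of --cores all, which limits how many instances of the heavy rule (and other jobs) run concurrently. As far as I can tell, mem_mb is only enforced for local execution if you also pass a global budget, e.g. snakemake --cores all --resources mem_mb=500000.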

It is still rather unsatisfying that I could not capture any stdout confirming the crash was caused by a lack of resources, but I suspect that was the issue.

TL;DR: when running a computationally intensive or memory-heavy step, explicitly allocate memory and/or threads to the relevant rule.
