
Resumability & Distributed Computing

Shape supports checkpointing VM execution state and resuming from that state.

from std::core::snapshot use { Snapshot }
match snapshot() {
    Snapshot::Hash(id) => {
        print("snapshot saved: " + id)
        exit(0)
    }
    Snapshot::Resumed => {
        print("resumed from snapshot")
    }
}

Semantics:

  • first pass: snapshot() returns Snapshot::Hash(id)
  • resumed pass: same snapshot() site returns Snapshot::Resumed
  • code after snapshot() is expected to run again on resume

Full resume takes only the snapshot hash:

shape --resume <snapshot-hash>

This restores the saved VM snapshot and the saved bytecode directly.

Use this for exact continuation with no source changes.

Recompile-and-resume takes the snapshot hash plus the script path:

shape --resume <snapshot-hash> path/to/script.shape

This restores the runtime context, recompiles the current source, remaps the saved snapshot() position by ordinal, and then resumes on the new bytecode.

Recompile-and-resume currently requires the checkpointed snapshot() to be at a top-level continuation boundary.

If the saved snapshot has a non-empty call stack, recompile-and-resume fails with an error.

Practical interpretation:

  • snapshot() inside deep function frames is supported for full resume
  • recompile-and-resume is currently for top-level snapshot boundaries
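
The rule above can be sketched as a small validation step. This is a Python model for illustration only, not Shape's implementation: `SavedSnapshot`, `check_resume`, and the error text are hypothetical names and wording chosen for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class SavedSnapshot:
    """Hypothetical stand-in for a persisted snapshot; not Shape's real format."""
    snapshot_id: str
    call_stack: list = field(default_factory=list)  # frames active at snapshot()


def check_resume(snapshot: SavedSnapshot, recompile: bool) -> str:
    """Return the resume mode, or raise if the snapshot cannot be resumed that way."""
    if not recompile:
        # Full resume reuses the saved bytecode, so any call-stack depth is fine.
        return "full"
    if snapshot.call_stack:
        # Recompile-and-resume needs a top-level continuation boundary.
        raise ValueError(
            f"snapshot {snapshot.snapshot_id} has {len(snapshot.call_stack)} "
            "active frame(s); recompile-and-resume requires an empty call stack"
        )
    return "recompile"
```

In this model a snapshot taken inside a function can still be resumed with `recompile=False`, which mirrors the distinction between the two modes.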

Safe Edit Guidance for Recompile-and-Resume


When using shape --resume <hash> script.shape:

  1. keep the resumed snapshot boundary stable (same logical snapshot() point)
  2. keep the ordinals of snapshot() call sites up to that point stable (do not add or remove earlier snapshot() calls)
  3. prefer edits after the resume boundary
  4. avoid changing pre-boundary control flow unless you are ready to re-check mapping

The current remap mechanism is ordinal-based over snapshot() call sites. Each snapshot() call site is assigned a compile-time ordinal — 0, 1, 2, and so on — in source order. When you recompile and resume, the runtime maps saved snapshot sites to new bytecode positions by matching ordinals rather than absolute bytecode offsets. As long as the ordinal of your checkpoint doesn’t change (no new snapshot() calls inserted before it), the resume works even if the bytecode around it has shifted.
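
A minimal Python sketch of ordinal-based remapping, under stated assumptions: the bytecode offsets (12, 40, 57, 30) are made-up numbers, and `assign_ordinals` and `remap` are hypothetical helpers, not part of Shape. It shows why shifted offsets are harmless while an inserted snapshot() call is not.

```python
def assign_ordinals(call_site_offsets):
    """Map ordinal -> bytecode offset for snapshot() call sites in source order."""
    return {ordinal: offset for ordinal, offset in enumerate(call_site_offsets)}


def remap(saved_ordinal, new_sites):
    """Find where the saved checkpoint lives in the recompiled bytecode."""
    if saved_ordinal not in new_sites:
        raise LookupError(f"no snapshot() site with ordinal {saved_ordinal}")
    return new_sites[saved_ordinal]


# Original compile: two snapshot() sites at hypothetical offsets 12 and 40.
old_sites = assign_ordinals([12, 40])
saved_ordinal = 1  # the checkpoint was taken at the second site

# An edit moved the second site to offset 57, but the site order is unchanged,
# so ordinal 1 still identifies the same logical checkpoint.
new_sites = assign_ordinals([12, 57])
resume_offset = remap(saved_ordinal, new_sites)

# Inserting a new snapshot() before the checkpoint shifts ordinals: ordinal 1
# now resolves to the inserted site (offset 30), not the original checkpoint.
shifted_sites = assign_ordinals([12, 30, 57])
wrong_offset = remap(saved_ordinal, shifted_sites)
```

The second case does not raise in this model; it silently resolves to a different site, which is why the guidance above asks you to re-check the mapping after changing pre-boundary code.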

Function-level checkpointing is supported in normal/full resume flow.

Example pattern:

from std::core::snapshot use { Snapshot }
fn checkpointed(x) {
    let s = snapshot()
    match s {
        Snapshot::Hash(id) => id,
        Snapshot::Resumed => x + 1
    }
}

This is valid and resumes correctly in full resume mode.

Await annotations can orchestrate resumable/distributed workflows:

  • before can trigger snapshot-aware handoff logic
  • after can consume resumed results
  • { args, state } contract carries state across hook boundaries

This is the base for remote-dispatch patterns described in the cookbook.
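
The hook contract can be illustrated with a small Python model. Everything here is an assumption made for the sketch: `run_with_hooks`, the hook signatures, and the `attempt` field are hypothetical and do not reflect Shape's annotation syntax. The only idea taken from the text is that a before hook returns an `{ args, state }` pair and the state reaches the after hook unchanged.

```python
def run_with_hooks(before, after, target, args):
    """Call target between a before hook and an after hook.

    before returns {"args": ..., "state": ...}; the state is handed to after
    untouched, so it survives the hook boundary.
    """
    prepared = before(args)
    result = target(*prepared["args"])
    return after(args, result, prepared["state"])


def before(args):
    # A real hook might take a snapshot here and hand its hash to a worker.
    return {"args": args, "state": {"attempt": 1}}


def after(args, result, state):
    # The after hook sees both the awaited result and the carried state.
    return {"result": result, "attempt": state["attempt"]}


outcome = run_with_hooks(before, after, lambda a, b: a + b, (2, 3))
```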

Snapshot/resume uses the VM bytecode executor path.

Operationally, resume flows are currently VM-based even if a script was invoked with JIT mode elsewhere. The reason is architectural: JIT-compiled code is translated to native machine code using absolute offsets derived from the original program’s bytecode layout. After a recompile, those offsets are stale — re-entering JIT code at the wrong offset would corrupt execution. The content-addressed VM interpreter, by contrast, executes function blobs directly by opcode and resolves operands relative to each blob’s own instruction array, so it can safely resume at the remapped ordinal position without risk of stale machine code. Once resumed and running normally, the JIT may re-compile hot paths from the new bytecode.

Typical use cases:

  • pause/resume long-running ingestion pipelines
  • explicit workflow checkpoints before risky transformations
  • distributed continuation (snapshot hash handed to another worker)
  • failure recovery with deterministic restart boundaries

High-level snapshot state includes:

  • VM instruction pointer and stack
  • locals/call-stack state
  • module bindings and runtime context linkage
  • references to persisted snapshot artifacts (VM snapshot + bytecode)

Operational notes:

  • the snapshot store is enabled automatically for script execution
  • on Ctrl+C, the runtime attempts to persist a snapshot and prints the resume command
  • use lockfile-pinned artifacts for reproducible external data dependencies
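
As a rough illustration of the snapshot state and the interrupt path, here is a Python model. The field names, the JSON encoding, and the use of SHA-256 to derive the id are all assumptions for this sketch; the source only says that a snapshot is identified by a hash and that the resume command has the form `shape --resume <snapshot-hash>`.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field


@dataclass
class SnapshotState:
    """Hypothetical model of the high-level snapshot state."""
    instruction_pointer: int
    stack: list
    call_stack: list
    module_bindings: dict
    artifacts: dict = field(default_factory=dict)  # references to persisted files


def persist(state, store):
    """Serialize the state and file it under a content hash (assumed SHA-256)."""
    payload = json.dumps(asdict(state), sort_keys=True).encode()
    snapshot_id = hashlib.sha256(payload).hexdigest()
    store[snapshot_id] = payload
    return snapshot_id


def on_interrupt(state, store):
    """Try to persist a snapshot; return the resume command, or None on failure."""
    try:
        snapshot_id = persist(state, store)
    except (TypeError, ValueError):
        return None  # persisting is best effort in this model
    return f"shape --resume {snapshot_id}"
```

Because the id in this model is derived from the payload, persisting the same state twice yields the same id, so a hash handed to another worker refers to exactly one saved state.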