Standard Library: State

Shape provides a content-addressed state module for capturing, serializing, hashing, diffing, and resuming VM execution state. Every value and function in Shape is content-addressed via SHA-256, and this module exposes those primitives to user code.

use std::core::state

All state functions live under the state namespace. Nothing is injected into global scope.

A content-addressed reference to a compiled function. The hash uniquely identifies the function’s bytecode, constants, and dependencies.

pub type FunctionRef {
hash: string, // SHA-256 content hash
name: string, // Human-readable name
param_types: Vec<string>, // Content hashes of parameter types
return_type: string, // Content hash of return type
}

A single stack frame, portable and content-addressed. Contains the function reference (by hash), the instruction pointer relative to that function, and captured local state. Generic over the value type T.

pub type Frame<T> {
function: FunctionRef,
local_ip: int, // Position within the function's blob
locals: Vec<T>,
upvalues: Vec<T>?, // Captured values for closures
}

Full execution state — a chain of frames plus module-level bindings. Can be serialized, transferred, and resumed on any node that has the referenced function blobs.

pub type VmState<T> {
frames: Vec<Frame<T>>,
module_bindings: HashMap<string, T>,
timestamp: string,
}
  • frames — the full call stack, from bottom (main) to top (current function).
  • module_bindings — all module-level variable names and their current values. Restored by state::resume() so module state transfers alongside the call stack.
  • timestamp — when the capture was taken.

Lightweight capture of just the current function’s frame.

pub type FrameState<T> {
function: FunctionRef,
args: Vec<T>,
locals: Vec<T>,
upvalues: Vec<T>?,
}

Module-level state capture including bindings and type schemas.

pub type ModuleState<T> {
bindings: HashMap<string, T>,
schemas: HashMap<string, string>, // Type name → content hash
}

A ready-to-call payload that bundles a function reference with arguments for remote invocation. Does not execute — just packages for transfer.

pub type CallPayload<T> {
function: FunctionRef,
args: Vec<T>,
upvalues: Vec<T>?,
}

The difference between two values or states. Contains only what changed, enabling efficient state synchronization.

pub type Delta<T> {
changed: HashMap<string, T>, // Fields/elements that changed
removed: Vec<string>, // Keys that were removed
}

Capture the current function’s frame state.

let frame = state::capture()
// frame.function.hash → content hash of this function
// frame.args → arguments passed to this invocation
// frame.locals → current local variable values

Capture the full VM execution state — all frames in the call stack plus module-level bindings.

let vm = state::capture_all()
// vm.frames → array of Frame (top of stack = last element)
// vm.module_bindings → module-level variable names and values
// vm.timestamp → when this capture was taken

Capture module-level bindings and type schemas.

let mod_state = state::capture_module()
// mod_state.bindings → Map of module-level variable name → value
// mod_state.schemas → Map of type name → content hash

Build a ready-to-call payload without executing the function. This is the primary mechanism for remote invocation — package a function + arguments for transfer.

use std::core::transport
fn train(data: Array<Sample>, epochs: int) -> Model { ... }
let tcp = transport::tcp()
let payload = state::capture_call(train, [my_data, 100])
// payload.function.hash → content hash of train
// payload.args → [my_data, 100]
// Send to a remote node for execution:
let bytes = state::serialize(payload)
transport::send(tcp, remote_node, bytes)?

Resume full VM state. Does not return — execution continues from the captured point as if it never stopped. Both components of the VmState are restored:

  1. Frames — the call stack is rebuilt. Each frame’s function is resolved by blob hash (preferred) or name, its locals and upvalues are restored, and the instruction pointer is set to the captured local_ip relative to the function’s entry point.
  2. Module bindings — module-level variables are matched by name and restored to their captured values.
let vm = state::capture_all()
// ... transfer vm to another node ...
state::resume(vm) // → never returns, execution continues from capture point

This is the core of state migration: capture on one node, resume on another.
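The "blob hash (preferred) or name" lookup in step 1 can be illustrated outside Shape. The following Python sketch is not the Shape runtime; `FunctionRegistry`, `register`, and `resolve` are hypothetical names introduced here only to show the resolution order a receiving node might follow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FunctionRef:
    hash: str  # content hash of the function blob
    name: str  # human-readable name


class FunctionRegistry:
    """Hypothetical per-node table of locally available functions."""

    def __init__(self):
        self._by_hash = {}
        self._by_name = {}

    def register(self, ref, fn):
        self._by_hash[ref.hash] = fn
        self._by_name[ref.name] = fn

    def resolve(self, ref):
        # Prefer the content hash: it pins the exact implementation.
        fn = self._by_hash.get(ref.hash)
        if fn is None:
            # Fall back to the name when the exact blob is not present.
            fn = self._by_name.get(ref.name)
        if fn is None:
            raise LookupError(f"no function for {ref.name} ({ref.hash})")
        return fn
```

A name-based fallback may resolve to a different implementation than the one that was captured, which is presumably why the hash is tried first.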

Re-enter a single captured function frame and return its result. Unlike resume, this does return — it re-executes the captured function from its captured state.

The resume is applied before the function body begins executing, so the function re-enters at the captured instruction pointer with the captured locals already in place, rather than starting from the beginning.

let frame = state::capture()
// ... later ...
let result = state::resume_frame(frame)

Compute the SHA-256 hash of any value. Hashing is structural:

  • Primitives (numbers, strings, bools): hash the raw value
  • Objects: hash the type schema hash + recursive field hashes
  • Arrays: hash each element, then hash the sequence
  • None/Unit: hash a sentinel
let h1 = state::hash(42) // SHA-256 of the number 42
let h2 = state::hash("hello") // SHA-256 of "hello"
let h3 = state::hash(my_trade) // SHA-256 of type hash + field hashes
// Same value → same hash, always
assert(state::hash(42) == state::hash(42))
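To make the structural rules above concrete, here is a minimal Python sketch of the same idea. It is not Shape's implementation: the type tags, the `"none"` sentinel, and the way the schema hash is derived are assumptions chosen for illustration, so its digests will not match those produced by `state::hash`.

```python
import hashlib
import struct
from dataclasses import dataclass, fields, is_dataclass


def structural_hash(value) -> str:
    """Return a hex SHA-256 digest computed structurally over `value`."""
    h = hashlib.sha256()
    if value is None:
        h.update(b"none")  # sentinel for None/Unit (assumed encoding)
    elif isinstance(value, bool):  # test bool before int: bool subclasses int
        h.update(b"bool" + (b"\x01" if value else b"\x00"))
    elif isinstance(value, int):
        h.update(b"int" + str(value).encode())
    elif isinstance(value, float):
        h.update(b"num" + struct.pack(">d", value))
    elif isinstance(value, str):
        h.update(b"str" + value.encode("utf-8"))
    elif isinstance(value, (list, tuple)):
        h.update(b"arr")
        for item in value:  # hash each element, then hash the sequence
            h.update(bytes.fromhex(structural_hash(item)))
    elif is_dataclass(value) and not isinstance(value, type):
        # Stand-in for the type schema hash: type name plus sorted field names.
        names = sorted(f.name for f in fields(value))
        schema = type(value).__name__ + ":" + ",".join(names)
        h.update(b"obj" + hashlib.sha256(schema.encode()).digest())
        for name in names:  # recursive field hashes, in a fixed order
            h.update(bytes.fromhex(structural_hash(getattr(value, name))))
    else:
        raise TypeError(f"unsupported type: {type(value).__name__}")
    return h.hexdigest()


@dataclass
class Trade:
    symbol: str
    price: float
    volume: int
```

Because every branch feeds only the value's content into the digest, two equal values always hash identically, while the type tags keep `42` and `"42"` apart.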

Get a function’s content hash — the SHA-256 of its FunctionBlob (bytecode, constants, strings, and dependencies).

fn add(a: int, b: int) -> int { a + b }
let h = state::fn_hash(add)
// h is the content hash of add's compiled bytecode

Two functions with the same implementation produce the same hash, regardless of where or when they were compiled.
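The property that identical implementations share a hash follows from hashing only the blob's content. The Python sketch below illustrates this under stated assumptions: the `FunctionBlob` layout and the length-prefixed encoding are hypothetical and are not Shape's actual blob format.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class FunctionBlob:
    """Hypothetical blob layout; every group is a tuple of strings."""
    bytecode: bytes
    constants: tuple
    strings: tuple
    dependencies: tuple  # content hashes of the functions this one calls


def _put(h, data: bytes) -> None:
    # Length-prefix each field so adjacent fields cannot run together.
    h.update(len(data).to_bytes(8, "big"))
    h.update(data)


def blob_hash(blob: FunctionBlob) -> str:
    h = hashlib.sha256()
    _put(h, blob.bytecode)
    for group in (blob.constants, blob.strings, blob.dependencies):
        h.update(len(group).to_bytes(8, "big"))
        for item in group:
            _put(h, item.encode("utf-8"))
    return h.hexdigest()
```

The function name, compile time, and host do not enter the digest in this sketch, so they cannot affect it. Changing a dependency hash does change the result, which is how a change in a callee would propagate to its callers.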

Get the content hash of a type’s schema definition.

type Trade {
symbol: string,
price: number,
volume: int,
}
let h = state::schema_hash("Trade")
// h is SHA-256 of ("Trade" + sorted field definitions)
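Sorting the field definitions before hashing makes the result independent of declaration order. A rough Python illustration follows; the `name:type` line encoding is an assumption, so the digests will differ from Shape's.

```python
import hashlib


def schema_hash(name: str, fields: dict) -> str:
    """Hash a type name together with its field definitions in sorted order."""
    parts = [name] + [f"{field}:{ty}" for field, ty in sorted(fields.items())]
    return hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()


trade_a = schema_hash("Trade", {"symbol": "string", "price": "number", "volume": "int"})
trade_b = schema_hash("Trade", {"volume": "int", "symbol": "string", "price": "number"})
```

Here `trade_a` and `trade_b` are equal because only the sorted definitions reach the hash, while changing any field's type or the type name yields a different digest.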

Serialize any Shape value to wire format (MessagePack), returned as a byte array.

let bytes = state::serialize(my_value)
// bytes is Array<int> — MessagePack-encoded representation

Deserialize wire format bytes back to a Shape value.

let bytes = state::serialize(my_value)
let restored = state::deserialize(bytes)
// restored is structurally equal to my_value

Combined with hashing, this enables content-addressed storage:

let key = state::hash(my_value)
cache.put(key, state::serialize(my_value))
// On any node, at any time:
let value = state::deserialize(cache.get(key))
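A content-addressed store can also verify what it returns, because the key is derived from the content. The Python sketch below shows that idea with a hypothetical in-memory `ContentStore`. It uses canonical JSON as a stand-in for the wire format and hashes the serialized bytes, whereas `state::hash` hashes the value structurally.

```python
import hashlib
import json


class ContentStore:
    """Hypothetical in-memory store keyed by the SHA-256 of the stored bytes."""

    def __init__(self):
        self._blobs = {}

    def put(self, value) -> str:
        # Canonical encoding: sorted keys and fixed separators, so that
        # equal values always produce identical bytes.
        data = json.dumps(value, sort_keys=True, separators=(",", ":")).encode("utf-8")
        key = hashlib.sha256(data).hexdigest()
        self._blobs[key] = data
        return key

    def get(self, key: str):
        data = self._blobs[key]
        # Re-hash on read: a mismatch means the stored bytes were corrupted.
        if hashlib.sha256(data).hexdigest() != key:
            raise ValueError(f"content does not match key {key}")
        return json.loads(data)
```

Storing the same value twice yields the same key and a single entry, so deduplication comes for free.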

Efficient Binary Serialization for Collections

Collection types use optimized binary serialization instead of per-element MessagePack encoding:

Type | Serialization | Benefit
Vec<number> | Raw f64 bytes via content-addressed blob | ~2x smaller than per-element MessagePack
Vec<int> | Raw i64 bytes | Same benefit as number arrays
All typed arrays | Raw bytes matching element size | Zero per-element overhead
Mat<number> | Raw f64 bytes + row/col dimensions | Now serializable (was previously unsupported)
HashMap<K, V> | Parallel key/value arrays | Preserves HashMap type identity on round-trip

For example, a 1M-element Vec<number> serializes to 8 MB of raw bytes (content-addressed and chunked) instead of ~16 MB of individual MessagePack Number values.
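The 8 MB figure is simply 8 bytes per f64 element. The Python sketch below packs a float array as raw bytes to show the arithmetic. The 8-byte length header and the native byte order are assumptions of this sketch; Shape's chunking and content addressing are not modeled.

```python
from array import array


def pack_f64_array(values) -> bytes:
    """Length header (u64, little-endian) followed by raw 8-byte floats."""
    # array("d") stores C doubles in the machine's native byte order.
    return len(values).to_bytes(8, "little") + array("d", values).tobytes()


def unpack_f64_array(data: bytes) -> list:
    n = int.from_bytes(data[:8], "little")
    out = array("d")
    out.frombytes(data[8:8 + 8 * n])
    return out.tolist()


values = [float(i) for i in range(1_000_000)]
raw = pack_f64_array(values)
payload_bytes = len(raw) - 8  # 1,000,000 elements * 8 bytes = 8,000,000
```

No per-element type tag is written, so the payload size depends only on the element count and the element width.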

Older snapshots using element-by-element formats remain deserializable — backward compatibility is preserved. See Wire Protocol & Optimization for full details.

Compute the delta between two values using content-hash tree comparison.

For typed objects, this compares per-field hashes and only includes changed fields in the delta. For arrays, it compares per-element. For primitives, it compares the whole value.

type Portfolio {
name: string,
cash: number,
risk_score: number,
}
let before = Portfolio { name: "Main", cash: 100000.0, risk_score: 0.3 }
let after = Portfolio { name: "Main", cash: 95000.0, risk_score: 0.35 }
let delta = state::diff(before, after)
// delta.changed → { "cash": 95000.0, "risk_score": 0.35 }
// delta.removed → []
// "name" is NOT in the delta — it didn't change

Apply a delta to a base value, producing the updated value.

let reconstructed = state::patch(before, delta)
// reconstructed is structurally equal to after
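The relationship between diff and patch can be sketched outside Shape. The Python version below works on plain dictionaries and compares per-field hashes. It is an illustration only: the JSON-based field hash is an assumption, and nested values are treated as opaque rather than diffed recursively.

```python
import hashlib
import json


def _field_hash(value) -> str:
    # Stand-in for a per-field content hash.
    return hashlib.sha256(json.dumps(value, sort_keys=True).encode("utf-8")).hexdigest()


def diff(old: dict, new: dict) -> dict:
    """Return only the fields that changed or were added, plus removed keys."""
    changed = {
        key: value
        for key, value in new.items()
        if key not in old or _field_hash(old[key]) != _field_hash(value)
    }
    removed = [key for key in old if key not in new]
    return {"changed": changed, "removed": removed}


def patch(base: dict, delta: dict) -> dict:
    """Apply a delta to `base` without mutating it."""
    result = {k: v for k, v in base.items() if k not in delta["removed"]}
    result.update(delta["changed"])
    return result


before = {"name": "Main", "cash": 100000.0, "risk_score": 0.3}
after = {"name": "Main", "cash": 95000.0, "risk_score": 0.35}
delta = diff(before, after)
reconstructed = patch(before, delta)
```

The invariant worth checking in any implementation is the round trip: `patch(old, diff(old, new))` should equal `new`.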

This pair enables efficient state synchronization: capture before/after, diff, transfer only the delta, and patch on the receiving end.

use std::core::transport
let tcp = transport::tcp()
// Sender:
let delta = state::diff(old_state, new_state)
transport::send(tcp, peer, state::serialize(delta))?
// Receiver:
let delta = state::deserialize(received_bytes)
let updated = state::patch(my_state, delta)

Get a reference to the function that called the current function (one frame up in the call stack). Returns None at the top level.

fn inner() {
let c = state::caller()
match c {
Some(ref) => print(f"called by {ref.name} ({ref.hash})"),
None => print("top-level call"),
}
}

Get the current function’s arguments as an array.

fn my_func(x: int, y: string) {
let a = state::args()
// a == [x, y] as Array<Any>
}

Get the current scope’s local variables as a map.

fn compute(x: int) {
let y = x * 2
let z = y + 1
let l = state::locals()
// l == { "x": x, "y": y, "z": z }
}

Using std::core::state and std::core::transport together to dispatch a function call to a remote node:

use std::core::state
use std::core::transport
let tcp = transport::tcp()
fn remote_call(destination: string, f: Any, arguments: Array<Any>) -> Any {
// Build a ready-to-call payload
let payload = state::capture_call(f, arguments)
// Serialize and send
let request = state::serialize(payload)
let response = transport::send(tcp, destination, request)?
// Deserialize the result
state::deserialize(response)
}
// Usage:
fn expensive_compute(data: Array<number>) -> number {
data.fold(0.0, |acc, x| acc + x * x)
}
let result = remote_call("10.0.0.5:9000", expensive_compute, [my_data])

Using diffing to synchronize state between peers:

use std::core::state
use std::core::transport
let tcp = transport::tcp()
fn sync_after(peers: Array<string>, f: Any, arguments: Array<Any>) -> Any {
let before = state::capture_module()
let result = f(...arguments)
let after = state::capture_module()
let delta = state::diff(before, after)
if delta.changed.len() > 0 {
let bytes = state::serialize(delta)
for peer in peers {
transport::send(tcp, peer, bytes)?
}
}
result
}

End-to-End Example: Content-Addressed Cache

Using hashing for a global, permanent function cache:

use std::core::state
fn cached_call(store: Any, f: Any, arguments: Array<Any>) -> Any {
// Key = hash(function identity + argument values)
let key = state::hash([state::fn_hash(f), ...arguments])
match store.get(key) {
Some(bytes) => state::deserialize(bytes),
None => {
let result = f(...arguments)
store.put(key, state::serialize(result))
result
}
}
}

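The same function with the same arguments yields the same cache key on any node, in any program, at any time. Reusing a cached result is only sound when the function is deterministic and free of side effects.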
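The same caching pattern can be tried outside Shape. In the Python sketch below, `fn_identity` is a hypothetical stand-in for `state::fn_hash`: it hashes the function's compiled code, constants, and referenced names. That identity is stable only within one Python version and only for simple functions, unlike the portable content hash described above. Arguments and results are assumed to be JSON-serializable.

```python
import hashlib
import json


def fn_identity(f) -> str:
    """Rough stand-in for a function content hash (not portable across versions)."""
    code = f.__code__
    h = hashlib.sha256()
    h.update(code.co_code)
    h.update(repr(code.co_consts).encode("utf-8"))
    h.update(repr(code.co_names).encode("utf-8"))
    return h.hexdigest()


def cached_call(store: dict, f, *args):
    # Key = hash(function identity + argument values)
    key_material = json.dumps([fn_identity(f), list(args)], sort_keys=True)
    key = hashlib.sha256(key_material.encode("utf-8")).hexdigest()
    if key in store:
        return json.loads(store[key])
    result = f(*args)
    store[key] = json.dumps(result)
    return result


stats = {"runs": 0}  # counts real executions, for demonstration only


def sum_squares(data):
    stats["runs"] += 1
    total = 0.0
    for x in data:
        total += x * x
    return total


store = {}
first = cached_call(store, sum_squares, [1.0, 2.0, 3.0])
second = cached_call(store, sum_squares, [1.0, 2.0, 3.0])
```

The second call is served from the store, so `sum_squares` runs only once. The counter is itself a side effect, which shows why cached functions should otherwise be pure: the side effect is silently skipped on a cache hit.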

Function | Signature | Description
state::capture() | () -> FrameState | Capture current function frame
state::capture_all() | () -> VmState | Capture full VM state (all frames)
state::capture_module() | () -> ModuleState | Capture module bindings and schemas
state::capture_call(f, args) | (Any, Array<Any>) -> CallPayload | Build a ready-to-call payload
state::resume(vm) | (VmState) -> never | Resume full VM state
state::resume_frame(f) | (FrameState) -> Any | Re-enter a captured function frame
state::hash(value) | (Any) -> string | SHA-256 hash of any value
state::fn_hash(f) | (Any) -> string | Content hash of a function’s blob
state::schema_hash(name) | (string) -> string | Content hash of a type schema
state::serialize(value) | (Any) -> Array<int> | Serialize to MessagePack bytes
state::deserialize(bytes) | (Array<int>) -> Any | Deserialize from MessagePack bytes
state::diff(old, new) | (Any, Any) -> Delta | Compute delta between two values
state::patch(base, delta) | (Any, Delta) -> Any | Apply delta to a base value
state::caller() | () -> FunctionRef? | Reference to calling function
state::args() | () -> Array<Any> | Current function’s arguments
state::locals() | () -> HashMap<string, Any> | Current scope’s local variables