Developer Tools

Shape’s content-addressed bytecode architecture enables a suite of advanced developer tools that would be difficult or impossible with traditional mutable-offset compilation. This chapter covers hot-reload, time-travel debugging, speculative prefetch, structural code search, execution proofs, and distributed garbage collection.

Hot-Reload

HotReloader tracks which blobs are active and manages live code updates without restarting the VM.

State

struct HotReloader {
    /// Maps function name to its current content hash.
    current_mappings: HashMap<String, FunctionHash>,

    /// All known blobs, keyed by content hash.
    /// Old versions are kept until GC determines they are unreferenced.
    blob_store: HashMap<FunctionHash, FunctionBlob>,

    /// Blobs currently referenced by live call frames.
    active_references: HashSet<FunctionHash>,

    /// Timestamped history of every reload event.
    update_history: Vec<ReloadEvent>,

    /// Source paths registered for change-detection by an external watcher.
    watch_paths: Vec<PathBuf>,
}

current_mappings is the canonical name-to-hash lookup. When a function is called, the VM resolves the name through this table to find the current blob.
blob_store retains every blob the runtime has seen. Old versions stay alive as long as at least one call frame references them.
active_references is maintained by the VM as frames are pushed and popped.
update_history records every patch as a ReloadEvent so the surrounding tooling can display reload history. The reloader does not perform automatic rollback — reverting to a previous version is a matter of re-applying patches that point each name back at the old hash.
watch_paths is appended to by watch_path(path) and consumed by the surrounding file-watcher driver; the reloader itself only stores the paths.

Update Flow

1. A file watcher detects source changes on disk.
2. The compiler performs an incremental recompile. Only affected functions produce new blobs (with new content hashes). Unchanged functions keep their existing hashes.
3. apply_patches(patches) updates the current_mappings table so each patched function name now points to its new hash. The caller passes the recompiled function patches in as a Vec<FunctionPatch>.
4. Old blobs remain valid for in-flight call frames. Any function that was already on the call stack continues executing its original blob.
5. New calls resolve to updated blobs through the refreshed mappings.
6. Garbage collection: old blobs are removed when no active frames reference them.

ReloadResult

After a patch is applied, apply_patches returns a ReloadResult:

struct ReloadResult {
    /// Names of functions that received new blobs.
    functions_updated: Vec<String>,

    /// Names of functions whose source did not change.
    functions_unchanged: Vec<String>,

    /// Number of old blobs retained because live frames still reference them.
    old_blobs_retained: usize,
}

The key insight is that both old and new versions of a function coexist naturally. Content-addressing means the old blob has a different hash from the new one, so there is no risk of instruction-pointer corruption and no downtime during the update. A running function completes against its original bytecode while all new calls use the updated version.
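
The update flow can be sketched in a few lines of Rust. This is a minimal illustration, not the runtime's implementation: FunctionHash is reduced to a u64, FunctionBlob and FunctionPatch use hypothetical fields (function_name, new_blob, bytecode), a patch carrying the already-current hash is treated as "unchanged", and collect_garbage is an assumed helper name for the final cleanup step.

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical simplified types; the real definitions are not shown here.
type FunctionHash = u64;

struct FunctionBlob {
    hash: FunctionHash,
    bytecode: Vec<u8>,
}

struct FunctionPatch {
    function_name: String,
    new_blob: FunctionBlob,
}

#[derive(Debug, Default)]
struct ReloadResult {
    functions_updated: Vec<String>,
    functions_unchanged: Vec<String>,
    old_blobs_retained: usize,
}

#[derive(Default)]
struct HotReloader {
    current_mappings: HashMap<String, FunctionHash>,
    blob_store: HashMap<FunctionHash, FunctionBlob>,
    active_references: HashSet<FunctionHash>,
}

impl HotReloader {
    fn apply_patches(&mut self, patches: Vec<FunctionPatch>) -> ReloadResult {
        let mut result = ReloadResult::default();
        for patch in patches {
            let new_hash = patch.new_blob.hash;
            let old_hash = self.current_mappings.get(&patch.function_name).copied();
            if old_hash == Some(new_hash) {
                // Same content hash: the function body did not change.
                result.functions_unchanged.push(patch.function_name);
                continue;
            }
            // Count the superseded blob if a live frame still executes it.
            if let Some(old) = old_hash {
                if self.active_references.contains(&old) {
                    result.old_blobs_retained += 1;
                }
            }
            // Store the new blob beside the old one and repoint the name.
            self.blob_store.insert(new_hash, patch.new_blob);
            self.current_mappings.insert(patch.function_name.clone(), new_hash);
            result.functions_updated.push(patch.function_name);
        }
        result
    }

    /// Drop blobs that are neither current for any name nor held by a frame.
    /// Returns the number of blobs removed.
    fn collect_garbage(&mut self) -> usize {
        let current: HashSet<FunctionHash> = self.current_mappings.values().copied().collect();
        let active = &self.active_references;
        let before = self.blob_store.len();
        self.blob_store.retain(|h, _| current.contains(h) || active.contains(h));
        before - self.blob_store.len()
    }
}
```

Because the old and new blobs live under different keys in blob_store, repointing a name never touches bytecode that a running frame is still reading.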

Shape Usage

Enable hot-reload from Shape code via std::debug:

from std::debug use { HotReloader }

// Create a reloader and register one or more paths to watch.
let reloader = HotReloader::new()
reloader.watch_path(".")

// Apply patches produced by an incremental recompile. The caller is
// responsible for compiling source changes into a `Vec<FunctionPatch>`
// (typically driven by a file watcher).
let result = reloader.apply_patches(patches)
print(f"Updated: {result.functions_updated.len()}, unchanged: {result.functions_unchanged.len()}")
print(f"Old blobs retained for in-flight calls: {result.old_blobs_retained}")

// Reverting to an earlier version is done by re-applying patches whose
// `new_blob` carries the previously-active content hash. The reload
// history (`reloader.history()`) records every applied patch so tooling
// can construct the revert patch set.

Time-Travel Debugging

TimeTravel captures VM snapshots at configurable intervals during execution, allowing developers to step backward and forward through program state.

Capture Modes

enum CaptureMode {
    /// Capture a snapshot at every function entry and exit.
    FunctionBoundaries,

    /// Capture a snapshot every N instructions.
    EveryNInstructions(u64),

    /// Capture at specific instruction pointers.
    Breakpoints(Vec<usize>),

    /// No captures. This is the default.
    Disabled,
}

Enable time-travel by setting the capture mode and ring-buffer capacity before execution begins. TimeTravel::new is a two-argument constructor — the maximum snapshot count is supplied up front (see Configuration for the default):

from std::debug use { TimeTravel, CaptureMode }

let tt = TimeTravel::new(CaptureMode::FunctionBoundaries, 5000)
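
One way to picture how a capture mode gates snapshotting is a single predicate evaluated by the VM before each instruction. The sketch below is an assumption about the shape of that check: VmEvent and should_capture are hypothetical names, and the zero-interval guard is a choice made here rather than documented behaviour.

```rust
// Mirrors the CaptureMode enum shown above.
enum CaptureMode {
    FunctionBoundaries,
    EveryNInstructions(u64),
    Breakpoints(Vec<usize>),
    Disabled,
}

// Hypothetical description of the VM's position when the check runs.
struct VmEvent {
    ip: usize,
    instruction_count: u64,
    at_function_boundary: bool,
}

/// Decide whether the current event should produce a snapshot.
fn should_capture(mode: &CaptureMode, event: &VmEvent) -> bool {
    match mode {
        CaptureMode::FunctionBoundaries => event.at_function_boundary,
        // Guard against an interval of zero to avoid a division by zero.
        CaptureMode::EveryNInstructions(n) => *n > 0 && event.instruction_count % *n == 0,
        CaptureMode::Breakpoints(ips) => ips.contains(&event.ip),
        CaptureMode::Disabled => false,
    }
}
```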

VmSnapshot

Each snapshot captures a complete picture of the VM at one point in time. The stack and module-binding contents are stored as two parallel tracks per ADR-006 §2.7.7 — a Vec<u64> carrying the raw 8-byte slot bits paired with a Vec<NativeKind> carrying each slot’s kind. The two tracks are walked in lockstep when the snapshot is cloned or dropped so refcounted heap shares are accounted for correctly.

struct VmSnapshot {
    /// Monotonically increasing snapshot ID.
    index: u64,

    /// Instruction pointer at the time of capture.
    ip: usize,

    /// Stack pointer at the time of capture.
    sp: usize,

    /// Current call stack depth.
    call_depth: usize,

    /// Function ID of the currently executing function, if any.
    function_id: Option<u16>,

    /// Human-readable name of the currently executing function.
    function_name: Option<String>,

    /// Total number of instructions executed before this snapshot.
    instruction_count: u64,

    /// Raw 8-byte slot bits for the live stack at capture time.
    stack_data: Vec<u64>,

    /// Parallel kind track for `stack_data`.
    /// Invariant: `stack_data.len() == stack_kinds.len()`.
    stack_kinds: Vec<NativeKind>,

    /// Raw 8-byte slot bits for module bindings at capture time.
    module_bindings_data: Vec<u64>,

    /// Parallel kind track for `module_bindings_data`.
    /// Invariant: `module_bindings_data.len() == module_bindings_kinds.len()`.
    module_bindings_kinds: Vec<NativeKind>,

    /// Why this snapshot was captured.
    reason: CaptureReason,
}

Snapshots are stored in a ring buffer. When the buffer reaches max_snapshots, the oldest snapshot is evicted.
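
The eviction rule can be illustrated with a bounded queue. The sketch below is a simplified model under stated assumptions: Snapshot keeps only the ID, SnapshotRing is a hypothetical name, and a VecDeque stands in for whatever storage the runtime actually uses.

```rust
use std::collections::VecDeque;

// Trimmed-down stand-in for VmSnapshot; only the monotonic ID is kept.
#[derive(Debug, Clone, PartialEq)]
struct Snapshot {
    index: u64,
}

struct SnapshotRing {
    max_snapshots: usize,
    next_index: u64,
    snapshots: VecDeque<Snapshot>,
}

impl SnapshotRing {
    fn new(max_snapshots: usize) -> Self {
        SnapshotRing {
            max_snapshots,
            next_index: 0,
            snapshots: VecDeque::with_capacity(max_snapshots),
        }
    }

    /// Record a snapshot and return its ID, evicting the oldest entry
    /// when the buffer is already full.
    fn push(&mut self) -> u64 {
        let index = self.next_index;
        self.next_index += 1;
        if self.max_snapshots == 0 {
            return index; // a zero-capacity buffer retains nothing
        }
        if self.snapshots.len() == self.max_snapshots {
            self.snapshots.pop_front();
        }
        self.snapshots.push_back(Snapshot { index });
        index
    }

    fn oldest(&self) -> Option<&Snapshot> {
        self.snapshots.front()
    }

    fn latest(&self) -> Option<&Snapshot> {
        self.snapshots.back()
    }
}
```

Snapshot IDs keep increasing after eviction, so an ID seen earlier may no longer be present in the buffer.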

CaptureReason

Every snapshot records why it was taken:

enum CaptureReason {
    FunctionEntry(String),
    FunctionExit(String),
    InstructionInterval(u64),
    Breakpoint(usize),
    Manual,
}

Manual captures are triggered by calling tt.capture() explicitly in user code or from the debugger prompt.

Navigation

The TimeTravel struct maintains an internal cursor that points to one snapshot. Navigation methods move this cursor:

Method	Description
`step_back()`	Move the cursor to the previous snapshot.
`step_forward()`	Move the cursor to the next snapshot.
`goto(index)`	Jump the cursor to a specific snapshot by index.
`current()`	Return the snapshot at the cursor position.
`latest()`	Return the most recent snapshot (does not move cursor).
`context_window(radius)`	Return a slice of snapshots centered on the cursor, extending `radius` in each direction.
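
The cursor semantics in the table can be modelled as an index into the retained snapshots. The sketch below is illustrative only: Timeline is a hypothetical type that stores bare snapshot IDs, the bool return values are an assumption, and goto is assumed to look up a snapshot by its ID and fail if that snapshot has been evicted.

```rust
struct Timeline {
    /// Snapshot IDs, oldest first.
    snapshots: Vec<u64>,
    /// Position of the cursor within `snapshots`.
    cursor: usize,
}

impl Timeline {
    /// Move to the previous snapshot; false if already at the oldest one.
    fn step_back(&mut self) -> bool {
        if self.cursor == 0 {
            return false;
        }
        self.cursor -= 1;
        true
    }

    /// Move to the next snapshot; false if already at the newest one.
    fn step_forward(&mut self) -> bool {
        if self.cursor + 1 >= self.snapshots.len() {
            return false;
        }
        self.cursor += 1;
        true
    }

    /// Jump to the snapshot with the given ID, if it is still retained.
    fn goto(&mut self, index: u64) -> bool {
        match self.snapshots.iter().position(|&i| i == index) {
            Some(pos) => {
                self.cursor = pos;
                true
            }
            None => false,
        }
    }

    fn current(&self) -> Option<&u64> {
        self.snapshots.get(self.cursor)
    }

    /// Up to `radius` snapshots on each side of the cursor, clamped at the ends.
    fn context_window(&self, radius: usize) -> &[u64] {
        if self.snapshots.is_empty() {
            return &[];
        }
        let start = self.cursor.saturating_sub(radius);
        let end = (self.cursor + radius + 1).min(self.snapshots.len());
        &self.snapshots[start..end]
    }
}
```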

Configuration

Option	Default	Description
`max_snapshots`	10,000	Maximum number of snapshots retained in the ring buffer.
`capture_mode`	`Disabled`	Which events trigger a snapshot.

Call clear() to wipe all captured snapshots and reset the cursor.

Example: Finding a Bug with Time-Travel

Suppose a function calculate_portfolio_risk returns an unexpected value. Traditional debugging requires reproducing the issue, adding print statements, and re-running. With time-travel debugging you can inspect the execution after the fact:

from std::debug use { TimeTravel, CaptureMode }

// Enable function-boundary captures with a 5,000-snapshot ring buffer
let tt = TimeTravel::new(CaptureMode::FunctionBoundaries, 5000)

// Run the suspect code
let result = calculate_portfolio_risk(positions, pricing_data)
print("result: " + result.to_string())

// The result looks wrong. Step backward through execution:
let snap = tt.latest()
print("final function: " + snap.function_name.unwrap_or("unknown"))
print("stack at exit: " + snap.stack_data.to_string())

// Walk backward looking for where the value diverged
while tt.step_back() {
    let s = tt.current()
    if s.function_name == Some("weighted_sum") {
        print("weighted_sum snapshot at instruction " + s.instruction_count.to_string())
        print("  stack:      " + s.stack_data.to_string())
        print("  stack kinds: " + s.stack_kinds.to_string())
        print("  call depth: " + s.call_depth.to_string())
    }
}

// Found it: weighted_sum was called with stale weights.
// Jump to that snapshot for closer inspection. The value 42 stands in
// for the `index` of the snapshot identified in the loop above:
tt.goto(42)
let context = tt.context_window(3)
for snap in context {
    print(snap.index.to_string() + " | " + snap.function_name.unwrap_or("?")
          + " | ip=" + snap.ip.to_string()
          + " | stack_top=" + snap.stack_data.last().to_string())
}

This workflow lets you trace the recorded history of execution without re-running the program. The ring buffer keeps memory usage bounded, and because Disabled is the default capture mode, no snapshots are taken unless time-travel is explicitly enabled.

Speculative Prefetch

BlobPrefetcher warms caches ahead of execution to reduce cold-start latency. This matters most for distributed execution, where blobs may need to be fetched over the network.

How It Works

1. Each FunctionBlob declares its static dependencies (the hashes of functions it may call) in FunctionBlob.dependencies.
2. The prefetcher builds a call probability graph from these dependency edges, weighted by observed call frequencies at runtime.
3. On function entry, the prefetcher enqueues the top-N most likely callees for background fetch.
4. Fetched blobs are loaded into both the blob cache (for the interpreter) and the JIT code cache (for compiled hot paths).

struct PrefetchConfig {
    /// Maximum transitive callee depth walked from each function entry.
    max_prefetch_depth: usize,
    /// Top-N most likely callees considered at each level.
    top_n_callees: usize,
    /// Probability floor for an edge to be enqueued.
    min_probability: f32,
    /// Whether prefetching is enabled at all.
    enabled: bool,
}

struct Prefetcher {
    /// Call probability graph: caller hash → vec of (callee hash, probability).
    call_graph: CallGraph,
    config: PrefetchConfig,
    /// Background fetch queue.
    prefetch_queue: Arc<Mutex<Vec<FunctionHash>>>,
    stats: PrefetchStats,
}

The prefetcher runs on a background thread and never blocks the main execution loop. If a blob is already cached, the prefetch is a no-op, so leaving speculative prefetch enabled at all times adds little overhead.
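
The selection step on function entry can be sketched as a bounded breadth-first walk over the call graph. This is a hedged illustration, not the prefetcher's code: plan_prefetch and the cached set are hypothetical, FunctionHash is reduced to a u64, CallGraph is assumed to be the map described in the struct comment, and traversing through already-cached callees is a choice made for this sketch.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

type FunctionHash = u64; // hypothetical stand-in for the real hash type
type CallGraph = HashMap<FunctionHash, Vec<(FunctionHash, f32)>>;

struct PrefetchConfig {
    max_prefetch_depth: usize,
    top_n_callees: usize,
    min_probability: f32,
    enabled: bool,
}

/// Return the blobs to fetch when `entry` starts executing, in fetch order.
fn plan_prefetch(
    graph: &CallGraph,
    entry: FunctionHash,
    config: &PrefetchConfig,
    cached: &HashSet<FunctionHash>,
) -> Vec<FunctionHash> {
    let mut queue = Vec::new();
    if !config.enabled {
        return queue;
    }
    let mut seen: HashSet<FunctionHash> = HashSet::from([entry]);
    let mut frontier = VecDeque::from([(entry, 0usize)]);
    while let Some((caller, depth)) = frontier.pop_front() {
        if depth >= config.max_prefetch_depth {
            continue;
        }
        // Keep edges above the probability floor, most likely first.
        let mut edges = graph.get(&caller).cloned().unwrap_or_default();
        edges.retain(|&(_, p)| p >= config.min_probability);
        edges.sort_by(|a, b| b.1.total_cmp(&a.1));
        for (callee, _) in edges.into_iter().take(config.top_n_callees) {
            if seen.insert(callee) {
                // Already-cached blobs are a no-op but are still traversed.
                if !cached.contains(&callee) {
                    queue.push(callee);
                }
                frontier.push_back((callee, depth + 1));
            }
        }
    }
    queue
}
```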

Shape Usage

Configure speculative prefetch from Shape code:

from std::debug use { BlobPrefetcher, PrefetchConfig }

// Create a prefetcher that walks up to 2 transitive callee levels and
// considers the top 5 likely callees at each level.
let config = PrefetchConfig {
    max_prefetch_depth: 2,
    top_n_callees: 5,
    min_probability: 0.1,
    enabled: true,
}
let prefetcher = BlobPrefetcher::new(config)

// Register it with the current VM (applies globally for this execution)
prefetcher.attach()

// The prefetcher works silently in the background from this point on.
// To inspect its call graph after a run:
let graph = prefetcher.call_graph()
for (caller_hash, callees) in graph {
    for (callee_hash, probability) in callees {
        print(f"  {caller_hash[0..8]}... → {callee_hash[0..8]}... p={probability:.2}")
    }
}

Configuration

Option	Default	Description
`max_prefetch_depth`	2	Maximum transitive callee depth walked from each function entry.
`top_n_callees`	4	Number of top-probability callees considered at each level.
`min_probability`	0.1	Probability floor for an edge to be enqueued.
`enabled`	`true`	Master switch for the prefetcher.

Structural Code Search

Content-addressed blobs carry rich metadata that enables searching the codebase by structure rather than text.

Query Types

Search by signature — find all functions matching a type signature:

shape search --signature "(int, int) -> int"

This matches any function that takes two int arguments and returns an int, regardless of name.

Search by dependency — find all functions that call a specific blob:

shape search --calls-hash abc123def456...

Returns every function whose dependencies list includes the given hash.

Search by instruction pattern — find functions containing a specific opcode sequence:

shape search --opcodes "LoadConst,Mul,StoreLocal"

Matches functions whose bytecode contains the given opcode subsequence.

Signature hash matching — each function has a signature_hash derived from its parameter types and return type. Two functions with the same signature hash are structurally interchangeable (same calling convention), which is useful for refactoring and discovering alternative implementations.

Index

Structural search relies on an index built from the blob store:

struct BlobIndex {
    /// Signature hash -> list of function hashes with that signature.
    by_signature: HashMap<u64, Vec<FunctionHash>>,

    /// Callee hash -> list of caller hashes.
    callers_of: HashMap<FunctionHash, Vec<FunctionHash>>,

    /// Opcode trigrams for subsequence search.
    opcode_index: HashMap<[u8; 3], Vec<FunctionHash>>,
}

The index is updated incrementally as new blobs are added to the store.
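
A trigram index of this kind typically narrows the search to candidate blobs and then confirms each candidate against its bytecode. The sketch below assumes one byte per opcode, treats the query as a contiguous opcode run, and keeps the opcode streams in a side map; OpcodeIndex, add_blob, and search are hypothetical names, and FunctionHash is reduced to a u64.

```rust
use std::collections::{HashMap, HashSet};

type FunctionHash = u64; // hypothetical stand-in for the real hash type

#[derive(Default)]
struct OpcodeIndex {
    /// Opcode trigram -> blobs containing that trigram.
    opcode_index: HashMap<[u8; 3], Vec<FunctionHash>>,
    /// Opcode stream per blob, used to confirm candidates.
    bytecode: HashMap<FunctionHash, Vec<u8>>,
}

impl OpcodeIndex {
    /// Incrementally index one blob's opcode stream.
    fn add_blob(&mut self, hash: FunctionHash, opcodes: Vec<u8>) {
        for window in opcodes.windows(3) {
            let key = [window[0], window[1], window[2]];
            let bucket = self.opcode_index.entry(key).or_default();
            if !bucket.contains(&hash) {
                bucket.push(hash);
            }
        }
        self.bytecode.insert(hash, opcodes);
    }

    /// Blobs whose opcode stream contains `pattern` as a contiguous run.
    fn search(&self, pattern: &[u8]) -> Vec<FunctionHash> {
        if pattern.is_empty() {
            return Vec::new();
        }
        // Start from every blob, then intersect with the bucket of each
        // trigram in the pattern (patterns shorter than 3 have none).
        let mut candidates: HashSet<FunctionHash> = self.bytecode.keys().copied().collect();
        for window in pattern.windows(3) {
            let key = [window[0], window[1], window[2]];
            let bucket: HashSet<FunctionHash> = self
                .opcode_index
                .get(&key)
                .map(|v| v.iter().copied().collect())
                .unwrap_or_default();
            candidates.retain(|h| bucket.contains(h));
        }
        // Trigram hits are necessary but not sufficient, so verify each one.
        let mut hits: Vec<FunctionHash> = candidates
            .into_iter()
            .filter(|h| self.bytecode[h].windows(pattern.len()).any(|w| w == pattern))
            .collect();
        hits.sort_unstable();
        hits
    }
}
```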

Proof of Execution

ExecutionProof provides verifiable execution receipts. When proof mode is enabled, the VM produces a cryptographic receipt for each function execution that a third party can verify without re-executing the code.

Structure

struct ExecutionProof {
    /// Content hash of the function that was executed.
    function_hash: [u8; 32],

    /// Hash of the arguments passed to the function.
    args_hash: [u8; 32],

    /// Hash of the return value.
    result_hash: [u8; 32],

    /// Unix timestamp (seconds since epoch) when execution completed.
    timestamp: u64,

    /// Optional: hashes of intermediate states for full trace verification.
    trace: Option<Vec<[u8; 32]>>,

    /// SHA-256 over the canonical encoding of every other field, so that a
    /// proof can be integrity-checked without recomputing per-component hashes.
    proof_hash: [u8; 32],
}

The trace field, when present, contains a hash of the VM state at each instruction boundary. This forms a hash chain: each entry commits to the previous state plus the instruction that was executed.

proof_hash is computed by ExecutionProof::compute_proof_hash over the other five fields in canonical order and is checked by ExecutionProof::verify_integrity() — verification fails fast if any other field has been tampered with after the proof was produced.
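
The integrity check can be sketched as "re-encode the other fields, re-hash, compare". The example below is deliberately simplified: to stay dependency-free it swaps SHA-256 for 64-bit FNV-1a, which is not cryptographic, so proof_hash is a u64 here rather than [u8; 32]. The byte layout used for the canonical encoding and the seal helper are assumptions made for this sketch.

```rust
/// 64-bit FNV-1a. A non-cryptographic stand-in for SHA-256 in this sketch.
fn fnv1a(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325;
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3);
    }
    hash
}

struct ExecutionProof {
    function_hash: [u8; 32],
    args_hash: [u8; 32],
    result_hash: [u8; 32],
    timestamp: u64,
    trace: Option<Vec<[u8; 32]>>,
    proof_hash: u64, // [u8; 32] in the real struct
}

impl ExecutionProof {
    /// Hash the other five fields in declaration order. The encoding
    /// (little-endian integers, presence byte and length prefix for the
    /// trace) is a hypothetical choice for this sketch.
    fn compute_proof_hash(&self) -> u64 {
        let mut buf = Vec::new();
        buf.extend_from_slice(&self.function_hash);
        buf.extend_from_slice(&self.args_hash);
        buf.extend_from_slice(&self.result_hash);
        buf.extend_from_slice(&self.timestamp.to_le_bytes());
        match &self.trace {
            None => buf.push(0),
            Some(steps) => {
                buf.push(1);
                buf.extend_from_slice(&(steps.len() as u64).to_le_bytes());
                for step in steps {
                    buf.extend_from_slice(step);
                }
            }
        }
        fnv1a(&buf)
    }

    /// Hypothetical helper: fill in proof_hash once all other fields are set.
    fn seal(mut self) -> Self {
        self.proof_hash = self.compute_proof_hash();
        self
    }

    fn verify_integrity(&self) -> bool {
        self.proof_hash == self.compute_proof_hash()
    }
}
```

The length prefix on the trace keeps the encoding unambiguous, so two different proofs cannot serialize to the same byte string.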

Verification

There are two verification strategies:

Re-execution: given the function_hash and the original arguments (which must hash to args_hash), fetch the blob, re-execute it with those arguments, and compare the hash of the result against result_hash. This is the strongest guarantee but requires full re-execution.
Trace chain verification: if trace is present, verify that each hash in the chain is consistent with the previous entry. This does not require re-execution but depends on the prover having recorded the trace honestly. It is useful for auditing execution on remote or distributed nodes.

Generating Proofs

from std::debug use { ExecutionProofBuilder }

// Proofs are built incrementally: capture the function hash and (optionally)
// enable trace mode up front, record the args hash and any intermediate
// state hashes during execution, then finalize with the result hash.
let mut builder = ExecutionProofBuilder::new(function_hash).with_trace()
builder.set_args_hash(args_hash)
// ... runtime calls `builder.record_trace_step(state_hash)` at each
// instruction boundary while the function executes ...
let proof = builder.finalize(result_hash)

print("function: " + proof.function_hash.to_hex())
print("result:   " + proof.result_hash.to_hex())
print("trace entries: " + proof.trace.map(|t| t.len()).unwrap_or(0).to_string())

Distributed Garbage Collection

In a distributed deployment, blobs may be cached across multiple VM instances. Garbage collection must coordinate across all nodes to avoid collecting a blob that some remote VM still needs.

Protocol

1. Each VM periodically reports its active blob set: the set of content hashes present in its call frames and function table.
2. A global coordinator collects these reports and computes the union of all active sets.
3. Any blob in the distributed store that is not in the union is eligible for collection.
4. Blobs that are explicitly pinned (e.g., marked as entrypoints or cached for fast startup) maintain a reference count and are exempt from collection until unpinned.

Reference Counting for Pinned Blobs

struct BlobRefCount {
    /// The blob hash.
    hash: FunctionHash,

    /// Number of active pins (caches, entrypoint registrations, etc.).
    pin_count: u32,

    /// Number of VMs currently referencing this blob in a call frame.
    active_frame_count: u32,
}

A blob is eligible for collection only when both pin_count and active_frame_count are zero and no VM has reported it in its active set.

Safety

The coordinator uses a two-phase protocol:

Mark phase: collect active sets from all VMs. Any VM that fails to report within the timeout is assumed to reference all blobs (conservative).
Sweep phase: blobs not in the union and with zero reference counts are deleted from the distributed store.

This ensures that a temporarily unreachable VM never loses blobs it depends on.
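
The two phases can be sketched as a single coordinator function. This is an illustration under stated assumptions: sweep_candidates is a hypothetical name, FunctionHash is reduced to a u64, a missed timeout is modelled as a None report, and the function only returns the blobs to delete rather than deleting them.

```rust
use std::collections::{HashMap, HashSet};

type FunctionHash = u64; // hypothetical stand-in for the real hash type

struct BlobRefCount {
    pin_count: u32,
    active_frame_count: u32,
}

/// `reports` holds one entry per VM; `None` means the VM missed the timeout.
/// Returns the hashes that are safe to delete, in ascending order.
fn sweep_candidates(
    store: &HashMap<FunctionHash, BlobRefCount>,
    reports: &[Option<HashSet<FunctionHash>>],
) -> Vec<FunctionHash> {
    // Mark phase: a silent VM is assumed to reference every blob, so
    // nothing can be collected in this round.
    if reports.iter().any(|r| r.is_none()) {
        return Vec::new();
    }
    let mut live: HashSet<FunctionHash> = HashSet::new();
    for report in reports.iter().flatten() {
        live.extend(report.iter().copied());
    }
    // Sweep phase: outside the union, unpinned, and not in any call frame.
    let mut dead: Vec<FunctionHash> = store
        .iter()
        .filter(|(hash, rc)| {
            !live.contains(*hash) && rc.pin_count == 0 && rc.active_frame_count == 0
        })
        .map(|(hash, _)| *hash)
        .collect();
    dead.sort_unstable();
    dead
}
```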
