Developer Tools
Shape’s content-addressed bytecode architecture enables a suite of advanced developer tools that would be difficult or impossible with traditional mutable-offset compilation. This chapter covers hot-reload, time-travel debugging, speculative prefetch, structural code search, execution proofs, and distributed garbage collection.
Hot-Reload
HotReloader tracks which blobs are active and manages live code updates without restarting the VM.
struct HotReloader {
    /// Maps function name to its current content hash.
    current_mappings: HashMap<String, FunctionHash>,

    /// All known blobs, keyed by content hash.
    /// Old versions are kept until GC determines they are unreferenced.
    blob_store: HashMap<FunctionHash, FunctionBlob>,

    /// Blobs currently referenced by live call frames.
    active_references: HashSet<FunctionHash>,

    /// Timestamped history of every reload event.
    update_history: Vec<ReloadEvent>,

    /// Source paths registered for change-detection by an external watcher.
    watch_paths: Vec<PathBuf>,
}

- current_mappings is the canonical name-to-hash lookup. When a function is called, the VM resolves the name through this table to find the current blob.
- blob_store retains every blob the runtime has seen. Old versions stay alive as long as at least one call frame references them.
- active_references is maintained by the VM as frames are pushed and popped.
- update_history records every patch as a ReloadEvent so the surrounding tooling can display reload history. The reloader does not perform automatic rollback; reverting to a previous version is a matter of re-applying patches that point each name back at the old hash.
- watch_paths is appended to by watch_path(path) and consumed by the surrounding file-watcher driver; the reloader itself only stores the paths.
Update Flow
- A file watcher detects source changes on disk.
- The compiler performs an incremental recompile. Only affected functions produce new blobs (with new content hashes). Unchanged functions keep their existing hashes.
- apply_patches(patches) updates the current_mappings table so each patched function name now points to its new hash. The caller passes the recompiled function patches in as a Vec<FunctionPatch>.
- Old blobs remain valid for in-flight call frames. Any function that was already on the call stack continues executing its original blob.
- New calls resolve to updated blobs through the refreshed mappings.
- Garbage collection: old blobs are removed when no active frames reference them.
ReloadResult
After a patch is applied, apply_patches returns a ReloadResult:
struct ReloadResult {
    /// Names of functions that received new blobs.
    functions_updated: Vec<String>,

    /// Names of functions whose source did not change.
    functions_unchanged: Vec<String>,

    /// Number of old blobs retained because live frames still reference them.
    old_blobs_retained: usize,
}
The key insight is that both old and new versions of a function coexist naturally. Content-addressing means the old blob has a different hash from the new one, so there is no risk of instruction-pointer corruption and no downtime during the update. A running function completes against its original bytecode while all new calls use the updated version.
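The update flow and the ReloadResult counters can be sketched in a few lines of Rust. This is a minimal illustration under stated assumptions, not the runtime's implementation: FunctionHash is taken to be a 32-byte array, FunctionBlob and FunctionPatch are reduced to hypothetical two-field stand-ins, and an old blob is dropped immediately when no frame and no name refers to it (the real reloader may defer that to a separate GC pass).

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical stand-ins: the real FunctionHash and FunctionBlob are richer.
type FunctionHash = [u8; 32];

struct FunctionBlob {
    hash: FunctionHash,
    code: Vec<u8>,
}

// Assumed patch shape: a function name plus its freshly compiled blob.
struct FunctionPatch {
    name: String,
    new_blob: FunctionBlob,
}

#[derive(Debug, Default)]
struct ReloadResult {
    functions_updated: Vec<String>,
    functions_unchanged: Vec<String>,
    old_blobs_retained: usize,
}

#[derive(Default)]
struct HotReloader {
    current_mappings: HashMap<String, FunctionHash>,
    blob_store: HashMap<FunctionHash, FunctionBlob>,
    active_references: HashSet<FunctionHash>,
}

impl HotReloader {
    fn apply_patches(&mut self, patches: Vec<FunctionPatch>) -> ReloadResult {
        let mut result = ReloadResult::default();
        for patch in patches {
            let new_hash = patch.new_blob.hash;
            let old_hash = self.current_mappings.get(&patch.name).copied();
            if old_hash == Some(new_hash) {
                // Same content hash: the function body did not change.
                result.functions_unchanged.push(patch.name);
                continue;
            }
            // Store the new blob and repoint the name; the old blob is untouched.
            self.blob_store.insert(new_hash, patch.new_blob);
            self.current_mappings.insert(patch.name.clone(), new_hash);
            result.functions_updated.push(patch.name);

            if let Some(old) = old_hash {
                // Another name may still map to the same content hash.
                let still_mapped = self.current_mappings.values().any(|h| *h == old);
                if self.active_references.contains(&old) {
                    // A live frame is still running the old version: keep it.
                    result.old_blobs_retained += 1;
                } else if !still_mapped {
                    self.blob_store.remove(&old);
                }
            }
        }
        result
    }
}
```

Because names are only ever repointed and blobs are never edited in place, a frame holding the old hash keeps resolving to the bytes it started with.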
Shape Usage
Enable hot-reload from Shape code via std::debug:
from std::debug use { HotReloader }

// Create a reloader and register one or more paths to watch.
let reloader = HotReloader::new()
reloader.watch_path(".")

// Apply patches produced by an incremental recompile. The caller is
// responsible for compiling source changes into a `Vec<FunctionPatch>`
// (typically driven by a file watcher).
let result = reloader.apply_patches(patches)
print(f"Updated: {result.functions_updated.len()}, unchanged: {result.functions_unchanged.len()}")
print(f"Old blobs retained for in-flight calls: {result.old_blobs_retained}")

// Reverting to an earlier version is done by re-applying patches whose
// `new_blob` carries the previously-active content hash. The reload
// history (`reloader.history()`) records every applied patch so tooling
// can construct the revert patch set.
Time-Travel Debugging
TimeTravel captures VM snapshots at configurable intervals during execution, allowing developers to step backward and forward through program state.
Capture Modes
enum CaptureMode {
    /// Capture a snapshot at every function entry and exit.
    FunctionBoundaries,

    /// Capture a snapshot every N instructions.
    EveryNInstructions(u64),

    /// Capture at specific instruction pointers.
    Breakpoints(Vec<usize>),

    /// No captures. This is the default.
    Disabled,
}
Enable time-travel by setting the capture mode and ring-buffer capacity before execution begins. TimeTravel::new is a two-argument constructor: the maximum snapshot count is supplied up front (see Configuration for the default):
from std::debug use { TimeTravel, CaptureMode }

let tt = TimeTravel::new(CaptureMode::FunctionBoundaries, 5000)
VmSnapshot
Each snapshot captures a complete picture of the VM at one point in time. The stack and module-binding contents are stored as two parallel tracks per ADR-006 §2.7.7: a Vec<u64> carrying the raw 8-byte slot bits, paired with a Vec<NativeKind> carrying each slot’s kind. The two tracks are walked in lockstep when the snapshot is cloned or dropped so refcounted heap shares are accounted for correctly.
struct VmSnapshot {
    /// Monotonically increasing snapshot ID.
    index: u64,

    /// Instruction pointer at the time of capture.
    ip: usize,

    /// Stack pointer at the time of capture.
    sp: usize,

    /// Current call stack depth.
    call_depth: usize,

    /// Function ID of the currently executing function, if any.
    function_id: Option<u16>,

    /// Human-readable name of the currently executing function.
    function_name: Option<String>,

    /// Total number of instructions executed before this snapshot.
    instruction_count: u64,

    /// Raw 8-byte slot bits for the live stack at capture time.
    stack_data: Vec<u64>,

    /// Parallel kind track for `stack_data`.
    /// Invariant: `stack_data.len() == stack_kinds.len()`.
    stack_kinds: Vec<NativeKind>,

    /// Raw 8-byte slot bits for module bindings at capture time.
    module_bindings_data: Vec<u64>,

    /// Parallel kind track for `module_bindings_data`.
    /// Invariant: `module_bindings_data.len() == module_bindings_kinds.len()`.
    module_bindings_kinds: Vec<NativeKind>,

    /// Why this snapshot was captured.
    reason: CaptureReason,
}
Snapshots are stored in a ring buffer. When the buffer reaches max_snapshots, the oldest snapshot is evicted.
CaptureReason
Every snapshot records why it was taken:
enum CaptureReason {
    FunctionEntry(String),
    FunctionExit(String),
    InstructionInterval(u64),
    Breakpoint(usize),
    Manual,
}
Manual captures are triggered by calling tt.capture() explicitly in user code or from the debugger prompt.
Navigation
The TimeTravel struct maintains an internal cursor that points to one snapshot. Navigation methods move this cursor:
| Method | Description |
|---|---|
| step_back() | Move the cursor to the previous snapshot. |
| step_forward() | Move the cursor to the next snapshot. |
| goto(index) | Jump the cursor to a specific snapshot by index. |
| current() | Return the snapshot at the cursor position. |
| latest() | Return the most recent snapshot (does not move cursor). |
| context_window(radius) | Return a slice of snapshots centered on the cursor, extending radius in each direction. |
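The ring buffer and cursor behind these methods can be modeled with a VecDeque. The sketch below is illustrative only: History and the two-field Snapshot are hypothetical stand-ins for TimeTravel and VmSnapshot, and it assumes that each new capture moves the cursor to the newest snapshot and that goto fails once a snapshot has been evicted, neither of which the text above specifies.

```rust
use std::collections::VecDeque;

// Reduced stand-in for VmSnapshot: only the fields this sketch needs.
#[derive(Debug, Clone, PartialEq)]
struct Snapshot {
    index: u64,
    ip: usize,
}

struct History {
    snapshots: VecDeque<Snapshot>,
    max_snapshots: usize,
    cursor: usize,   // position inside `snapshots`, not a snapshot ID
    next_index: u64, // monotonically increasing snapshot ID
}

impl History {
    fn new(max_snapshots: usize) -> Self {
        History { snapshots: VecDeque::new(), max_snapshots, cursor: 0, next_index: 0 }
    }

    fn capture(&mut self, ip: usize) {
        if self.max_snapshots == 0 {
            return;
        }
        if self.snapshots.len() == self.max_snapshots {
            self.snapshots.pop_front(); // evict the oldest snapshot
        }
        self.snapshots.push_back(Snapshot { index: self.next_index, ip });
        self.next_index += 1;
        // Assumption: a new capture moves the cursor to the newest snapshot.
        self.cursor = self.snapshots.len() - 1;
    }

    fn step_back(&mut self) -> bool {
        if self.cursor == 0 {
            return false;
        }
        self.cursor -= 1;
        true
    }

    fn step_forward(&mut self) -> bool {
        if self.cursor + 1 >= self.snapshots.len() {
            return false;
        }
        self.cursor += 1;
        true
    }

    // Jump by snapshot ID; fails if that snapshot was evicted or never existed.
    fn goto(&mut self, index: u64) -> bool {
        match self.snapshots.iter().position(|s| s.index == index) {
            Some(pos) => {
                self.cursor = pos;
                true
            }
            None => false,
        }
    }

    fn current(&self) -> Option<&Snapshot> {
        self.snapshots.get(self.cursor)
    }

    // Up to `radius` snapshots on each side of the cursor, clamped to the buffer.
    fn context_window(&self, radius: usize) -> Vec<&Snapshot> {
        let start = self.cursor.saturating_sub(radius);
        let end = (self.cursor + radius + 1).min(self.snapshots.len());
        self.snapshots.range(start..end).collect()
    }
}
```

Keeping the cursor as a buffer position rather than a snapshot ID makes stepping a simple increment or decrement, at the cost of a linear scan in goto.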
Configuration
| Option | Default | Description |
|---|---|---|
| max_snapshots | 10,000 | Maximum number of snapshots retained in the ring buffer. |
| capture_mode | Disabled | Which events trigger a snapshot. |
Call clear() to wipe all captured snapshots and reset the cursor.
Example: Finding a Bug with Time-Travel
Suppose a function calculate_portfolio_risk returns an unexpected value. Traditional debugging requires reproducing the issue, adding print statements, and re-running. With time-travel debugging you can inspect the execution after the fact:
from std::debug use { TimeTravel, CaptureMode }

// Enable function-boundary captures with a 5,000-snapshot ring buffer
let tt = TimeTravel::new(CaptureMode::FunctionBoundaries, 5000)

// Run the suspect code
let result = calculate_portfolio_risk(positions, pricing_data)
print("result: " + result.to_string())

// The result looks wrong. Step backward through execution:
let snap = tt.latest()
print("final function: " + snap.function_name.unwrap_or("unknown"))
print("stack at exit: " + snap.stack_data.to_string())

// Walk backward looking for where the value diverged
while tt.step_back() {
    let s = tt.current()
    if s.function_name == Some("weighted_sum") {
        print("weighted_sum snapshot at instruction " + s.instruction_count.to_string())
        print("  stack: " + s.stack_data.to_string())
        print("  stack kinds: " + s.stack_kinds.to_string())
        print("  call depth: " + s.call_depth.to_string())
    }
}

// Found it: weighted_sum was called with stale weights.
// Jump to that snapshot for closer inspection:
tt.goto(42)
let context = tt.context_window(3)
for snap in context {
    print(snap.index.to_string() + " | " + snap.function_name.unwrap_or("?") + " | ip=" + snap.ip.to_string() + " | stack_top=" + snap.stack_data.last().to_string())
}
This workflow lets you trace the captured history of execution without re-running the program, as far back as the ring buffer reaches. The ring buffer keeps memory usage bounded, and in Disabled mode no snapshots are captured, so production runs pay no capture overhead.
Speculative Prefetch
BlobPrefetcher warms caches ahead of execution to reduce cold-start latency, particularly important for distributed execution where blobs may need to be fetched over the network.
How It Works
- Each FunctionBlob declares its static dependencies (the hashes of functions it may call) in FunctionBlob.dependencies.
- The prefetcher builds a call probability graph from these dependency edges, weighted by observed call frequencies at runtime.
- On function entry, the prefetcher enqueues the top-N most likely callees for background fetch.
- Fetched blobs are loaded into both the blob cache (for the interpreter) and the JIT code cache (for compiled hot paths).
struct PrefetchConfig {
    /// Maximum transitive callee depth walked from each function entry.
    max_prefetch_depth: usize,
    /// Top-N most likely callees considered at each level.
    top_n_callees: usize,
    /// Probability floor for an edge to be enqueued.
    min_probability: f32,
    /// Whether prefetching is enabled at all.
    enabled: bool,
}

struct Prefetcher {
    /// Call probability graph: caller hash → vec of (callee hash, probability).
    call_graph: CallGraph,
    config: PrefetchConfig,
    /// Background fetch queue.
    prefetch_queue: Arc<Mutex<Vec<FunctionHash>>>,
    stats: PrefetchStats,
}
The prefetcher runs on a background thread and never blocks the main execution loop. If a blob is already cached, the prefetch is a no-op. This means speculative prefetch is safe to leave enabled at all times with negligible overhead.
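How the four PrefetchConfig fields interact when choosing what to enqueue can be shown with a small planning function. This is a sketch under assumptions rather than the prefetcher's actual algorithm: plan_prefetch and CallCounts are hypothetical names, hashes are shortened to u64, and edge probabilities are taken to be each callee's share of the caller's observed call count.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

type FunctionHash = u64; // shortened stand-in for a content hash

struct PrefetchConfig {
    max_prefetch_depth: usize,
    top_n_callees: usize,
    min_probability: f32,
    enabled: bool,
}

// caller -> (callee, observed call count); probabilities are derived from counts.
type CallCounts = HashMap<FunctionHash, Vec<(FunctionHash, u32)>>;

// Returns the callees to enqueue, in breadth-first order from `entry`.
fn plan_prefetch(entry: FunctionHash, counts: &CallCounts, cfg: &PrefetchConfig) -> Vec<FunctionHash> {
    let mut plan = Vec::new();
    if !cfg.enabled {
        return plan;
    }
    let mut seen = HashSet::from([entry]);
    let mut frontier = VecDeque::from([(entry, 0usize)]);

    while let Some((caller, depth)) = frontier.pop_front() {
        if depth >= cfg.max_prefetch_depth {
            continue; // do not look past the configured depth
        }
        let Some(edges) = counts.get(&caller) else { continue };
        let total: u32 = edges.iter().map(|(_, n)| *n).sum();
        if total == 0 {
            continue;
        }
        // Convert counts to probabilities and drop edges below the floor.
        let mut ranked: Vec<(FunctionHash, f32)> = edges
            .iter()
            .map(|(callee, n)| (*callee, *n as f32 / total as f32))
            .filter(|(_, p)| *p >= cfg.min_probability)
            .collect();
        // Most likely callee first; the sort is stable, so ties keep input order.
        ranked.sort_by(|a, b| b.1.total_cmp(&a.1));

        for (callee, _) in ranked.into_iter().take(cfg.top_n_callees) {
            if seen.insert(callee) {
                plan.push(callee);
                frontier.push_back((callee, depth + 1));
            }
        }
    }
    plan
}
```

The seen set keeps recursive or mutually recursive functions from being enqueued twice, which also guarantees the walk terminates on cyclic call graphs.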
Shape Usage
Configure speculative prefetch from Shape code:
from std::debug use { BlobPrefetcher, PrefetchConfig }

// Create a prefetcher that walks up to 2 transitive callee levels and
// considers the top 5 likely callees at each level.
let config = PrefetchConfig {
    max_prefetch_depth: 2,
    top_n_callees: 5,
    min_probability: 0.1,
    enabled: true,
}
let prefetcher = BlobPrefetcher::new(config)

// Register it with the current VM (applies globally for this execution)
prefetcher.attach()

// The prefetcher works silently in the background from this point on.
// To inspect its call graph after a run:
let graph = prefetcher.call_graph()
for (caller_hash, callees) in graph {
    for (callee_hash, probability) in callees {
        print(f"  {caller_hash[0..8]}... → {callee_hash[0..8]}... p={probability:.2}")
    }
}
Configuration
| Option | Default | Description |
|---|---|---|
| max_prefetch_depth | 2 | Maximum transitive callee depth walked from each function entry. |
| top_n_callees | 4 | Number of top-probability callees considered at each level. |
| min_probability | 0.1 | Probability floor for an edge to be enqueued. |
| enabled | true | Master switch for the prefetcher. |
Structural Code Search
Content-addressed blobs carry rich metadata that enables searching the codebase by structure rather than text.
Query Types
Search by signature: find all functions matching a type signature.
shape search --signature "(int, int) -> int"
This matches any function that takes two int arguments and returns an int, regardless of name.
Search by dependency: find all functions that call a specific blob.
shape search --calls-hash abc123def456...
Returns every function whose dependencies list includes the given hash.
Search by instruction pattern: find functions containing a specific opcode sequence.
shape search --opcodes "LoadConst,Mul,StoreLocal"
Matches functions whose bytecode contains the given opcode subsequence.
Signature hash matching: each function has a signature_hash derived from its parameter types and return type. Two functions with the same signature hash are structurally interchangeable (same calling convention), which is useful for refactoring and discovering alternative implementations.
Structural search relies on an index built from the blob store:
struct BlobIndex {
    /// Signature hash -> list of function hashes with that signature.
    by_signature: HashMap<u64, Vec<FunctionHash>>,

    /// Callee hash -> list of caller hashes.
    callers_of: HashMap<FunctionHash, Vec<FunctionHash>>,

    /// Opcode trigrams for subsequence search.
    opcode_index: HashMap<[u8; 3], Vec<FunctionHash>>,
}
The index is updated incrementally as new blobs are added to the store.
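A trigram index of this shape narrows an opcode search to a short candidate list, which is then confirmed against the actual bytecode. The sketch below is one plausible way to use opcode_index, not the documented implementation: OpcodeIndex, add_blob, and search are hypothetical names, opcodes are assumed to be single bytes, the pattern is assumed to match contiguously, and a copy of each blob's opcodes is kept for the confirmation step.

```rust
use std::collections::{HashMap, HashSet};

type FunctionHash = u64; // shortened stand-in for a content hash

#[derive(Default)]
struct OpcodeIndex {
    // Trigram -> functions whose bytecode contains that trigram.
    opcode_index: HashMap<[u8; 3], Vec<FunctionHash>>,
    // Assumption: opcodes are kept so candidates can be confirmed.
    bytecode: HashMap<FunctionHash, Vec<u8>>,
}

impl OpcodeIndex {
    // Incremental update: index one new blob without touching existing entries.
    fn add_blob(&mut self, hash: FunctionHash, opcodes: Vec<u8>) {
        if self.bytecode.contains_key(&hash) {
            return; // content-addressed: same hash means same bytes
        }
        // Deduplicate so each function appears once per trigram.
        let trigrams: HashSet<[u8; 3]> =
            opcodes.windows(3).map(|w| [w[0], w[1], w[2]]).collect();
        for t in trigrams {
            self.opcode_index.entry(t).or_default().push(hash);
        }
        self.bytecode.insert(hash, opcodes);
    }

    // Functions whose bytecode contains `pattern` as a contiguous run.
    fn search(&self, pattern: &[u8]) -> Vec<FunctionHash> {
        if pattern.is_empty() {
            return Vec::new();
        }
        let contains = |h: &FunctionHash| {
            self.bytecode[h].windows(pattern.len()).any(|w| w == pattern)
        };
        let mut hits: Vec<FunctionHash> = if pattern.len() < 3 {
            // Too short for a trigram lookup: fall back to a full scan.
            self.bytecode.keys().copied().filter(|h| contains(h)).collect()
        } else {
            // The first trigram yields candidates; confirm each one.
            let key = [pattern[0], pattern[1], pattern[2]];
            self.opcode_index
                .get(&key)
                .into_iter()
                .flatten()
                .copied()
                .filter(|h| contains(h))
                .collect()
        };
        hits.sort_unstable();
        hits
    }
}
```

The confirmation pass matters for patterns longer than three opcodes, because sharing the first trigram does not imply the whole pattern is present.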
Proof of Execution
ExecutionProof provides verifiable execution receipts. When proof mode is enabled, the VM produces a cryptographic receipt for each function execution that a third party can verify without re-executing the code.
Structure
struct ExecutionProof {
    /// Content hash of the function that was executed.
    function_hash: [u8; 32],

    /// Hash of the arguments passed to the function.
    args_hash: [u8; 32],

    /// Hash of the return value.
    result_hash: [u8; 32],

    /// Unix timestamp (seconds since epoch) when execution completed.
    timestamp: u64,

    /// Optional: hashes of intermediate states for full trace verification.
    trace: Option<Vec<[u8; 32]>>,

    /// SHA-256 over the canonical encoding of every other field, so that a
    /// proof can be integrity-checked without recomputing per-component hashes.
    proof_hash: [u8; 32],
}
The trace field, when present, contains a hash of the VM state at each instruction boundary. This forms a hash chain: each entry commits to the previous state plus the instruction that was executed.
proof_hash is computed by ExecutionProof::compute_proof_hash over the other five fields in canonical order and is checked by ExecutionProof::verify_integrity(). Verification fails fast if any other field has been tampered with after the proof was produced.
Verification
There are two verification strategies:
- Re-execution: given the function_hash and the original arguments (checked against args_hash), fetch the blob, re-execute with the same arguments, and compare the hash of the result against result_hash. This is the strongest guarantee but requires full re-execution.
- Trace chain verification: if trace is present, verify that each hash in the chain is consistent with the previous entry. This does not require re-execution but depends on the prover having recorded the trace honestly. It is useful for auditing execution on remote or distributed nodes.
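Trace chain verification amounts to recomputing each link from its predecessor and comparing it with the recorded entry. The sketch below illustrates only that chaining logic and makes several assumptions: link, build_trace, and verify_trace are hypothetical helpers, the verifier is assumed to know the executed instruction sequence, and the digest is a 64-bit std hash used as a stand-in so the example needs no external crate. It is not cryptographic; a real proof would use SHA-256 and 32-byte entries as in ExecutionProof.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Stand-in digest: NOT cryptographic. A real proof would use SHA-256 here.
type Digest = u64;

// One chain link: commits to the previous digest plus the executed instruction.
fn link(prev: Digest, instruction: u8) -> Digest {
    let mut h = DefaultHasher::new();
    prev.hash(&mut h);
    instruction.hash(&mut h);
    h.finish()
}

// Prover side: record one digest per instruction boundary.
fn build_trace(initial: Digest, instructions: &[u8]) -> Vec<Digest> {
    let mut trace = Vec::with_capacity(instructions.len());
    let mut prev = initial;
    for &ins in instructions {
        prev = link(prev, ins);
        trace.push(prev);
    }
    trace
}

// Verifier side: every entry must follow from the one before it.
fn verify_trace(initial: Digest, instructions: &[u8], trace: &[Digest]) -> bool {
    if trace.len() != instructions.len() {
        return false;
    }
    let mut prev = initial;
    for (&ins, &claimed) in instructions.iter().zip(trace) {
        if link(prev, ins) != claimed {
            return false; // the chain is broken at this step
        }
        prev = claimed;
    }
    true
}
```

Altering any single entry invalidates that link and the one after it, which is what makes the chain cheaper to audit than full re-execution.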
Generating Proofs
from std::debug use { ExecutionProofBuilder }

// Proofs are built incrementally: capture the function hash and (optionally)
// enable trace mode up front, record the args hash and any intermediate
// state hashes during execution, then finalize with the result hash.
let mut builder = ExecutionProofBuilder::new(function_hash).with_trace()
builder.set_args_hash(args_hash)
// ... runtime calls `builder.record_trace_step(state_hash)` at each
// instruction boundary while the function executes ...
let proof = builder.finalize(result_hash)

print("function: " + proof.function_hash.to_hex())
print("result: " + proof.result_hash.to_hex())
print("trace entries: " + proof.trace.map(|t| t.len()).unwrap_or(0).to_string())
Distributed Garbage Collection
In a distributed deployment, blobs may be cached across multiple VM instances. Garbage collection must coordinate across all nodes to avoid collecting a blob that some remote VM still needs.
Protocol
- Each VM periodically reports its active blob set: the set of content hashes present in its call frames and function table.
- A global coordinator collects these reports and computes the union of all active sets.
- Any blob in the distributed store that is not in the union is eligible for collection.
- Blobs that are explicitly pinned (e.g., marked as entrypoints or cached for fast startup) maintain a reference count and are exempt from collection until unpinned.
Reference Counting for Pinned Blobs
struct BlobRefCount {
    /// The blob hash.
    hash: FunctionHash,

    /// Number of active pins (caches, entrypoint registrations, etc.).
    pin_count: u32,

    /// Number of VMs currently referencing this blob in a call frame.
    active_frame_count: u32,
}
A blob is eligible for collection only when both pin_count and active_frame_count are zero and no VM has reported it in its active set.
Safety
The coordinator uses a two-phase protocol:
- Mark phase: collect active sets from all VMs. Any VM that fails to report within the timeout is assumed to reference all blobs (conservative).
- Sweep phase: blobs not in the union and with zero reference counts are deleted from the distributed store.
This ensures that a temporarily unreachable VM never loses blobs it depends on.
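The two phases and the eligibility rule can be condensed into one coordinator-side function. This is a sketch under assumptions, not the coordinator's code: sweep_candidates and Report are hypothetical names, hashes are shortened to u64, a timed-out VM is modeled as None, and the hash field of BlobRefCount is folded into the map key.

```rust
use std::collections::{HashMap, HashSet};

type FunctionHash = u64; // shortened stand-in for a content hash

// One VM's answer during the mark phase; `None` models a report that timed out.
type Report = Option<HashSet<FunctionHash>>;

struct BlobRefCount {
    pin_count: u32,
    active_frame_count: u32,
}

// Returns the blobs that the sweep phase may delete, in ascending hash order.
fn sweep_candidates(
    store: &HashMap<FunctionHash, BlobRefCount>,
    reports: &[Report],
) -> Vec<FunctionHash> {
    // Mark phase: union of every VM's active set.
    let mut live: HashSet<FunctionHash> = HashSet::new();
    for report in reports {
        match report {
            Some(active) => live.extend(active.iter().copied()),
            // A silent VM is assumed to reference all blobs: collect nothing.
            None => return Vec::new(),
        }
    }
    // Sweep phase: unreported blobs with zero pins and zero active frames.
    let mut dead: Vec<FunctionHash> = store
        .iter()
        .filter(|(hash, rc)| {
            rc.pin_count == 0 && rc.active_frame_count == 0 && !live.contains(*hash)
        })
        .map(|(hash, _)| *hash)
        .collect();
    dead.sort_unstable();
    dead
}
```

Returning an empty list when any report is missing trades collection throughput for safety: a round with an unreachable VM simply frees nothing.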
See Also
- Content-Addressed Bytecode: the foundational architecture that enables these tools
- Resumability & Distributed Computing: checkpointing and resume workflows
- JIT Compilation: how the JIT interacts with blob caching and prefetch