
Module Distribution & Signatures

Shape’s module distribution system is built on top of the content-addressed bytecode architecture. Modules are identified by content hashes, distributed as bundles of deduplicated blobs, and optionally signed with Ed25519 keys to establish authorship and trust.

This chapter covers the manifest format, blob storage, bundle packaging, signature verification, and the CLI commands that tie it all together.

Every distributable module carries a ModuleManifest that describes its contents, exports, permissions, and optional signature.

pub struct ModuleManifest {
    /// Human-readable module name (e.g. "mathlib::linalg")
    pub name: String,
    /// Semver version string (e.g. "1.2.0")
    pub version: String,
    /// Exported function names mapped to their content hashes.
    pub exports: HashMap<String, [u8; 32]>,
    /// Type schema names mapped to their content hashes.
    pub type_schemas: HashMap<String, [u8; 32]>,
    /// Permissions required by this module, encoded as a permission bitmask.
    pub required_permission_bits: u64,
    /// Export hash → transitive list of all dependency blob hashes.
    /// Computed at bundle time by walking each export's dependency graph.
    pub dependency_closure: HashMap<[u8; 32], Vec<[u8; 32]>>,
    /// SHA-256 of this manifest (excluding manifest_hash and signature).
    pub manifest_hash: [u8; 32],
    /// Optional cryptographic signature.
    pub signature: Option<ModuleSignature>,
}

The required_permission_bits field encodes the transitive set of permissions this module (and its dependencies) require as a u64 bitmask — one bit per Permission variant.
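As a minimal sketch of the one-bit-per-variant encoding, the following uses a hypothetical `Permission` enum (the real variants are not listed in this chapter) together with two illustrative helpers, `to_bits` and `is_satisfied`, that are assumptions rather than part of the documented API.

```rust
/// Hypothetical permission set; the real `Permission` enum is not shown here.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(u8)]
pub enum Permission {
    FileRead = 0,
    FileWrite = 1,
    Network = 2,
}

impl Permission {
    /// The single bit this variant occupies in the mask.
    pub const fn bit(self) -> u64 {
        1u64 << (self as u32)
    }
}

/// Fold a list of permissions into one `u64` mask (illustrative helper).
pub fn to_bits(perms: &[Permission]) -> u64 {
    perms.iter().fold(0, |mask, p| mask | p.bit())
}

/// True if every bit set in `required` is also set in `granted`.
pub fn is_satisfied(required: u64, granted: u64) -> bool {
    required & !granted == 0
}
```

Because the mask is transitive, a host could compare a module's `required_permission_bits` against its granted mask once at load time, without walking the dependency graph.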

A compiled module might produce the following manifest (hashes abbreviated):

ModuleManifest {
    name: "mathlib::linalg",
    version: "0.3.1",
    exports: {
        "dot_product" → ab3f...7e,
        "cross_product" → c812...4a,
        "mat_multiply" → 9f01...d3,
    },
    type_schemas: {
        "Vector3" → 1a2b...8c,
        "Matrix4" → 7d3e...f1,
    },
    required_permission_bits: 0,
    manifest_hash: e4a7...92,
    signature: Some(ModuleSignature { ... }),
}

When a consumer imports from this module, the loader resolves the export name to its FunctionHash via the manifest’s exports map. The actual function bytecode is then fetched from the blob store using that hash.

from mathlib::linalg use { dot_product }
→ manifest.exports["dot_product"] → ab3f...7e
→ blob_store.get(ab3f...7e) → FunctionBlob { ... }

Because exports are identified by content hash, two modules that export structurally identical functions will resolve to the same blob. The loader never needs to compare bytecode directly — hash equality is sufficient.

The manifest's dependency_closure map records, for each export, the full transitive set of blob hashes that the export depends on. It is computed at bundle time by walking the dependency graph of each exported function.

dependency_closure: {
    ab3f...7e (dot_product)   → [1c4a...b2, 9e0f...31],   // helper functions
    c812...4a (cross_product) → [1c4a...b2],              // shares one dep
    9f01...d3 (mat_multiply)  → [ab3f...7e, 1c4a...b2, 9e0f...31, f7a3...e8],
}

The dependency closure is included in the manifest hash computation (sorted by key for determinism), so it is covered by signature verification. It serves two purposes:

  1. Bundle loading: When load_bundle loads a module from a .shapec file, it preloads all blobs in the dependency closure alongside the export blobs. This ensures transitive dependencies are available before any function executes, avoiding lazy-fetch stalls.

  2. Remote blob fetching: When the loader fetches missing blobs from a remote blob store, it walks the closure to fetch all transitive dependencies in a single batch rather than discovering them one at a time during execution.
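The closure computation can be sketched as a graph walk. The function name `dependency_closure`, the `direct_deps` map of direct references, and the choice to return a sorted list are assumptions made for illustration; the source only states that the closure is computed by walking each export's dependency graph.

```rust
use std::collections::{BTreeSet, HashMap};

type Hash = [u8; 32];

/// Collect every blob hash reachable from `export`, excluding `export` itself.
/// `direct_deps` (hypothetical) maps a blob hash to the hashes it references directly.
pub fn dependency_closure(export: Hash, direct_deps: &HashMap<Hash, Vec<Hash>>) -> Vec<Hash> {
    let mut seen: BTreeSet<Hash> = BTreeSet::new();
    let mut stack = vec![export];
    while let Some(current) = stack.pop() {
        for dep in direct_deps.get(&current).into_iter().flatten() {
            // `insert` returns false for hashes already visited, which also
            // stops the walk from looping on cyclic references.
            if *dep != export && seen.insert(*dep) {
                stack.push(*dep);
            }
        }
    }
    // A BTreeSet iterates in sorted order, so the output is deterministic.
    seen.into_iter().collect()
}
```

A loader holding this list can request every missing hash in one batch instead of discovering dependencies one call at a time.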

Function blobs and type schemas are stored in a content-addressed blob store. The store is defined by a trait, with two built-in implementations.

pub trait BlobStore: Send + Sync {
    fn get(&self, hash: &[u8; 32]) -> Option<Vec<u8>>;
    fn put(&self, hash: [u8; 32], data: Vec<u8>) -> bool;
    fn contains(&self, hash: &[u8; 32]) -> bool;
}

  • get returns the blob bytes for a given hash, or None if absent.
  • put inserts a blob. Returns true if the blob was newly inserted, false if it already existed.
  • contains checks for existence without fetching the data.

MemoryBlobStore is an in-memory store backed by a lock-guarded HashMap<[u8; 32], Vec<u8>>. It is used for tests, ephemeral pipelines, and short-lived processes where persistence is unnecessary.

let store = MemoryBlobStore::new();
store.put(hash, blob_bytes);
assert!(store.contains(&hash));
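To make the trait contract concrete, here is a minimal sketch of an in-memory implementation. The type name `SketchMemoryStore` and the use of `RwLock` are assumptions; the source only says the real store wraps a lock-guarded `HashMap`.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

pub trait BlobStore: Send + Sync {
    fn get(&self, hash: &[u8; 32]) -> Option<Vec<u8>>;
    fn put(&self, hash: [u8; 32], data: Vec<u8>) -> bool;
    fn contains(&self, hash: &[u8; 32]) -> bool;
}

/// Illustrative stand-in for `MemoryBlobStore` (lock type is an assumption).
#[derive(Default)]
pub struct SketchMemoryStore {
    blobs: RwLock<HashMap<[u8; 32], Vec<u8>>>,
}

impl BlobStore for SketchMemoryStore {
    fn get(&self, hash: &[u8; 32]) -> Option<Vec<u8>> {
        self.blobs.read().unwrap().get(hash).cloned()
    }

    fn put(&self, hash: [u8; 32], data: Vec<u8>) -> bool {
        let mut blobs = self.blobs.write().unwrap();
        if blobs.contains_key(&hash) {
            // Content-addressed: an existing entry already holds these bytes.
            return false;
        }
        blobs.insert(hash, data);
        true
    }

    fn contains(&self, hash: &[u8; 32]) -> bool {
        self.blobs.read().unwrap().contains_key(hash)
    }
}
```

Taking `&self` in `put` is what forces interior mutability here: the trait requires `Send + Sync`, so the map has to sit behind a lock.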

FsBlobStore is a filesystem-backed store that uses a git-style two-level directory layout under ~/.shape/blobs/:

~/.shape/blobs/
  ab/
    3f...7e.blob
  c8/
    12...4a.blob
  9f/
    01...d3.blob

The first two hex characters of the hash form the directory name. This keeps any single directory from accumulating too many entries. The remaining hex characters (the hash with the two-character prefix stripped) form the filename, with a .blob extension. A hash abcd12...ef is stored at <root>/ab/cd12...ef.blob.
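The path rule can be sketched in a few lines. The function name `blob_path` is hypothetical, and lowercase hex encoding is an assumption based on the hashes shown in this chapter.

```rust
use std::path::{Path, PathBuf};

/// Map a 32-byte content hash to `<root>/<first 2 hex chars>/<remaining 62>.blob`.
pub fn blob_path(root: &Path, hash: &[u8; 32]) -> PathBuf {
    // 32 bytes encode to 64 lowercase hex characters.
    let hex: String = hash.iter().map(|b| format!("{:02x}", b)).collect();
    let (dir, file) = hex.split_at(2);
    root.join(dir).join(format!("{}.blob", file))
}
```

With two hex characters per directory name, blobs spread over at most 256 subdirectories.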

FsBlobStore is the default store for installed modules and cached compilation output.

The module loader is extended to support content-addressed modules alongside traditional source and bytecode modules.

pub enum ModuleCode {
    Source(Arc<str>),
    Compiled(Arc<[u8]>),
    Both {
        source: Arc<str>,
        compiled: Arc<[u8]>,
    },
    ContentAddressed {
        /// Serialized `ModuleManifest` (MessagePack).
        manifest_bytes: Arc<[u8]>,
        /// Pre-fetched blob cache: content hash → raw blob bytes.
        /// Blobs not in this map are fetched from the ambient blob store.
        blob_cache: Arc<HashMap<[u8; 32], Vec<u8>>>,
    },
}

When the loader encounters a ContentAddressed module:

  1. It decodes manifest_bytes into a ModuleManifest to discover exports and type schemas.
  2. For each export, it resolves the content hash from manifest.exports.
  3. It looks up the corresponding FunctionBlob in blob_cache.
  4. If the blob is missing from the pre-fetched cache, it fetches the blob from the ambient blob store and caches it locally.
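The lookup order in the steps above can be sketched as follows. This is an illustration under stated assumptions: `resolve_export` is a hypothetical name, the ambient store is modelled as a closure, and the cache is a plain mutable `HashMap` rather than the `Arc`-wrapped map in `ModuleCode`.

```rust
use std::collections::HashMap;

type Hash = [u8; 32];

/// Resolve an export name to blob bytes: manifest lookup, then the
/// pre-fetched cache, then the ambient store (with local caching).
pub fn resolve_export(
    name: &str,
    exports: &HashMap<String, Hash>,
    blob_cache: &mut HashMap<Hash, Vec<u8>>,
    ambient_get: impl Fn(&Hash) -> Option<Vec<u8>>,
) -> Option<Vec<u8>> {
    // Step 2: the export name resolves to a content hash.
    let hash = *exports.get(name)?;
    // Step 3: try the pre-fetched cache first.
    if let Some(bytes) = blob_cache.get(&hash) {
        return Some(bytes.clone());
    }
    // Step 4: fall back to the ambient store and remember the result.
    let bytes = ambient_get(&hash)?;
    blob_cache.insert(hash, bytes.clone());
    Some(bytes)
}
```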

Because blobs are identified by content hash, the same function used in two different modules is stored exactly once. If mathlib::linalg and physics::mechanics both use an identical dot_product implementation, the blob store contains a single copy. Both manifests point to the same hash.

mathlib::linalg    exports "dot_product" → ab3f...7e ─┐
                                                      ├─→ one blob in store
physics::mechanics exports "dot_product" → ab3f...7e ─┘

This deduplication is automatic and requires no coordination between module authors.

The .shapec bundle format is extended to carry content-addressed blobs alongside the existing metadata.

pub struct PackageBundle {
    // Existing v1 fields
    pub metadata: BundleMetadata,
    pub modules: Vec<BundledModule>,
    pub dependencies: HashMap<String, String>,
    // Content-addressed fields (added in v2)
    pub blob_store: HashMap<[u8; 32], Vec<u8>>,
    pub manifests: Vec<ModuleManifest>,
}

  • blob_store contains the raw content-addressed blob data, keyed by hash.
  • manifests contains the module manifests, each with export maps pointing into blob_store.

The struct carries additional #[serde(default)] fields not shown here (native dependency scopes, extracted documentation); v1 readers and writers ignore them.

When a bundle contains multiple modules, blobs are deduplicated across all of them. If three modules share the same utility function, its blob appears once in the bundle’s blob_store.
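Cross-module deduplication falls out of keying blobs by hash. The sketch below merges several per-package blob maps into one. The function `merge_blob_stores` is a hypothetical helper, not the bundler's actual code.

```rust
use std::collections::HashMap;

type Hash = [u8; 32];

/// Merge per-package blob maps into one store.
/// Returns the merged store and the number of duplicate blobs skipped.
pub fn merge_blob_stores(
    stores: Vec<HashMap<Hash, Vec<u8>>>,
) -> (HashMap<Hash, Vec<u8>>, usize) {
    let mut merged: HashMap<Hash, Vec<u8>> = HashMap::new();
    let mut duplicates = 0;
    for store in stores {
        for (hash, bytes) in store {
            if merged.contains_key(&hash) {
                // Same hash means same content, so the second copy is dropped.
                duplicates += 1;
            } else {
                merged.insert(hash, bytes);
            }
        }
    }
    (merged, duplicates)
}
```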

The current bundle FORMAT_VERSION is 3. The minimum loadable version is 1, so the loader still accepts v1 bundles (which lack the blob_store and manifests fields):

  • v1 readers ignore the blob_store and manifests fields (MessagePack skips unknown keys).
  • v2 and later readers detect the version and load content-addressed data when present.
  • If a bundle contains no content-addressed modules, the blob_store and manifests fields are empty, and the bundle is functionally identical to v1.

Offset  Size      Content
0       8 bytes   Magic: SHAPEPKG
8       4 bytes   Format version: 3 (little-endian u32)
12      variable  MessagePack-encoded payload
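A reader for the header layout above might look like the following sketch. The function name `split_header`, the constant `MIN_FORMAT_VERSION`, and the choice to reject versions newer than 3 are assumptions; decoding the MessagePack payload is left out.

```rust
pub const MAGIC: &[u8; 8] = b"SHAPEPKG";
pub const FORMAT_VERSION: u32 = 3;
/// Hypothetical name for the minimum loadable version (1, per the text).
pub const MIN_FORMAT_VERSION: u32 = 1;

/// Validate the 12-byte header and return (version, payload bytes).
pub fn split_header(bytes: &[u8]) -> Result<(u32, &[u8]), String> {
    if bytes.len() < 12 {
        return Err("file too short for a bundle header".to_string());
    }
    if bytes[..8] != MAGIC[..] {
        return Err("bad magic, not a .shapec bundle".to_string());
    }
    // Bytes 8..12 hold the format version as a little-endian u32.
    let version = u32::from_le_bytes([bytes[8], bytes[9], bytes[10], bytes[11]]);
    if !(MIN_FORMAT_VERSION..=FORMAT_VERSION).contains(&version) {
        return Err(format!("unsupported bundle version {}", version));
    }
    // Everything after the header is the MessagePack-encoded payload.
    Ok((version, &bytes[12..]))
}
```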

Module manifests can be signed with Ed25519 keys using the ed25519-dalek crate. Signatures bind an author’s identity to a specific manifest hash, establishing provenance and integrity.

pub struct ModuleSignatureData {
    /// Ed25519 public key of the signer
    pub author_key: [u8; 32],
    /// Ed25519 signature bytes (64 bytes). Carried as `Vec<u8>` because
    /// `serde` does not support `[u8; 64]` out of the box.
    pub signature: Vec<u8>,
    /// Unix timestamp (seconds since epoch) when the signature was created
    pub signed_at: u64,
}

The signature field always carries exactly 64 bytes — an Ed25519 signature — but the on-the-wire representation is a length-prefixed byte vector so the manifest can round-trip through serde without a custom visitor. ModuleSignatureData::verify reconstructs the fixed-size signature from the vector before invoking ed25519_dalek.
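The vector-to-array step can be illustrated with the standard library alone. The helper name `signature_array` is hypothetical; the resulting `[u8; 64]` is what a verifier would then hand to the Ed25519 library.

```rust
use std::convert::TryFrom;

/// Rebuild the fixed-size signature from its wire form.
/// Returns `None` when the vector is not exactly 64 bytes long.
pub fn signature_array(bytes: &[u8]) -> Option<[u8; 64]> {
    <[u8; 64]>::try_from(bytes).ok()
}
```

Rejecting any other length up front means a truncated or padded signature never reaches the cryptographic check.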

// Sign a manifest
let sig = ModuleSignatureData::sign(manifest.manifest_hash, &signing_key);
// Verify a signature
let valid = sig.verify(manifest.manifest_hash);
// Returns true if the signature is valid for the given hash and author_key

The sign method takes the manifest hash and an Ed25519 SigningKey, produces the signature bytes, and records the current Unix timestamp. The verify method reconstructs the VerifyingKey from author_key and checks the signature against the provided hash.

The Keychain manages a set of trusted author keys. It determines whether a module signed by a given key should be accepted.

pub struct Keychain {
    trusted: HashMap<[u8; 32], TrustedAuthor>,
    require_signatures: bool,
}

pub struct TrustedAuthor {
    pub name: String,
    pub public_key: [u8; 32],
    pub trust_level: TrustLevel,
}

pub enum TrustLevel {
    /// Trusted for all modules.
    Full,
    /// Trusted only for modules whose names match one of the listed prefixes.
    Scoped(Vec<String>),
    /// Trusted only for a single specific manifest hash.
    Pinned([u8; 32]),
}

The Keychain is keyed by public key, and its require_signatures flag decides whether unsigned modules are rejected. Trust levels provide granular control:

  • Full trusts everything signed by the author. Suitable for first-party or well-known publishers.
  • Scoped(Vec<String>) trusts only modules whose names start with one of the listed prefixes. Limits exposure if a key is compromised.
  • Pinned([u8; 32]) trusts only a single specific manifest hash. The most restrictive level — a new version of the same module requires explicitly updating the pin.

When the module loader resolves a content-addressed artifact and a keychain is configured, it calls Keychain::verify_module(module_name, manifest_hash, signature), which returns a VerifyResult:

pub enum VerifyResult {
    /// Signature is valid and the author is trusted for this module.
    Trusted,
    /// No signature present and signatures are not required.
    Unsigned,
    /// Verification failed for the given reason.
    Rejected(String),
}

The verification logic is:

  1. No signature present. Return Unsigned if the keychain’s require_signatures flag is false; otherwise return Rejected("module is unsigned and signatures are required").
  2. Signature present. Call signature.verify(manifest_hash) (the Ed25519 cryptographic check). If it fails, return Rejected("invalid signature").
  3. Trust check. Look up signature.author_key in the keychain via is_trusted(public_key, module_name, manifest_hash):
    • Full: accepted.
    • Scoped(prefixes): accepted only if module_name starts with one of the prefixes.
    • Pinned(hash): accepted only if the pinned hash equals manifest_hash.
     If the key is absent or the trust level rejects the module, return Rejected("author key ... is not trusted ...").
  4. Result. All checks passed — return Trusted.

If verify_module returns Rejected, the loader fails the load with a ShapeError::ModuleError whose message embeds the rejection reason. A Trusted or Unsigned result allows the load to proceed.

Module load request
  → resolve manifest, check integrity hash
  → keychain configured?
      yes → verify_module(name, manifest_hash, signature)
              Rejected(reason) → ShapeError::ModuleError
              Unsigned         → load module ✓
              Trusted          → load module ✓
      no  → load module ✓ (no signature verification)
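The decision flow can be sketched with simplified types. Several parts are assumptions made to keep the example self-contained: `SketchSignature` replaces the real signature type, its `crypto_ok` flag stands in for the Ed25519 check, the keychain maps keys straight to trust levels, and the wording of the final rejection message is illustrative because the source elides it.

```rust
use std::collections::HashMap;

pub enum TrustLevel {
    Full,
    Scoped(Vec<String>),
    Pinned([u8; 32]),
}

#[derive(Debug, PartialEq)]
pub enum VerifyResult {
    Trusted,
    Unsigned,
    Rejected(String),
}

/// Stand-in signature: `crypto_ok` replaces the real Ed25519 verification.
pub struct SketchSignature {
    pub author_key: [u8; 32],
    pub crypto_ok: bool,
}

pub struct Keychain {
    pub trusted: HashMap<[u8; 32], TrustLevel>,
    pub require_signatures: bool,
}

impl Keychain {
    pub fn is_trusted(&self, key: &[u8; 32], module_name: &str, manifest_hash: &[u8; 32]) -> bool {
        match self.trusted.get(key) {
            Some(TrustLevel::Full) => true,
            Some(TrustLevel::Scoped(prefixes)) => {
                prefixes.iter().any(|p| module_name.starts_with(p.as_str()))
            }
            Some(TrustLevel::Pinned(pinned)) => pinned == manifest_hash,
            None => false,
        }
    }

    pub fn verify_module(
        &self,
        module_name: &str,
        manifest_hash: &[u8; 32],
        signature: Option<&SketchSignature>,
    ) -> VerifyResult {
        // Step 1: unsigned modules pass only when signatures are optional.
        let sig = match signature {
            Some(sig) => sig,
            None if self.require_signatures => {
                return VerifyResult::Rejected(
                    "module is unsigned and signatures are required".into(),
                )
            }
            None => return VerifyResult::Unsigned,
        };
        // Step 2: cryptographic check (stubbed out in this sketch).
        if !sig.crypto_ok {
            return VerifyResult::Rejected("invalid signature".into());
        }
        // Step 3: the key must be trusted for this module and hash.
        if !self.is_trusted(&sig.author_key, module_name, manifest_hash) {
            // Illustrative wording; the real message is elided in the text.
            return VerifyResult::Rejected("author key is not trusted for this module".into());
        }
        // Step 4: all checks passed.
        VerifyResult::Trusted
    }
}
```

Checking the signature before consulting the trust table means a forged signature that names a trusted key is rejected as invalid before the trust lookup runs.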

The keychain is installed on the module loader via set_keychain. The Keychain::new(require_signatures) constructor takes the policy flag directly:

// `true` rejects unsigned modules; `false` allows them.
let mut keychain = Keychain::new(true);
keychain.add_trusted(TrustedAuthor {
    name: "alice".into(),
    public_key: alice_pubkey,
    trust_level: TrustLevel::Full,
});
// Install the keychain on the module loader.
module_loader.set_keychain(keychain);

Once set, every content-addressed module the loader resolves goes through the verification flow above before loading.

Shape provides CLI commands for key management, signing, and verification.

shape sign --key ~/.shape/keys/mykey.ed25519 mylib-1.0.0.shapec

Signs the module manifest inside the bundle with the provided Ed25519 private key. The signature is written back into the bundle file.

shape verify mylib-1.0.0.shapec

Checks the module signature against the local keychain. Prints the signer name, trust level, and verification result.

$ shape verify mylib-1.0.0.shapec
Module: mathlib::linalg v0.3.1
Signer: alice (Full trust)
Signed at: 2025-12-01T14:30:00Z
Signature: VALID ✓

shape keys generate --output ~/.shape/keys/mykey.ed25519

Generates an Ed25519 keypair. The private key is written to the specified path. The public key is written to the same path with a .pub extension.

shape keys trust --key <pubkey-hex> --author "alice"

Adds the public key to the local keychain with Full trust level by default. Use --scope to restrict trust to specific module names, or --pin to restrict it to a single manifest hash:

# Trust only specific modules from this author
shape keys trust --key <pubkey-hex> --author "bob" --scope "mathlib::*"
# Pin trust to a specific manifest hash
shape keys trust --key <pubkey-hex> --author "charlie" --pin e4a7...92

The blob-level cache (BlobCache) accelerates compilation and execution by caching content-addressed blobs in memory and, optionally, on disk, while a companion JitCodeCache holds JIT-compiled native code in memory.

The BlobCache has an in-memory layer and an optional disk layer. When constructed with BlobCache::with_disk(root), blobs persist under a two-level directory layout (the default root is ~/.shape/cache/blobs/):

~/.shape/cache/blobs/
  ab/
    3f...7e.blob
  c8/
    12...4a.blob

As with FsBlobStore, the first two hex characters of the hash form the directory name and the remaining characters form the .blob filename.

During compilation, the compiler checks the cache for each function before compiling:

for each function F in module:
    hash = content_hash(F)
    if blob_cache.has_blob(hash):
        skip compilation → use cached blob
    else:
        compile F → blob
        blob_cache.put_blob(blob)

This means unchanged functions are never recompiled, even across full rebuilds. The cache is keyed purely on content — source file timestamps and paths are irrelevant.
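The pseudocode above translates into Rust roughly as follows. This is a sketch under loud assumptions: `DefaultHasher` stands in for the real content hash and is not cryptographic, the cache is a plain `HashMap` instead of `BlobCache`, and "compilation" just copies the source bytes.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Stand-in content hash. NOT cryptographic; the real system would use
/// a 32-byte content hash instead.
fn content_hash(function_body: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    function_body.hash(&mut hasher);
    hasher.finish()
}

/// Compile each distinct function body at most once.
/// Returns how many functions actually had to be compiled.
pub fn build(functions: &[&str], cache: &mut HashMap<u64, Vec<u8>>) -> usize {
    let mut compiled = 0;
    for body in functions {
        let hash = content_hash(body);
        if cache.contains_key(&hash) {
            continue; // cache hit: reuse the existing blob
        }
        // Placeholder for real compilation into a function blob.
        let blob = body.as_bytes().to_vec();
        cache.insert(hash, blob);
        compiled += 1;
    }
    compiled
}
```

Calling `build` a second time with the same cache compiles nothing, which is the "full rebuild without recompilation" behaviour described above.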

During execution, JIT-compiled native code is cached in JitCodeCache, an in-memory-only map (Cranelift’s JITModule cannot be serialized):

pub struct JitCodeCache {
    entries: HashMap<FunctionHash, *const u8>,
}

When the VM encounters a function that should be JIT-compiled:

  1. Check the JitCodeCache for the function’s content hash.
  2. If cached, jump directly to the native code pointer.
  3. If not cached, JIT-compile the function blob, store the code pointer, and execute.

Because the cache is keyed by FunctionHash, the same function appearing in different modules or different programs reuses the same JIT output within a session.

The following walkthrough demonstrates the end-to-end flow from compilation to consumption.

$ cd mathlib/
$ shape build
Building package 'mathlib' v0.3.1...
Compiled 5 functions into 4 unique blobs (1 deduplicated)
Built mathlib-0.3.1.shapec (FORMAT_VERSION 3)

The compiler produces a manifest listing all exports by hash, and stores each function blob in the bundle’s embedded blob store.

$ shape sign --key ~/.shape/keys/alice.ed25519 mathlib-0.3.1.shapec
Signed mathlib::linalg v0.3.1
Manifest hash: e4a7...92
Author: alice (ed25519:7f2a...b1)
Timestamp: 2025-12-01T14:30:00Z

The signature is written into the bundle alongside the manifest.

$ shape bundle --output stdlib-0.3.1.shapec \
mathlib-0.3.1.shapec \
physics-0.3.1.shapec \
utils-0.3.1.shapec
Bundling 3 packages...
Deduplicated 12 blobs across packages (saved 3 duplicates)
Built stdlib-0.3.1.shapec (FORMAT_VERSION 3)

Blobs shared across packages are stored once in the combined bundle.

Distribute the .shapec file through any channel: file copy, HTTP, registry, git LFS. The bundle is self-contained: loading needs no external dependencies, and verification needs only the recipient's local keychain.

$ shape verify stdlib-0.3.1.shapec
Module: mathlib::linalg v0.3.1
Signer: alice (Full trust)
Signature: VALID
Module: physics::mechanics v0.3.1
Signer: alice (Full trust)
Signature: VALID
Module: utils::core v0.3.1
Signer: bob (Scoped: ["utils::*"])
Signature: VALID

shape.toml:

[dependencies]
stdlib = { path = "./stdlib-0.3.1.shapec" }

from stdlib::mathlib::linalg use { dot_product, mat_multiply }
let result = dot_product(a, b)

The loader reads the manifest, resolves dot_product to its hash, fetches the blob from the bundle’s store, and caches it locally in ~/.shape/cache/blobs/.

On subsequent runs — or in any other project that uses the same functions — the blobs are already in the local cache:

Loading mathlib::linalg...
dot_product (ab3f...7e) → cache hit
mat_multiply (9f01...d3) → cache hit
cross_product (c812...4a) → cache hit
Loaded 3 functions (0 compiled, 3 from cache)

The content-addressed model ensures that cache hits are based purely on function identity. The same function is compiled once, cached once, and reused everywhere — regardless of which module or bundle it originally came from.