Native C Interop
This chapter is the single normative source for Shape native C interop.
If another chapter needs native interop details, it should link here instead of redefining syntax, marshalling, or lock behavior.
Core Syntax
Use explicit language syntax, not annotation indirection:

    extern C fn cos(x: number) -> number from "libm" as "cos";
    extern C fn getenv(name: cstring) -> cstring? from "libc" as "getenv";
    extern C fn hash_bytes(data: Vec<byte>) -> u64 from "libhash";

    type C QuoteC {
      bid: f64,
      ask: f64,
    }

    extern C fn quote_mid(q: cview<QuoteC>) -> f64 from "libquote";
    extern C fn quote_fill(q: cmut<QuoteC>, v: f64) -> void from "libquote";

- extern C fn ... from "<alias-or-library>" [as "<symbol>"]; declares a native call.
- type C defines ABI layout-compatible data types.
- cview<T> and cmut<T> are pointer-backed zero-copy view carriers for type C.
- Vec<T> in native signatures compiles to cslice<T> ({ data, len } descriptor by value).
- Explicit slice annotations are CSlice<T> and CMutSlice<T> (ABI names: cslice<T>, cmut_slice<T>).
- CMutSlice<T> parameters are mutable-reference parameters in Shape call semantics.

The extern ABI may be written quoted (extern "C" fn ...) or unquoted
(extern C fn ...); both forms parse. The book uses the unquoted form
consistently.
Dependency Resolution
Native libraries are declared in [native-dependencies] in a project
shape.toml or script frontmatter. Shape uses one shared resolver for compile
time and runtime; the CLI, compiler, and VM do not resolve native aliases
independently.

    [native-dependencies]
    libm = "libm.so.6"
    duckdb = { provider = "system", version = "1.1.3", linux = "libduckdb.so", macos = "libduckdb.dylib", windows = "duckdb.dll" }
    fastmath = { provider = "path", path = "./native/libfastmath.so" }
    myrt = { provider = "vendored", cache_key = "myrt-2.0.1", targets = { "linux-x86_64" = "vendor/linux-x86_64/libmyrt.so", "linux-aarch64" = "vendor/linux-aarch64/libmyrt.so", "macos-aarch64" = "vendor/macos-aarch64/libmyrt.dylib" } }
    openssl = { provider = "system", targets = { "linux-x86_64" = { value = "libssl.so.3" }, "macos-aarch64" = { value = "libssl.3.dylib" } } }

- Shorthand string form means system unless the value looks like a filesystem path, then it means path.
- Detailed tables may set provider = "system" | "path" | "vendored". If omitted, Shape infers path for path-like values and system otherwise.
- targets keys use normalized host IDs os-arch[-env]. Current host IDs usually look like linux-x86_64, linux-aarch64, macos-aarch64, or windows-x86_64.
- Target selection order is exact os-arch-env, then os-arch, then os. After that, legacy linux/macos/windows fields are still accepted as a compatibility fallback. The current OS field is preferred first, then path, and only then the remaining legacy OS fields.
- system loads by soname or by an explicit path-like value.
- path resolves relative to the declaring package root when not absolute.
- vendored resolves relative to the declaring package root, then copies the selected library into Shape’s native cache before loading it.
- .shapec bundles and shape publish currently embed only native dependency metadata, not the referenced .so/.dylib/.dll files themselves. path and vendored entries therefore require those native files to be distributed separately on disk. Registry-published packages do not yet carry native assets inside the uploaded bundle.
- Resolution is transitive: the root project, dependency packages, and embedded .shapec bundle scopes all contribute [native-dependencies].
- Native alias resolution is package-scoped. Two active packages may both declare the same alias name, and each extern C fn ... from "alias" resolves against the package that declared that foreign binding.

The current duckdb package uses provider = "system" with
libduckdb.so / libduckdb.dylib / duckdb.dll, so the host must already
provide DuckDB unless the package switches to path or vendored.
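
The target selection order described in this section can be sketched in Python as follows. This is only an illustration of the documented order, not Shape's resolver: the function name select_library, the dictionary model of a [native-dependencies] entry, and the "gnu" environment suffix in the usage lines are assumptions made for the example.

```python
from typing import Optional, Union

# Legacy per-OS fields named in this chapter.
LEGACY_OS_FIELDS = ("linux", "macos", "windows")


def _target_value(value: Union[str, dict]) -> Optional[str]:
    """Accept both target forms shown above: "lib.so" or { value = "lib.so" }."""
    return value.get("value") if isinstance(value, dict) else value


def select_library(entry: dict, host_id: str) -> Optional[str]:
    """Pick a library for a host ID of the form os-arch or os-arch-env."""
    parts = host_id.split("-")
    os_name = parts[0]
    os_arch = "-".join(parts[:2])
    targets = entry.get("targets", {})

    # Steps 1-3: exact os-arch-env, then os-arch, then os.
    for key in (host_id, os_arch, os_name):
        if key in targets:
            return _target_value(targets[key])

    # Step 4: legacy fallback. Current OS field, then path, then other OS fields.
    others = [name for name in LEGACY_OS_FIELDS if name != os_name]
    for field in [os_name, "path", *others]:
        if field in entry:
            return entry[field]
    return None


duckdb = {"provider": "system", "version": "1.1.3", "linux": "libduckdb.so",
          "macos": "libduckdb.dylib", "windows": "duckdb.dll"}
openssl = {"provider": "system", "targets": {
    "linux-x86_64": {"value": "libssl.so.3"},
    "macos-aarch64": {"value": "libssl.3.dylib"}}}

print(select_library(duckdb, "macos-aarch64"))      # libduckdb.dylib
print(select_library(openssl, "linux-x86_64-gnu"))  # libssl.so.3 (falls back to os-arch)
```

The sketch returns None when nothing matches; how Shape reports that case is not modelled here.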
Lock Behavior
- Project mode stores native artifacts in shape.lock.
- Standalone scripts store native artifacts in <script>.lock.
- In [build.external].mode = "update", Shape probes native libraries for the current host target and writes or refreshes matching lock artifacts. If multiple native prerequisites are broken, Shape reports all of them in one preflight error grouped by package.
- In [build.external].mode = "frozen", Shape requires a matching artifact for the current target, provider, and fingerprint.
- system entries that use loader names instead of paths should declare version; frozen mode errors if a system alias has no declared version.
- One committed lockfile may contain multiple native artifacts for the same package@version::alias across different targets or fingerprints. Shape does not replay one foreign absolute path on every machine.
- Standalone scripts currently resolve native dependencies in update mode and refresh <script>.lock when they run.
- There is no separate native lockfile; native artifacts stay in the normal lock pipeline.
Out Parameters
Many C APIs use out-parameters (T* out): the caller allocates a slot and the
function writes its result into it. Shape supports this directly with the out
keyword on extern C fn parameters.
out keyword syntax
Mark pointer-typed parameters with out to let the compiler handle cell
allocation, the C call, value readback, and cleanup automatically:

    extern C fn duckdb_open(path: string, out out_db: ptr) -> i32 from "duckdb" as "duckdb_open";
    extern C fn duckdb_connect(db: ptr, out out_conn: ptr) -> i32 from "duckdb" as "duckdb_connect";

Callers supply only the non-out arguments. The return value is an array
containing the original return value followed by each out parameter’s value:

    let [status, db] = duckdb_open("pricing_data.duckdb")
    let [_, conn] = duckdb_connect(db)

Rules:
- out parameters must have type ptr.
- out cannot combine with const or &.
- out parameters cannot have default values.
- The generated stub allocates a pointer cell, passes its address to the C function, reads back the value, and frees the cell.
Manual out-parameter pattern
For cases that need finer control, use pointer cells directly:

    from std::core::native use { ptr_new_cell, ptr_free_cell, ptr_read, ptr_write }

    extern C fn duckdb_open(path: string, out_db: ptr) -> i32 from "duckdb" as "duckdb_open";

    let cell = ptr_new_cell()
    ptr_write(cell, 0)
    duckdb_open("pricing_data.duckdb", cell)
    let db = ptr_read(cell)
    ptr_free_cell(cell)

The pointer cell is a pointer-sized memory slot allocated by the runtime.
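
The same allocate, call, read back sequence can be illustrated outside Shape with Python's ctypes. This is a sketch of the general out-parameter pattern, not of Shape's generated stub: fake_open is a hypothetical stand-in for a C function of shape int32_t f(const char *path, void **out), and the handle value 0x1234 is made up.

```python
import ctypes

# Hypothetical C signature: int32_t fake_open(const char *path, void **out_db);
FAKE_OPEN = ctypes.CFUNCTYPE(
    ctypes.c_int32, ctypes.c_char_p, ctypes.POINTER(ctypes.c_void_p)
)


def _fake_open_impl(path, out_db):
    out_db[0] = 0x1234  # the callee writes its result into the caller's slot
    return 0            # status code


# Wrap the Python function so it is called through a real C function pointer.
fake_open = FAKE_OPEN(_fake_open_impl)


def call_with_out(path: bytes):
    """Return [status, out_value], like the array an out parameter produces."""
    cell = ctypes.c_void_p(0)                     # allocate and zero a pointer cell
    status = fake_open(path, ctypes.byref(cell))  # pass the cell's address
    return [status, cell.value]                   # read back; cell is released on return


status, db = call_with_out(b"pricing_data.duckdb")
```

Here the cell lives in a ctypes object and is reclaimed by Python, whereas the manual Shape pattern frees it explicitly with ptr_free_cell.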
Marshalling Matrix
| Shape type | C ABI representation | Notes |
|---|---|---|
| i8 / char | int8_t | char aliases i8 at C boundary |
| u8 / byte | uint8_t | byte is alias to u8 |
| i16 | int16_t | Range checked |
| u16 | uint16_t | Range checked |
| i32 | int32_t | Range checked |
| u32 | uint32_t | Range checked |
| i64 / int | int64_t | int aliases i64 |
| u64 | uint64_t | Range checked |
| isize | intptr_t | Pointer-width signed integer |
| usize | uintptr_t | Pointer-width unsigned integer |
| ptr | void* | Opaque pointer carrier |
| f32 | float | Preserved width |
| f64 / number | double | number aliases f64 |
| bool | _Bool / uint8_t ABI equivalent | Normalized to boolean |
| cstring | const char* | Null return is runtime error |
| cstring? | const char* nullable | Marshals to Option<string> |
| callback(fn(...)->R) | Function pointer | Call-scoped callback trampoline |
| Vec<T> | cslice<T> ({ data: T*, len: usize }) | T must be scalar/pointer/cstring family |
| CSlice<T> | cslice<T> ({ data: T*, len: usize }) | Explicit read-only slice ABI |
| CMutSlice<T> | cmut_slice<T> ({ data: T*, len: usize }) | Explicit mutable slice ABI |
| cview<T> | const T* | T must be type C |
| cmut<T> | T* | Mutable view; write allowed |
| void | void | Marshals to () |
Rules:
- All narrowing conversions are explicit and range checked.
- cstring rejects interior NUL on outbound conversion.
- Use cstring? when null pointers are valid.
- Width-aware scalars are preserved across VM/wire/native paths.
- Vec<T>/cslice<T> marshalling is copy-in.
- cmut_slice<T> marshalling is copy-in/copy-out with mandatory writeback into the referenced Shape variable after the call returns.
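
The range-check and interior-NUL rules can be illustrated with a small Python sketch. The helper names to_c_int and to_cstring, the exception types, and the use of UTF-8 for the outbound string are assumptions made for the example; the chapter does not specify them.

```python
# Inclusive ranges for the fixed-width integer types in the matrix above.
INT_RANGES = {
    "i8": (-2**7, 2**7 - 1),    "u8": (0, 2**8 - 1),
    "i16": (-2**15, 2**15 - 1), "u16": (0, 2**16 - 1),
    "i32": (-2**31, 2**31 - 1), "u32": (0, 2**32 - 1),
    "i64": (-2**63, 2**63 - 1), "u64": (0, 2**64 - 1),
}


def to_c_int(value, ctype: str) -> int:
    """Range-checked conversion of an integer-domain value to a C integer type."""
    if isinstance(value, bool):
        value = int(value)  # false -> 0, true -> 1
    if not isinstance(value, int):
        raise TypeError(f"{value!r} is not an integer-domain value")
    lo, hi = INT_RANGES[ctype]
    if not lo <= value <= hi:
        raise OverflowError(f"{value} does not fit in {ctype}")
    return value


def to_cstring(text: str) -> bytes:
    """Outbound cstring conversion: reject interior NUL, then NUL-terminate."""
    if "\0" in text:
        raise ValueError("cstring rejects interior NUL")
    return text.encode("utf-8") + b"\0"  # UTF-8 is an assumption of this sketch


print(to_c_int(255, "u8"))  # 255
print(to_cstring("libm"))   # b'libm\x00'
```

With these helpers, to_c_int(256, "u8") raises OverflowError instead of wrapping, and to_c_int(1.5, "i32") raises TypeError, which mirrors the rule that floating-point values are not implicitly narrowed.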
Coercion Rules (Normative)
Shape -> C (arguments and type C field writes)
| Target C type | Accepted Shape values | Implicit coercions rejected |
|---|---|---|
| i8, i16, i32, i64, isize | exact integer-domain values (int, fitting native ints) and bool (false -> 0, true -> 1) | floating-point values |
| u8, u16, u32, u64, usize, ptr | exact non-negative integer-domain values and bool (0/1) | negative integers, floating-point values |
| f32, f64 | number, f32, and inline language int (i64) | native-width i64/u64/isize/usize/ptr to float without explicit cast |
| bool | bool, or exact integer-domain value (0 => false, non-zero => true) | floating-point values |
| cstring | string without interior NUL | None, non-string values |
| cstring? | None, Some(string), or bare string | non-string/non-option values |
| cslice<T> | Vec<T> | non-array values, nested/object element types |
| cmut_slice<T> | mutable reference to Vec<T> | non-reference arguments, nested/object element types |
| cview<T> | matching native view cview<T> | object copies or mismatched layout names |
| cmut<T> | matching native view cmut<T> | read-only view when mutable required |
Notes:
- Narrowing integer conversions are range checked.
- For lossy numeric changes, use explicit casts (as number, as i64, etc.).
- type C field writes use the same coercion table as call arguments.
- cmut_slice<T> supports writeback for all supported slice element types (i8/u8/i16/u16/i32/i64/u32/u64/isize/usize/f32/f64/bool/ptr/cstring/cstring?).
- Name-based calls to extern C functions auto-insert the required reference for cmut_slice<T> params; dynamic call-value sites must pass a reference explicitly.
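
The difference between copy-in (cslice<T>) and copy-in/copy-out with writeback (cmut_slice<T>) can be sketched with Python's ctypes arrays. The function names and the i32 element type are chosen for illustration, and double_in_place is a hypothetical stand-in for a native function that mutates its buffer.

```python
import ctypes


def double_in_place(data, length):
    """Stand-in for a native function that mutates the buffer it is given."""
    for i in range(length):
        data[i] *= 2


def call_with_cslice(values, native_fn):
    """cslice<T>: copy-in only. Native writes never reach the caller's list."""
    buf = (ctypes.c_int32 * len(values))(*values)  # copy into native memory
    native_fn(buf, len(values))


def call_with_cmut_slice(values, native_fn):
    """cmut_slice<T>: copy-in, call, then mandatory writeback."""
    buf = (ctypes.c_int32 * len(values))(*values)  # copy into native memory
    native_fn(buf, len(values))
    values[:] = list(buf)                          # write back into the caller's list


a = [1, 2, 3]
call_with_cslice(a, double_in_place)
print(a)  # [1, 2, 3]
call_with_cmut_slice(a, double_in_place)
print(a)  # [2, 4, 6]
```

The in-place slice assignment plays the role of the reference that name-based calls insert automatically: the callee's changes land in the caller's own variable rather than in a returned copy.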
C -> Shape (return values and type C field reads)
| C ABI type | Shape value |
|---|---|
| i8/u8/i16/u16/i32/u32 | width-aware native scalar |
| i64 | native i64 scalar (not auto-coerced to number) |
| u64 | native u64 scalar (not auto-coerced to number) |
| isize/usize | native pointer-width scalar |
| f32 | native f32 scalar |
| f64 | number |
| ptr / callback pointer | ptr |
| cstring | string; null pointer is runtime error |
| cstring? | Option<string> (None on null) |
| cslice<T> / cmut_slice<T> | Vec<T> (copied from native memory) |
| cview<T> / cmut<T> | zero-copy native view wrapper; null pointer is runtime error |
Numeric Mixing Rule (VM + JIT contract)
- Integer-domain operations stay in integer domain when both operands are integer-domain values.
- Mixed integer/float operations are allowed only when integer values are losslessly representable as f64 (|value| <= 2^53).
- Otherwise execution requires an explicit cast and fails with a runtime type/coercion error if not cast.
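
The 2^53 bound comes from the 53-bit significand of an IEEE 754 double: every integer up to that magnitude has an exact f64 representation, and 2^53 + 1 is the first one that does not. A minimal Python sketch of the rule follows; mixed_add and the TypeError are illustrative names, not the VM's actual error type.

```python
MAX_EXACT_F64_INT = 2**53


def is_lossless_f64(value: int) -> bool:
    """True when the integer passes the |value| <= 2^53 check."""
    return abs(value) <= MAX_EXACT_F64_INT


def mixed_add(i: int, x: float) -> float:
    """Mixed integer/float addition under the numeric mixing rule."""
    if not is_lossless_f64(i):
        raise TypeError("explicit cast required: integer is not exact as f64")
    return float(i) + x


print(mixed_add(2, 0.5))                    # 2.5
print(float(2**53 + 1) == float(2**53))     # True: the +1 is lost in f64
```

The second print shows why the rule exists: without the check, two distinct integers would silently become the same float.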
Wire/Transport Rule
- Wire payloads preserve width-aware integer variants (i64, u64, isize, usize, ptr) as typed values.
- These variants are intentionally not auto-coerced by generic as_number helpers.
type C Layout Contract
type C uses C ABI layout semantics:
- deterministic field order
- computed size, align, and per-field offset
- pointer-based field access via cview<T>/cmut<T> without object materialization
Example:

    type C StatC {
      size: i64,
      mode: u32,
    }

    extern C fn stat(path: cstring, buf: cmut<StatC>) -> i32 from "libc";

type C is the production path for zero-copy native struct interop.
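
To make "computed size, align, and per-field offset" concrete, the sketch below applies the usual C layout rule to the StatC fields and cross-checks the result against Python's ctypes, which follows the platform C ABI. The c_layout helper is an illustration of the general rule and is not taken from the Shape compiler.

```python
import ctypes


class StatC(ctypes.Structure):
    # Mirrors: type C StatC { size: i64, mode: u32 }
    _fields_ = [("size", ctypes.c_int64), ("mode", ctypes.c_uint32)]


def c_layout(fields):
    """fields: [(name, size, align)] in declaration order.

    Returns (offsets, total_size, struct_align): each field is aligned up to
    its own alignment, and the struct is padded to its largest alignment.
    """
    offsets, offset, struct_align = {}, 0, 1
    for name, size, align in fields:
        offset = (offset + align - 1) // align * align  # round up to alignment
        offsets[name] = offset
        offset += size
        struct_align = max(struct_align, align)
    total = (offset + struct_align - 1) // struct_align * struct_align
    return offsets, total, struct_align


fields = [(name, ctypes.sizeof(t), ctypes.alignment(t)) for name, t in StatC._fields_]
offsets, total, align = c_layout(fields)
print(offsets, total, align)
```

On common 64-bit ABIs this yields offsets 0 and 8, size 16, and alignment 8; the trailing 4 bytes are padding so that arrays of the struct keep the i64 field aligned.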
Arrow C -> Table<T>
Core builtins provide Arrow C import:

    from std::core::native use { table_from_arrow_c_typed }

    let result: Result<Table<MyRow>, AnyError> = table_from_arrow_c_typed(schema_ptr, array_ptr, "MyRow")

- schema_ptr/array_ptr must point to Arrow C Data Interface ArrowSchema/ArrowArray.
- type_name must match a registered Shape row type.
- Schema mismatches return Result::Err (strict contract).

The builtin is defined in crates/shape-runtime/stdlib-src/core/native.shape
(public wrapper table_from_arrow_c_typed) backed by the intrinsic
__native_table_from_arrow_c_typed in
crates/shape-runtime/stdlib-src/core/intrinsics.shape.
A full package-level DuckDB proof-of-concept is included at:
shape/examples/packages/duckdb-native.
Auto Conversion Contract
The compiler auto-registers conversion pairs for compatible object/layout names:
- type C FooC <-> type Foo
- type C CFoo <-> type Foo
- type C FooLayout <-> type Foo

For compatible fields/types, conversion traits are generated in both
directions (From/Into + TryInto wrappers as needed). If a Shape
type Foo matches more than one type C companion by name (e.g. both
FooC and FooLayout exist), the compiler reports a hard error and
requires the project to pick one canonical companion name.
The pairing logic lives in
crates/shape-vm/src/compiler/statements.rs::maybe_generate_native_type_conversions
(invoked from register_native_struct_layout), with the name candidates in
object_type_name_for_native_layout and
native_layout_name_candidates_for_object in the same file.
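
The naming rule and its ambiguity check can be sketched in Python. This models only the three name patterns and the hard error described above; the function names are illustrative and are not the Rust functions listed here.

```python
def layout_name_candidates(object_name: str) -> list:
    """The three type C companion names the chapter lists for type Foo."""
    return [f"{object_name}C", f"C{object_name}", f"{object_name}Layout"]


def find_companion(object_name: str, declared_type_c: set):
    """Return the single matching type C name, None if there is none.

    More than one match is treated as a hard error, as the contract requires.
    """
    matches = [n for n in layout_name_candidates(object_name) if n in declared_type_c]
    if len(matches) > 1:
        raise ValueError(
            f"type {object_name} matches multiple type C companions: {matches}"
        )
    return matches[0] if matches else None


print(find_companion("Quote", {"QuoteC", "StatC"}))  # QuoteC
```

Calling find_companion("Foo", {"FooC", "FooLayout"}) raises, which corresponds to the case where the project must pick one canonical companion name.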
Callback Contract
Callbacks are declared inline:

    extern C fn qsort_i32(
      base: ptr,
      count: usize,
      elem_size: usize,
      cmp: (a: ptr, b: ptr) => i32
    ) -> void from "libc" as "qsort";

- Passing a Shape callable to callback(...) creates a call-scoped native trampoline.
- Callback argument/return types follow the same marshalling matrix.
- cstring/cstring?/cslice<_>/cmut_slice<_> callback return types are currently disallowed.
JIT Execution Contract
As of February 25, 2026:
- CallForeign is lowered in JIT (jit_call_foreign) and no longer hard-falls back to VM dispatch.
- Foreign functions are linked once per execution into a JIT foreign bridge state.
- Native extern C entries run through the shared native ABI invoker from JIT, including callback trampolines.
- Dynamic-language foreign entries (non-native ABI) still marshal through the shared runtime marshal/unmarshal path.
- Signature-specialized direct native lowering (eliminating the generic bridge call per known C signature) remains an optimization milestone.

This chapter remains the source of truth as JIT native lowering lands.
Lockfile And Frozen Mode
Native dependency resolution writes lock artifacts with:
- package identity (package_name, package_version, package_key)
- alias
- host
- provider
- load target
- fingerprint
- optional declared version/cache key
Artifact keys are namespaced as <package>@<version>::<alias> to keep transitive
package native dependencies collision-safe in one lockfile.
In build.external.mode = "frozen":
- unresolved/failed native probes are rejected
- missing native lock artifacts are rejected
- system aliases without declared versions are rejected
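
The artifact key format and the three frozen-mode rejections can be sketched as follows. The in-memory lock model (a dictionary from key to a list of artifacts), the field names on dep and probe, and the FrozenModeError type are assumptions made for illustration; the real lockfile schema is not shown in this chapter.

```python
class FrozenModeError(Exception):
    """Illustrative error type for frozen-mode rejections."""


def artifact_key(package: str, version: str, alias: str) -> str:
    """Namespaced key: <package>@<version>::<alias>."""
    return f"{package}@{version}::{alias}"


def check_frozen(lock: dict, dep: dict, host: str, probe):
    """Return the matching lock artifact, or raise for each rejected case.

    probe is None when the native probe failed, else {"fingerprint": ...}.
    """
    key = artifact_key(dep["package"], dep["package_version"], dep["alias"])
    if dep["provider"] == "system" and dep.get("version") is None:
        raise FrozenModeError(f"{key}: system alias has no declared version")
    if probe is None:
        raise FrozenModeError(f"{key}: native probe failed")
    wanted = (host, dep["provider"], probe["fingerprint"])
    # One key may hold several artifacts, one per target or fingerprint.
    for artifact in lock.get(key, []):
        if (artifact["host"], artifact["provider"], artifact["fingerprint"]) == wanted:
            return artifact
    raise FrozenModeError(f"{key}: no lock artifact for {host}")


lock = {"duckdb@0.1.0::duckdb": [
    {"host": "linux-x86_64", "provider": "system", "fingerprint": "abc"},
    {"host": "macos-aarch64", "provider": "system", "fingerprint": "def"},
]}
dep = {"package": "duckdb", "package_version": "0.1.0", "alias": "duckdb",
       "provider": "system", "version": "1.1.3"}
print(check_frozen(lock, dep, "macos-aarch64", {"fingerprint": "def"})["host"])
```

The package version 0.1.0 and the fingerprints in the usage lines are made-up values. Because the key includes package and version, two packages that both declare an alias named duckdb would still map to distinct lock entries.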
Cross-Platform Notes
- Linux: typically .so
- macOS: typically .dylib
- Windows: typically .dll

Always declare explicit per-platform entries when library names differ.