Ownership Deep Dive

This chapter is the full specification of Shape’s ownership, borrowing, and lifetime system. It covers every rule, the compiler internals, the error repair engine, and the concurrency model. For the practical introduction, see References and Borrowing.

Design Goals

Shape’s ownership system targets three properties simultaneously:

Rust-level safety — no dangling references, no data races, no use-after-move.
Python-level ergonomics — no explicit lifetime annotations ('a), no unsafe, no Pin/Unpin.
Perfect diagnostics — every error includes a concrete code fix verified by re-running the solver.

The result: 7 concepts total (let/let mut/var, &/&mut, Copy/Clone, smart move/clone, NLL borrow checking, Mutex/Atomic/Lazy, repair solver) compared to Rust’s 30+.

The Binding System

Three Keywords

Keyword	Mutability	Ownership	Storage class
`let`	Immutable	Unique (move by default)	`Direct` (scalars) / `UniqueHeap` (heap payloads, typed `Arc<T>`)
`let mut`	Mutable	Unique	`Direct` (scalars) / `UniqueHeap` (heap payloads, typed `Arc<T>`)
`var`	Mutable	Inferred	Compiler-chosen — `Direct` / `UniqueHeap` / `SharedCow` / `SharedAtomic` / `SharedAtomicMut`

let and let mut bindings are uniquely owned — there is exactly one owner at a time, and assignment transfers ownership (a move). Mutations on let mut happen directly with zero overhead because the single-owner invariant means no coordination is needed. The storage class follows existing rules: scalars (int, number, bool, …) live in a stack-resident Direct slot, and heap-resident values (string, arrays, structs) use a single-owner UniqueHeap pointer — refcounting is reached only when escape (closure capture, cross-task share, store-into-shared-container) genuinely requires it.

var is the smart-default binding: it defers the storage-class choice to the compiler. The compiler walks the binding’s use-graph and picks the lightest policy that proves safe, choosing among five classes:

Observed usage	Inferred class	Runtime shape
Bound, read; never mutated, never escapes	`Direct` (immutable)	Stack, no allocation — equivalent to `let`
Mutated within the owning scope; doesn’t escape	`Direct` (mutable)	Stack mutable — equivalent to `let mut`
Escapes (closure / store / return); immutable	`UniqueHeap`	Heap pointer; refcount only when sharing is genuine
Mutated and shared (single-thread only)	`SharedCow`	Refcounted copy-on-write
Read-shared across a thread / task boundary	`SharedAtomic`	Atomic-refcounted, no lock
Mutated and shared across a thread / task boundary	`SharedAtomicMut`	Atomic-refcounted with a lock

Inference is conservative: when usage can’t be proven tighter, the compiler picks the most permissive class that is still correct. A var binding never produces a borrow-checker error — it always finds a class that works, even if that class is heavy. The borrow checker’s strictness is reserved for let / let mut. The chosen class is always surfaced as an LSP inlay hint, so it is visible and can be refactored:

var counter = 0          // Direct (stack-mutable)
var config = parse()     // UniqueHeap
var shared = make_buf()  // SharedCow — captured by closure
var queue  = Channel()   // SharedAtomicMut — shared across a spawned task

When inference is wrong or undesirable, the class can be pinned explicitly:

var x: SharedCow Array<int> = make_buf()   // explicit class hint
var y: Direct int = 0                      // pin to the stack

All three keywords produce equivalent MIR control flow after inference, but carry different binding metadata (aliasability, mutation capability) that drives storage decisions and borrow checking.

Smart Move/Clone Inference

For let and let mut, the compiler uses liveness analysis to determine when a move occurs (last use) vs when an explicit clone is needed:

At each Assign(dest, source) where source is non-Copy:
- Source not live after → Move (zero cost).
- Source live after → requires explicit clone (compile error if omitted).
If T is not Clone and source is live after → compile error with fix suggestion.

For var, the compiler automatically infers move vs clone from liveness:

var data = load_big_dataset()
var copy = data        // data used below → auto-clone
process(copy)
var moved = data       // data never used after → move

Copy Semantics

Primitive types are implicitly Copy — no borrow tracking needed:

Type	Copy?	Notes
`int`	Yes	Typed slot, 8 bytes (i64)
`number`	Yes	IEEE 754 f64, 8 bytes
`bool`	Yes	Typed slot, 1-byte payload
`none`	Yes	Singleton tag
`string`	No	Heap-allocated, typed `Arc<String>`
Arrays	No	Heap-allocated, typed `Arc<TypedArrayData>`
TypedObjects	Depends	Copy if all fields are Copy (auto-derived)

Clone Trait

Types must implement Clone to be auto-cloned. Clone is auto-derived for types whose fields are all Clone. The clone keyword and .clone() method are equivalent:

let a = [1, 2, 3]
let b = clone a        // keyword form
let c = a.clone()      // method form — identical

Reference Semantics

Place Model

The borrow checker tracks places — paths to memory locations:

Place ::= Local(SlotId)
        | Field(Place, FieldIdx)    // x.a
        | Index(Place, Operand)     // x[i]
        | Deref(Place)              // *r

Disjoint field borrows are tracked independently — borrowing x.a does not conflict with borrowing x.b. Index borrowing is supported (&arr[0]). Index borrows (x[i] vs x[j]) conservatively conflict (index disjointness analysis is deferred to v2).

Borrow Kinds

Kind	Syntax	Permissions	Aliasing
Shared	`&x`	Read only	Multiple allowed
Exclusive	`&mut x`	Read + Write	Exactly one, no shared overlap

Borrow Scoping (Non-Lexical Lifetimes)

Borrows are scoped to regions over CFG points, not lexical blocks. A borrow’s region extends from creation to last use:

let mut x = [1, 2, 3]
let r = &x             // loan L0 starts at CFG point P1
print(r.len())         // last use of r at P2 — L0 ends here
x.push(4)              // P3: no active loans, OK

The Datafrog solver propagates loans along CFG edges and detects conflicts at each point.

First-Class References

References can be stored in local variables:

let x = 42
let r = &x              // r holds a shared reference to x
let val = r + 1         // reads through r via DerefLoad

References cannot:

Be stored in struct fields (fields are always owned).
Outlive the scope of the borrowed value.

References can:

Be stored in local containers if the borrow provably outlives the container (parameter borrows in local containers).
return &x for local variables is properly detected by the MIR solver (no hard-coded rejection — the solver proves the reference would escape the referent’s region).

// Returning a value through a reference is fine:
fn read_val(&x) {
    return x            // DerefLoad: returns the *value*, not the reference
}

// Returning the reference itself is not:
fn bad(&x) {
    return &x           // ERROR B0003: reference would escape function scope
}

Auto-Referencing

Method calls implicitly create temporary borrows:

x.push(1)     // → (&mut x).push(1)
x.len()       // → (&x).len()
read(x)       // → read(&x) if param is &T

This is the single most important ergonomic feature — without it, users would write (&mut x).push(1) everywhere.

The MIR

Shape compiles AST → MIR → bytecode. The MIR is where all borrow analysis happens.

Structure

BasicBlock {
    statements: [Statement]
    terminator: Terminator
}

Statement ::= Assign(Place, Rvalue)
            | Borrow(Place, BorrowKind)   // emits loan_issued_at fact
            | Drop(Place)
            | Nop

Terminator ::= Goto(BlockId)
             | SwitchInt(Operand, [(i64, BlockId)], BlockId)
             | Return
             | Call { func, args, dest, next: BlockId }

Each Borrow statement generates a loan_issued_at(loan_id, point) fact. CFG edges generate cfg_edge(p1, p2). Assignments and drops generate invalidates(point, loan).

Liveness Analysis

The MIR includes a liveness pass that determines, for each variable at each point, whether it’s live (used later) or dead. This drives:

Smart move/clone — dead after assignment → move.
Borrow end points — a borrow’s region ends at the last point where the loan is used.
Drop insertion — drops are emitted at scope exit in reverse declaration order.

The Datafrog Solver

The borrow checker is implemented as a monotone fixed-point computation using Datafrog, the same engine used internally by Polonius (Rust’s next-gen borrow checker).

Input Relations

These are populated by MIR lowering:

Relation	Meaning
`loan_issued_at(Loan, Point)`	Loan L was created at point P
`cfg_edge(Point, Point)`	Control flow from P1 to P2
`use_of(Place, Point)`	Place P is used at point Q
`invalidates(Point, Loan)`	Point P invalidates loan L (e.g., drop or reassignment)

Derived Relations

Datafrog iterates to a fixed point over these rules:

loan_live_at(L, P) :- loan_issued_at(L, P).
loan_live_at(L, Q) :- loan_live_at(L, P), cfg_edge(P, Q), !invalidates(Q, L).

error(P, L1, L2)   :- loan_live_at(L1, P), loan_live_at(L2, P),
                       conflicts(L1, L2).

Typical convergence: 2-4 iterations. Complexity: O(|facts| x |iterations|). Target: < 10ms for 10K-line programs.

Conflict Rules

Two loans conflict if:

Both borrow the same place (or overlapping places).
At least one is exclusive (&mut).

Disjoint field borrows never conflict:

let mut obj = { a: 1, b: 2 }
let ra = &mut obj.a     // borrows Place::Field(obj, 0)
let rb = &mut obj.b     // borrows Place::Field(obj, 1) — no conflict

The Repair Engine

When the solver finds a borrow error, the repair engine generates candidate fixes, re-runs Datafrog on modified MIR, and presents the first passing candidate as a code diff.

Repair Strategies (tried in order)

Strategy	Description
REORDER	Move the conflicting statement after the last use of the blocking loan
SCOPE	Wrap the first borrow + its uses in `{ }` to limit extent
CLONE	Suggest `clone x` instead of `&x` (requires T: Clone)
DOWNGRADE	Change `&mut` to `&` if only reads exist
EXTRACT	Suggest extracting into a helper function (fallback)

Each candidate is verified by re-running the Datafrog solver on the modified MIR (~microseconds). The first passing candidate becomes the suggestion.

Error Output Format

error[B0001]: cannot borrow `x` as mutable — shared reference still active

  5 │ let r = &x        // shared borrow created
    │         ──
  6 │ let m = &mut x    // ERROR: conflicts with shared borrow
    │         ^^^^^^^^
  7 │ print(r)          // shared borrow still needed here
    │       ─

  fix: move the exclusive borrow after the last use of the shared borrow
       │ let r = &x
       │ print(r)
       │ let m = &mut x    // OK: shared borrow ended

LSP Integration

The repair engine powers several LSP features:

Borrow windows: Visual highlighting of borrow regions in the editor.
Inlay hints: Show clone/move/&/&mut annotations on var bindings.
Code actions: One-click application of repair suggestions.
Hover: Shows borrow state at any point (“x is borrowed shared by r until line 7”).

The LSP, compiler, and diagnostics all read from the same BorrowAnalysis result — there is zero duplicate logic.

No Explicit Lifetimes

Shape eliminates 'a syntax entirely. References are scoped borrows that can’t be stored in structs. The solver infers everything.

What Rust needs lifetimes for vs Shape’s answer:

Rust Pattern	Shape Answer
Structs storing references	Fields are always owned (typed slot or typed `Arc<T>`). No refs in structs.
Functions returning refs from multiple inputs	Elision: return borrows from `self` or single `&` input. Ambiguous → error.
Generic lifetime bounds	Not needed (no raw pointers, no unsafe).

// Works — solver infers return borrows from &self
fn first(&self) -> &int { return self.items[0] }

// Error — ambiguous
fn pick(&a: [int], &b: [int]) -> &int { return a[0] }
// error: ambiguous borrow — return could reference `a` or `b`
//   fix: return an owned value instead

Concurrency Model

Shape doesn’t need Send/Sync because of three properties:

No interior mutability — &T is always truly read-only.
Arc everywhere — all heap values use atomic reference counting.
Structured concurrency — tasks can’t outlive parent scope.

Three Rules

What crosses task boundary	Structured task	Detached task	Why safe
Owned value (move/clone)	Always allowed	Always allowed	No aliasing
`&T` (shared ref)	Allowed	Forbidden	Structured: scope-bounded; Detached: could outlive referent
`&mut T` (exclusive ref)	Forbidden	Forbidden	Would create aliased mutation

let mut data = [1, 2, 3]

// OK: owned value (clone) — at HEAD, hoist the clone to a let-binding
// because the `clone` keyword does not yet parse inside a call-argument
// position (`process(clone data)` is rejected; the rewritten form below
// works):
let cloned = clone data
async let r1 = process(cloned)

// OK: shared reference in structured child task
async let r2 = read_only(&data)

// ERROR: shared ref cannot cross detached task boundary
// detach r2b = read_only(&data)

// ERROR C0001: exclusive reference cannot cross task boundary
async let r3 = modify(&mut data)

Concurrency Primitives

For shared mutable state, Shape provides three compiler-builtin types. These are the only types with interior mutability — no user-definable interior mutability exists.

Mutex<T>

let m = Mutex(0)
m.lock()               // acquire, returns inner value
m.try_lock()           // non-blocking, returns value or none
m.set(42)              // acquire and replace

Atomic<T>

Integer atomics with sequential consistency:

let a = Atomic(0)
a.load()               // read current value
a.store(42)            // write new value
a.fetch_add(1)         // add and return previous
a.fetch_sub(1)         // subtract and return previous
a.compare_exchange(expected, new)  // CAS

Lazy<T>

Thread-safe lazy initialization:

let cache = Lazy(|| expensive_computation())
cache.get()            // initializes on first call, cached thereafter
cache.is_initialized() // check without triggering init

What Shape Eliminates

By not being a systems language, Shape eliminates ~20 Rust concepts:

Eliminated	Why Rust needs it	Why Shape doesn’t
Explicit lifetimes (`'a`)	Zero-copy, self-ref structs	Fields always owned.
`unsafe` keyword	Granular unsafe blocks	FFI module boundary is the trust boundary.
Pin/Unpin	Self-referential async futures	Higher-level async model.
PhantomData	Variance with raw pointers	No raw pointers.
Sized/?Sized	DSTs, trait objects	Uniform typed-slot layout (8 bytes).
Box<T>	Explicit heap allocation	Heap payloads use typed `Arc<T>`.
Cow<T>	Manual clone-on-write	Smart move/clone handles this.
Multiple string types	FFI, zero-copy	One string type.
AsRef/Borrow/ToOwned	Owned/borrowed conversion	Smart move/clone + auto-ref.
Deref/DerefMut	Smart pointer ergonomics	No custom smart pointers.
Send/Sync traits	Interior mutability, Rc vs Arc	Three rules.
ManuallyDrop/MaybeUninit	Low-level memory	Automatic RAII.

FFI Safety

Shape has no unsafe keyword. Instead, dangerous operations (ptr, alloc, deref) are restricted to FFI modules — files containing extern C fn declarations. Normal Shape code cannot access raw pointers. The file boundary is the trust boundary.

// ffi_bindings.shape — FFI module (dangerous ops allowed)
extern C fn malloc(size: int) -> ptr from "libc";
extern C fn free(p: ptr) -> void from "libc";

// main.shape — normal module (dangerous ops NOT allowed)
// let p = alloc(8)    // ERROR: ptr/alloc only allowed in FFI modules

Implementation Architecture

Single Source of Truth

The borrow checker, type inference, move/clone analysis, and repair engine are implemented once. The compiler, LSP, and diagnostics consume the same BorrowAnalysis result:

                    ┌──────────────┐
   AST ──────────►  │  MIR Lowering │
                    └──────┬───────┘
                           │
                    ┌──────▼───────┐
                    │   Datafrog   │
                    │ Borrow Solver│
                    └──────┬───────┘
                           │
              ┌────────────┼────────────┐
              │            │            │
       ┌──────▼──┐  ┌──────▼──┐  ┌──────▼──────┐
       │Compiler  │  │  LSP    │  │ Diagnostics  │
       │(codegen) │  │(hints/  │  │ (errors/     │
       │          │  │ hover)  │  │  repairs)    │
       └──────────┘  └─────────┘  └─────────────┘

If the compiler says “this is a move,” the LSP hint says “this is a move.” They read the same data. There is no scenario where they disagree.

Implementation Phases

The system was built in 7 phases:

Phase	Name	Key Deliverable
0	Surface Syntax	`let mut`, `&mut`, `var`, `move`/`clone` keywords
1	Binding Resolver	Immutability enforcement, shadowing rules
2	MIR Lowering	Place/BasicBlock/Terminator IR, CFG, liveness
3	Datafrog Solver	NLL borrow checking with loan propagation
3.5	Smart Move/Clone	Liveness-based inference, Clone trait checks
4	First-Class Refs	`let r = &x`, scoped ref return, auto-ref
5	Repair Engine	Candidate generation, Datafrog re-runs, code diffs
6	Concurrency	Three rules, Mutex/Atomic/Lazy builtins