v2026.04
exAI Agentic OS
§ 01 / 06
Orchestrator · long-running runs
Typed DAG · Firecracker microVMs · 60s checkpoints
runtime · v2026.04 · production
The runtime that owns long-running agent work

30-hour runs.
Replayable byte-identical.

Hand the Orchestrator a monorepo migration, a framework upgrade, a compliance sweep. A typed DAG of 26 specialized agents runs inside isolated microVMs and shows its work at every hop.

Pause it. Resume it next week. Replay it byte-identical from the ledger. Every checkpoint a receipt, every gate a signature, every retry a reason.

run · monorepo-to-turbo · #4827
live
DAG status · 4 of 7 layers
INTAKE ✓ 8s
DEPS · SCHEMA ✓ 14m
TESTS · vm-fan ● 4h 12m
MIGRATE · MERGE queued
Max run 30h
Checkpoint 60s
Nodes 14/22
Fig. 01 · live run · #4827 · checkpoint 12s ago
§ 02 / 06
Anatomy of a run

Intake to merge.
One graph.

The Orchestrator plans the work as a typed DAG, fans out to a warm microVM pool, snapshots state every 60 seconds, parks at human gates, and resumes byte-identical from the ledger. Below: a real run, edited for readability.

Run · monorepo-to-turbo · #4827
running · 18h 42m · 14 / 22 nodes · checkpoint 12s ago · budget 48% used
INTAKE ✓ 8s · DEPS ✓ 2m14s · SCHEMA ✓ 8m02s · BUILD ✓ 14m · LINT ✓ 22m · TESTS ● 4h12m · MIGRATE pending · E2E pending · MERGE HITL gate
Stream · vm-08 → vm-12
18:41:02 Orchestrator fan-out · 12 shards · pool warm vm-08…vm-19
18:41:18 vm-08 jest packages/core · 1,142 passing · cache 92%
18:42:04 vm-11 playwright apps/web · 94/96 · 2 retry · stable
18:42:11 Checkpoint snapshot 18h42m · sha:a8f2c1…e1 · 1.4 GiB
18:42:29 vm-12 jest packages/billing · 412 passing · 0 flake
18:43:01 Gate · MERGE parking · awaiting approval · slack pinged
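The fan-out in the stream above depends on knowing which nodes are independent. A minimal sketch of that planning step, assuming dependencies are given as a plain map from node name to upstream names: `planLayers` and the dependency shape are hypothetical illustrations, not the Orchestrator's scheduler or API.

```typescript
// Hypothetical helper: group DAG nodes into layers. Every node in a layer
// depends only on earlier layers, so a layer can be dispatched in parallel.
function planLayers(deps: Record<string, string[]>): string[][] {
  const remaining = new Set<string>(Object.keys(deps));
  const done = new Set<string>();
  const layers: string[][] = [];

  while (remaining.size > 0) {
    // A node is ready once all of its dependencies sit in earlier layers.
    const layer = Array.from(remaining).filter((name) =>
      (deps[name] ?? []).every((d) => done.has(d)),
    );
    if (layer.length === 0) {
      // Nothing can make progress: the graph has a cycle or names a missing node.
      throw new Error("no runnable node left: cycle or unknown dependency");
    }
    layer.sort(); // stable, readable ordering inside a layer
    for (const name of layer) {
      remaining.delete(name);
      done.add(name);
    }
    layers.push(layer);
  }
  return layers;
}

// Illustrative dependency map; the real run's edges are not shown in the source.
const exampleLayers = planLayers({
  intake: [],
  deps: ["intake"],
  schema: ["intake"],
  tests: ["deps", "schema"],
  migrate: ["tests"],
});
console.log(exampleLayers);
```

Each inner array is one layer; a scheduler along these lines would hand a whole layer to the VM pool at once and wait for it before starting the next.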
§ 03 / 06
Engineering invariants

Five rules that
hold at hour 19.

Long-running agent work fails in predictable ways: untyped graphs that deadlock, lost state on restart, sequential dispatch, missing approval surfaces, runaway spend. The Orchestrator answers each one as a first-class invariant.

  • 01
    Typed DAG.
    Every agent declares inputs, outputs, and effects. The scheduler rejects impossible graphs at plan time — not at hour 19.
  • 02
    Durable checkpoints.
    State, VM disk, and the tool-call ledger snapshot to object storage every 60 seconds. Crash today, resume Thursday.
  • 03
    Parallel by default.
    Independent nodes fan out across a warm Firecracker pool. Wall-clock hours, not days.
  • 04
    Human-in-the-loop gates.
    Mark any node approval-required. Orchestrator parks, pings Slack with the diff, resumes — hours or weeks later.
  • 05
    Receipted budgets.
    Per-run token + compute ceilings with live receipts. Aborts cleanly before overshoot, rolls back to last gate.
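The checkpoint and replay rules above rest on a tool-call ledger. One way such a ledger could work is sketched below, assuming tool results are strings and arguments are JSON-serializable. `ToolLedger` and its methods are hypothetical names for illustration, not part of `@exai/orchestrator`.

```typescript
// Hypothetical tool-call ledger: record results on the live run, then serve
// the recorded results on replay instead of executing the tool again.
type LedgerEntry = { seq: number; tool: string; args: string; result: string };

class ToolLedger {
  private entries: LedgerEntry[];
  private cursor = 0;

  constructor(entries: LedgerEntry[] = []) {
    this.entries = entries.slice();
  }

  call(tool: string, args: unknown, execute: () => string): string {
    const key = JSON.stringify(args);
    const recorded = this.entries[this.cursor];
    if (recorded !== undefined) {
      // Replay path: the call must match what was recorded at this step.
      if (recorded.tool !== tool || recorded.args !== key) {
        throw new Error(`replay diverged at step ${this.cursor}`);
      }
      this.cursor++;
      return recorded.result;
    }
    // Live path: run the tool once and append the result to the ledger.
    const result = execute();
    this.entries.push({ seq: this.cursor, tool, args: key, result });
    this.cursor++;
    return result;
  }

  // What a checkpoint would persist alongside state and VM disk.
  snapshot(): LedgerEntry[] {
    return this.entries.slice();
  }
}
```

A resumed run would construct the ledger from the last snapshot and re-issue the same calls. This sketch only guarantees identical tool results; a fully byte-identical replay also needs the surrounding code to be deterministic.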
nodes/migrate-package.ts
TypeScript · strict
import { defineNode } from "@exai/orchestrator";
import { z } from "zod";

export const migratePackage = defineNode({
  name: "migrate-package",
  inputs: {
    repo: z.string(),
    package: z.string(),
    from: z.string(),
    to: z.string(),
  },
  outputs: {
    diff: z.string(),
    coverage: z.number(),
  },
  effects: ["fs", "net"],
  budget: { tokens: 2_000_000, compute_h: 2 },
  retry: { max: 3, backoff: "exp" },
  gate: "merge-approval",
  async run({ input, vm, log }) {
    const result = await vm.codemod(input.from, input.to);
    log.checkpoint("codemod-applied");
    return result;
  },
});
Fig. 03 · typed node declaration · ~24 LoC
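Rejecting impossible graphs at plan time comes down to checking declarations before anything runs. A minimal sketch follows, assuming each input names the upstream output it reads and types are compared as simple string tags. `NodeSpec`, `Wire`, and `validatePlan` are hypothetical; the real scheduler's checks are not described in this page.

```typescript
// Hypothetical plan-time check; illustrative names, not the @exai/orchestrator API.
// Assumes node names are unique within a plan.
type Wire = { from: string; output: string; type: string };
type NodeSpec = {
  name: string;
  inputs: Record<string, Wire>;    // input name -> upstream output it reads
  outputs: Record<string, string>; // output name -> type tag, e.g. "string"
};

function validatePlan(nodes: NodeSpec[]): string[] {
  const errors: string[] = [];
  const byName = new Map<string, NodeSpec>();
  nodes.forEach((n) => byName.set(n.name, n));
  const dependsOn = new Map<string, string[]>();

  // 1. Wiring: every input must read an existing output of the same type.
  for (const node of nodes) {
    const deps: string[] = [];
    for (const inputName of Object.keys(node.inputs)) {
      const wire = node.inputs[inputName] as Wire;
      const upstream = byName.get(wire.from);
      if (upstream === undefined) {
        errors.push(`${node.name}.${inputName}: unknown node "${wire.from}"`);
        continue;
      }
      if (!deps.includes(wire.from)) deps.push(wire.from);
      const actual: string | undefined = upstream.outputs[wire.output];
      if (actual === undefined) {
        errors.push(`${node.name}.${inputName}: "${wire.from}" has no output "${wire.output}"`);
      } else if (actual !== wire.type) {
        errors.push(`${node.name}.${inputName}: expected ${wire.type}, got ${actual}`);
      }
    }
    dependsOn.set(node.name, deps);
  }

  // 2. Cycles: repeatedly retire nodes whose dependencies are all retired.
  //    Anything left over can never start, so the graph would deadlock.
  const retired = new Set<string>();
  let progressed = true;
  while (progressed) {
    progressed = false;
    for (const node of nodes) {
      if (retired.has(node.name)) continue;
      const deps = dependsOn.get(node.name) ?? [];
      if (deps.every((d) => retired.has(d))) {
        retired.add(node.name);
        progressed = true;
      }
    }
  }
  if (retired.size < nodes.length) errors.push("graph contains a cycle");
  return errors;
}
```

An empty result means the plan is dispatchable under these two checks. A fuller version would presumably also compare declared `effects` against what each VM is allowed to do.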
§ 04 / 06
What platform teams point it at

Hand it the work
nobody wants overnight.

Four shapes of long-running work that the Orchestrator owns end-to-end. None of it requires a babysitter. All of it produces a PR, a binder, or a receipt at the other side.

01 · Monorepo migrations · unattended
0
packages avg

Turbo, Nx, pnpm, npm workspaces. P50 wall-clock 18h 42m, fully unattended, single PR per package.

02 · Framework upgrades · unattended
0h
P50 wall-clock

React 18 → 19, Next 14 → 15, Rails 7 → 8. Codemods + manual escapes + e2e per package boundary.

03 · Compliance sweeps · unattended
4×
per year

SOC 2 evidence quarterly, unattended. Pulls audit logs, runs control attestations, files the binder.

04 · Flake-hunting · unattended
1,000×
rerun · bisect

Re-run a flaky suite a thousand times, bisect to first failing commit, open the PR with the repro.
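The rerun-then-bisect loop can be sketched as a binary search over commits, where a commit counts as failing if the suite fails at least once within a fixed number of reruns. `bisectFirstFailing` and `failsWithinReruns` are hypothetical helpers, and the sketch assumes that once a commit fails, every later commit fails too.

```typescript
// Hypothetical helper: a commit "fails" if the suite fails in any of `reruns` runs.
// runSuite returns true when the suite passes.
function failsWithinReruns(runSuite: () => boolean, reruns: number): boolean {
  for (let i = 0; i < reruns; i++) {
    if (!runSuite()) return true;
  }
  return false;
}

// Binary search for the first failing commit. Returns commits.length if none fails.
function bisectFirstFailing<T>(commits: T[], fails: (commit: T) => boolean): number {
  let lo = 0;              // every commit before lo passes
  let hi = commits.length; // every commit at or after hi fails
  while (lo < hi) {
    const mid = lo + Math.floor((hi - lo) / 2);
    if (fails(commits[mid] as T)) {
      hi = mid;            // first failure is at mid or earlier
    } else {
      lo = mid + 1;        // first failure is after mid
    }
  }
  return lo;
}
```

The rerun count is the trade-off: a rare flake may not reproduce within the chosen number of reruns, in which case a bad commit is classified as good and the search lands too late. More reruns lower that risk at the cost of wall-clock time.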

§ 05 / 06
Budgets · receipts · clean abort

Spend has a ceiling.
Receipts ship hourly.

Every run is bounded. Per-node tokens, compute-hours, and model spend are declared up front. The Orchestrator emits a streaming receipt, aborts cleanly before overshoot, and rolls back to the last checkpoint with a signed reason.

The Orchestrator treats budget like a typed input. You declare ceilings on the run — total tokens, total compute-hours, total model spend — and the scheduler refuses to dispatch a node whose worst-case cost would breach the envelope.

Receipts stream hourly to your finance webhook with the actual tokens-by-model breakdown, the wall-clock per VM, and the per-node line item. When a cap is approached, the Orchestrator drains the work in flight, snapshots state, and aborts cleanly: no zombie shards, no orphaned VMs.

  • Pre-flight cost estimation per node, refused before dispatch.
  • Hourly streaming receipts to webhook · Slack · S3.
  • Hard ceiling triggers drain → snapshot → clean abort.
  • Resume from the last gate with a fresh budget on rerun.
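The pre-flight refusal in the list above can be sketched as a comparison of worst-case cost against what remains under the ceiling. The types and function names below are hypothetical, and each node's `actual` cost is supplied up front only to keep the example self-contained. A real run would measure it.

```typescript
// Hypothetical budget check; illustrative names, not the @exai/orchestrator API.
type Cost = { tokens: number; computeHours: number };
type PlannedNode = { name: string; worstCase: Cost; actual: Cost };
type RunResult = { dispatched: string[]; refused: string | null; spent: Cost };

// True if adding `extra` to `spent` stays within every ceiling.
function fits(spent: Cost, extra: Cost, ceiling: Cost): boolean {
  return (
    spent.tokens + extra.tokens <= ceiling.tokens &&
    spent.computeHours + extra.computeHours <= ceiling.computeHours
  );
}

function runWithinBudget(nodes: PlannedNode[], ceiling: Cost): RunResult {
  let spent: Cost = { tokens: 0, computeHours: 0 };
  const dispatched: string[] = [];
  for (const node of nodes) {
    // Pre-flight: refuse before dispatch if the worst case would breach the ceiling.
    if (!fits(spent, node.worstCase, ceiling)) {
      // A real runtime would drain in-flight work and snapshot state here.
      return { dispatched, refused: node.name, spent };
    }
    dispatched.push(node.name);
    // The receipt records what the node actually cost, not the estimate.
    spent = {
      tokens: spent.tokens + node.actual.tokens,
      computeHours: spent.computeHours + node.actual.computeHours,
    };
  }
  return { dispatched, refused: null, spent };
}

const result = runWithinBudget(
  [
    { name: "a", worstCase: { tokens: 1_500_000, computeHours: 1 }, actual: { tokens: 1_000_000, computeHours: 0.5 } },
    { name: "b", worstCase: { tokens: 1_500_000, computeHours: 1 }, actual: { tokens: 900_000, computeHours: 0.5 } },
  ],
  { tokens: 2_000_000, computeHours: 2 },
);
console.log(result.refused); // "b": 1.0M spent + 1.5M worst case exceeds the 2.0M ceiling
```

Checking the worst case rather than the expected cost is what lets the abort happen before overshoot instead of after it, at the price of sometimes refusing a node that would have fit.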
receipt · run #4827 · 18h 42m
cap $50.00 · 30h
Live spend ledger
Tokens · ceiling 20.00
42% used · ceiling enforced at 100%
Compute hours · ceiling 22.00
51% used · ceiling enforced at 100%
Model spend · ceiling 8.00
57% used · ceiling enforced at 100%
Tokens · by model
claude-opus-4.7 · 4.21M
claude-sonnet-4.5 · 3.04M
haiku-3.7 · 1.17M
VMs · wall-clock
vm-08…vm-19 · 11.18 h
checkpoints · 1,124
retries · 4
48% of $50.00 cap used
within ceiling
Fig. 05 · streaming receipt · run #4827 · signed · webhook delivered
§ 06 / 06
The Orchestrator · final cut

Hand it a 30-hour
problem. Sleep.

Typed DAG. Sixty-second checkpoints. Parallel by default. Human-in-the-loop gates. Receipted budgets. The runtime that owns long-running agent work — built for the platform engineers actually shipping migrations at scale.

SOC 2 Type II · ISO 27001 · HIPAA-ready · GDPR · DPF · PCI DSS 4.0 · BYOC · on-prem