§ 01 / 06

Orchestrator · long-running runs

Typed DAG · Firecracker microVMs · 60s checkpoints

runtime · v2026.04 · production

The runtime that owns long-running agent work

30-hour runs.
Replayable byte-identical.

Hand the Orchestrator a monorepo migration, a framework upgrade, a compliance sweep. A typed DAG of 26 specialized agents runs inside isolated microVMs and shows its work at every hop.

Pause it. Resume it next week. Replay it byte-identical from the ledger. Every checkpoint a receipt, every gate a signature, every retry a reason.

See a run end-to-end · Book a long-run pilot

run · monorepo-to-turbo · #4827

live

DAG status · 4 of 7 layers

INTAKE ✓ 8s

DEPS · SCHEMA ✓ 14m

TESTS · vm-fan ● 4h 12m

MIGRATE · MERGE · queued

Fig. 01 · live run · #4827 · checkpoint 12s ago

§ 02 / 06

Anatomy of a run

Intake to merge.
One graph.

The Orchestrator plans the work as a typed DAG, fans out to a warm microVM pool, snapshots state every 60 seconds, parks at human gates, and resumes byte-identical from the ledger. Below: a real run, edited for readability.

Run · monorepo-to-turbo · #4827

running · 18h 42m · 14 / 22 nodes · checkpoint 12s ago · budget 48% used

Stream · vm-08 → vm-12

18:41:02 Orchestrator fan-out · 12 shards · pool warm vm-08…vm-19

18:41:18 vm-08 jest packages/core · 1,142 passing · cache 92%

18:42:04 vm-11 playwright apps/web · 94/96 · 2 retry · stable

18:42:11 Checkpoint snapshot 18h42m · sha:a8f2c1…e1 · 1.4 GiB

18:42:29 vm-12 jest packages/billing · 412 passing · 0 flake

18:43:01 Gate · MERGE parking · awaiting approval · slack pinged
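One way to picture replay from the ledger, as a minimal sketch rather than the Orchestrator's actual mechanism: record every tool call's result on the live run, then serve the recorded results back in order on replay instead of executing anything. `ToolLedger` and `LedgerEntry` are hypothetical names, and the real ledger format is not described here.

```typescript
// Hypothetical types for illustration; not the Orchestrator's ledger format.
type LedgerEntry = { seq: number; tool: string; args: string; result: string };

class ToolLedger {
  private entries: LedgerEntry[];
  private replaying: boolean;
  private cursor = 0;

  constructor(entries: LedgerEntry[] = [], replaying = false) {
    this.entries = entries;
    this.replaying = replaying;
  }

  // Live mode: run the tool and record its result.
  // Replay mode: return the recorded result and never run the tool.
  call(tool: string, args: unknown, execute: () => string): string {
    const key = JSON.stringify(args);
    if (this.replaying) {
      const entry = this.entries[this.cursor];
      if (!entry || entry.tool !== tool || entry.args !== key) {
        throw new Error(`replay diverged at step ${this.cursor}`);
      }
      this.cursor += 1;
      return entry.result;
    }
    const result = execute();
    this.entries.push({ seq: this.entries.length, tool, args: key, result });
    return result;
  }

  // Copy of the recorded entries, e.g. to persist alongside a checkpoint.
  snapshot(): LedgerEntry[] {
    return this.entries.map((e) => ({ ...e }));
  }
}
```

Because replay never touches the outside world, the same ledger yields the same outputs every time, and a call that does not match the recorded sequence fails loudly instead of drifting.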

§ 03 / 06

Engineering invariants

Five rules that
hold at hour 19.

Long-running agent work fails in predictable ways: untyped graphs that deadlock, lost state on restart, sequential dispatch, missing approval surfaces, runaway spend. The Orchestrator answers each one as a first-class invariant.

01
Typed DAG.
Every agent declares inputs, outputs, and effects. The scheduler rejects impossible graphs at plan time — not at hour 19.
02
Durable checkpoints.
State, VM disk, and the tool-call ledger snapshot to object storage every 60 seconds. Crash today, resume Thursday.
03
Parallel by default.
Independent nodes fan out across a warm Firecracker pool. Wall-clock hours, not days.
04
Human-in-the-loop gates.
Mark any node approval-required. Orchestrator parks, pings Slack with the diff, resumes — hours or weeks later.
05
Receipted budgets.
Per-run token + compute ceilings with live receipts. Aborts cleanly before overshoot, rolls back to last gate.
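To make the first rule concrete, here is a minimal sketch of a plan-time check, assuming a graph is "impossible" when an edge connects mismatched types or when the nodes form a cycle. `NodeSpec`, `Edge`, `planGraph`, and the string type tags are hypothetical and stand in for whatever the scheduler really uses.

```typescript
// Hypothetical shapes for illustration; not the @exai/orchestrator API.
type NodeSpec = {
  name: string;
  inputs: Record<string, string>;  // port name -> type tag
  outputs: Record<string, string>; // port name -> type tag
};
type Edge = { from: string; output: string; to: string; input: string };

// Returns a topological execution order, or throws before anything is dispatched.
function planGraph(nodes: NodeSpec[], edges: Edge[]): string[] {
  const byName = new Map<string, NodeSpec>();
  const indegree = new Map<string, number>();
  const downstream = new Map<string, string[]>();
  for (const n of nodes) {
    byName.set(n.name, n);
    indegree.set(n.name, 0);
    downstream.set(n.name, []);
  }

  for (const e of edges) {
    const src = byName.get(e.from);
    const dst = byName.get(e.to);
    if (!src || !dst) throw new Error(`unknown node on edge ${e.from} -> ${e.to}`);
    const produced = src.outputs[e.output];
    const expected = dst.inputs[e.input];
    if (produced === undefined || expected === undefined) {
      throw new Error(`unknown port on edge ${e.from}.${e.output} -> ${e.to}.${e.input}`);
    }
    if (produced !== expected) {
      throw new Error(`type mismatch: ${e.from}.${e.output} is ${produced}, ${e.to}.${e.input} wants ${expected}`);
    }
    indegree.set(e.to, indegree.get(e.to)! + 1);
    downstream.get(e.from)!.push(e.to);
  }

  // Kahn's algorithm: any node never reaching indegree 0 sits on a cycle.
  const ready = nodes.filter((n) => indegree.get(n.name) === 0).map((n) => n.name);
  const order: string[] = [];
  while (ready.length > 0) {
    const name = ready.shift()!;
    order.push(name);
    for (const next of downstream.get(name)!) {
      const remaining = indegree.get(next)! - 1;
      indegree.set(next, remaining);
      if (remaining === 0) ready.push(next);
    }
  }
  if (order.length !== nodes.length) throw new Error("cycle detected: graph would deadlock");
  return order;
}
```

Both failure modes surface as a thrown error during planning, which is the point of the rule: the bad graph never reaches a VM.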

nodes/migrate-package.ts

TypeScript · strict

import { defineNode } from "@exai/orchestrator";
import { z } from "zod";

export const migratePackage = defineNode({
  name: "migrate-package",
  inputs: {                         // typed inputs, declared as zod schemas
    repo: z.string(),
    package: z.string(),
    from: z.string(),
    to: z.string(),
  },
  outputs: {                        // typed outputs handed to downstream nodes
    diff: z.string(),
    coverage: z.number(),
  },
  effects: ["fs", "net"],           // side effects this node declares
  budget: { tokens: 2_000_000, compute_h: 2 },  // per-node ceilings
  retry: { max: 3, backoff: "exp" },
  gate: "merge-approval",           // approval-required gate for this node
  async run({ input, vm, log }) {
    const result = await vm.codemod(input.from, input.to);
    log.checkpoint("codemod-applied");
    return result;
  },
});

Fig. 03 · typed node declaration · ~24 LoC
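The `retry: { max: 3, backoff: "exp" }` line can be read as a doubling delay schedule. The sketch below assumes `max` counts retries after the first attempt, a 1-second base delay, and a 60-second cap; none of those details are stated above, and `retryDelays` and `withRetry` are hypothetical helpers, not part of `@exai/orchestrator`.

```typescript
// Hypothetical helpers; the Orchestrator's real retry policy is not shown in the source.
type RetryPolicy = { max: number; backoff: "exp" };

// Delay before retry k (1-based): baseMs * 2^(k-1), capped at capMs.
function retryDelays(policy: RetryPolicy, baseMs = 1_000, capMs = 60_000): number[] {
  const delays: number[] = [];
  for (let k = 1; k <= policy.max; k++) {
    delays.push(Math.min(capMs, baseMs * 2 ** (k - 1)));
  }
  return delays;
}

// One initial attempt plus up to policy.max retries, sleeping between attempts.
async function withRetry<T>(
  policy: RetryPolicy,
  attempt: () => Promise<T>,
  sleep: (ms: number) => Promise<void> = (ms) => new Promise<void>((r) => setTimeout(r, ms)),
): Promise<T> {
  const delays = retryDelays(policy);
  let lastError: unknown;
  for (let i = 0; i <= delays.length; i++) {
    try {
      return await attempt();
    } catch (err) {
      lastError = err; // keep the reason for the failed attempt
      if (i < delays.length) await sleep(delays[i]);
    }
  }
  throw lastError;
}
```

Under these assumptions, `retryDelays({ max: 3, backoff: "exp" })` gives waits of 1 s, 2 s, and 4 s before the run gives up and rethrows the last error.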

§ 04 / 06

What platform teams point it at

Hand it the work
nobody wants overnight.

Four shapes of long-running work that the Orchestrator owns end-to-end. None of it requires a babysitter. All of it produces a PR, a binder, or a receipt at the other side.

01 · Monorepo migrations · unattended

Turbo, Nx, pnpm, npm workspaces. P50 wall-clock 18h 42m, fully unattended, single PR per package.

02 · Framework upgrades · unattended

React 18 → 19, Next 14 → 15, Rails 7 → 8. Codemods + manual escapes + e2e per package boundary.

03 · Compliance sweeps · unattended

SOC 2 evidence quarterly, unattended. Pulls audit logs, runs control attestations, files the binder.

04 · Flake-hunting · unattended

Re-run a flaky suite a thousand times, bisect to first failing commit, open the PR with the repro.
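The rerun-then-bisect loop can be sketched as a binary search in which each probe repeats the suite many times, since a single green run proves little for a flaky test. This assumes the oldest commit is known good, the newest is known bad, and the flake persists once introduced. `RunSuite`, `isBad`, and `firstBadCommit` are hypothetical names.

```typescript
// Hypothetical helpers for illustration.
type RunSuite = (commit: string) => boolean; // true = suite passed

// A commit counts as bad if any one of `reruns` runs fails.
function isBad(commit: string, run: RunSuite, reruns: number): boolean {
  for (let i = 0; i < reruns; i++) {
    if (!run(commit)) return true;
  }
  return false;
}

// `commits` is ordered oldest to newest; the first is assumed good, the last bad.
function firstBadCommit(commits: string[], run: RunSuite, reruns = 1000): string {
  if (commits.length < 2) throw new Error("need at least one good and one bad commit");
  let good = 0;                  // index of a commit assumed good
  let bad = commits.length - 1;  // index of a commit assumed bad
  while (bad - good > 1) {
    const mid = good + Math.floor((bad - good) / 2);
    if (isBad(commits[mid], run, reruns)) bad = mid;
    else good = mid;
  }
  return commits[bad];
}
```

The rerun count trades cost for confidence: a flake that fails once in a thousand runs can still slip through a thousand clean reruns, so a "good" verdict is probabilistic.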

§ 05 / 06

Budgets · receipts · clean abort

Spend has a ceiling.
Receipts ship hourly.

Every run is bounded. Per-node tokens, compute-hours, and model spend are declared up front. The Orchestrator emits a streaming receipt, aborts cleanly before overshoot, and rolls back to the last checkpoint with a signed reason.

The Orchestrator treats budget like a typed input. You declare ceilings on the run — total tokens, total compute-hours, total model spend — and the scheduler refuses to dispatch a node whose worst-case cost would breach the envelope.
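A minimal sketch of that pre-flight refusal, assuming each node carries a worst-case estimate in the same three dimensions as the run's ceilings. The `Cost` shape and the `breachedCeilings` and `canDispatch` helpers are hypothetical.

```typescript
// Hypothetical shape: the three ceilings named above.
type Cost = { tokens: number; computeHours: number; modelSpendUsd: number };

// Lists every ceiling that spent + worst case would exceed.
function breachedCeilings(ceiling: Cost, spent: Cost, worstCase: Cost): (keyof Cost)[] {
  const keys: (keyof Cost)[] = ["tokens", "computeHours", "modelSpendUsd"];
  return keys.filter((k) => spent[k] + worstCase[k] > ceiling[k]);
}

// The node is dispatched only if no ceiling would be breached.
function canDispatch(ceiling: Cost, spent: Cost, worstCase: Cost): boolean {
  return breachedCeilings(ceiling, spent, worstCase).length === 0;
}
```

Checking the worst case rather than the expected cost is what lets the refusal happen before dispatch instead of partway through a node.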

Receipts stream hourly to your finance webhook with the actual tokens-by-model breakdown, the wall-clock per VM, and the per-node line item. When a cap is approached the Orchestrator drains the work in flight, snapshots state, and aborts cleanly — no zombie shards, no orphaned VMs.

Pre-flight cost estimation per node, refused before dispatch.
Hourly streaming receipts to webhook · Slack · S3.
Hard ceiling triggers drain → snapshot → clean abort.
Resume from the last gate with a fresh budget on rerun.

receipt · run #4827 · 18h 42m

cap $50.00 · 30h

Live spend ledger

Tokens · $20.00 ceiling
42% used · ceiling enforced at 100%

Compute hours · $22.00 ceiling
51% used · ceiling enforced at 100%

Model spend · $8.00 ceiling
57% used · ceiling enforced at 100%

Tokens · by model

claude-opus-4.7 · 4.21M
claude-sonnet-4.5 · 3.04M
haiku-3.7 · 1.17M

VMs · wall-clock

vm-08…vm-19 · 11.18 h
checkpoints · 1,124
retries · 4

48% of cap used · within ceiling

Fig. 05 · streaming receipt · run #4827 · signed · webhook delivered
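The receipt's headline figure can be reproduced from its line items if the roll-up is a ceiling-weighted average, which is an assumption here rather than a documented rule. The ceilings (20.00, 22.00, 8.00) and percentages (42, 51, 57) are copied from the receipt; `ReceiptLine` and `rollUp` are hypothetical names.

```typescript
// Figures copied from the receipt; the roll-up rule itself is an assumption.
type ReceiptLine = { label: string; ceilingUsd: number; usedPct: number };

const receiptLines: ReceiptLine[] = [
  { label: "Tokens", ceilingUsd: 20, usedPct: 42 },
  { label: "Compute hours", ceilingUsd: 22, usedPct: 51 },
  { label: "Model spend", ceilingUsd: 8, usedPct: 57 },
];

// Weights each line's percentage by its ceiling, then rounds to a whole percent.
function rollUp(items: ReceiptLine[]): { capUsd: number; usedPct: number } {
  const capUsd = items.reduce((sum, l) => sum + l.ceilingUsd, 0);
  const weighted = items.reduce((sum, l) => sum + l.ceilingUsd * l.usedPct, 0);
  return { capUsd, usedPct: Math.round(weighted / capUsd) };
}

console.log(rollUp(receiptLines)); // { capUsd: 50, usedPct: 48 }
```

The three ceilings sum to the $50.00 cap, and (20 × 42 + 22 × 51 + 8 × 57) / 50 = 48.36, which rounds to the 48% shown on the receipt.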

§ 06 / 06

The Orchestrator · final cut

Hand it a 30-hour
problem. Sleep.

Typed DAG. Sixty-second checkpoints. Parallel by default. Human-in-the-loop gates. Receipted budgets. The runtime that owns long-running agent work — built for the platform engineers actually shipping migrations at scale.

Book a long-run pilot · Replay a run

SOC 2 Type II · ISO 27001 · HIPAA-ready · GDPR · DPF · PCI DSS 4.0 · BYOC · on-prem
