§ 01 / 05

Status · all systems · live

Refresh cadence · 60s · annotated by SRE on duty.

page · refreshed 14s ago

exAI Agentic OS · service posture

All systems operational.

Status updates land every 60s. RSS feed, email digest, and Slack webhook subscriptions are all live.

Subscribe via RSS for raw events, email for human-readable digests, or write directly to status@exai.cloud to be added to the on-call broadcast list. The page tells you, first, whether it’s us or you.


Posture · last 90d

rolling · live

Overall posture

7 of 8 services green. Audit-log streaming on watch.

Uptime

0.000%

rolling 90 days

P1 events

last 90 days

P2 events

last 90 days

Maintenance windows

last 90 days

SLA · 99.99% contractual

● within budget

Fig. 01 · ninety-day posture · refresh · 60s
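To make "within budget" concrete: a 99.99% SLA over a rolling 90-day window leaves roughly 13 minutes of permitted downtime. The sketch below works that arithmetic out. The function names are hypothetical, and how the page actually measures downtime against the contract is not stated here, so treat this as an illustration of the calculation only.

```python
from datetime import timedelta
from fractions import Fraction


def error_budget(sla_percent: str, window: timedelta) -> timedelta:
    """Downtime permitted by an SLA over a window.

    sla_percent is passed as a string (e.g. "99.99") so the subtraction
    is done exactly rather than in binary floating point.
    """
    allowed_fraction = 1 - Fraction(sla_percent) / 100
    allowed_seconds = Fraction(int(window.total_seconds())) * allowed_fraction
    return timedelta(seconds=float(allowed_seconds))


def within_budget(downtime: timedelta, sla_percent: str, window: timedelta) -> bool:
    """True when observed downtime has not exhausted the budget."""
    return downtime <= error_budget(sla_percent, window)


budget = error_budget("99.99", timedelta(days=90))
print(budget.total_seconds() / 60)  # 12.96 minutes over 90 days
```

Under these assumptions, a single 10-minute outage in the window would still be within budget, while a 15-minute one would not.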

§ 02 / 05

Service · breakdown

Service-by-service.
Reading the same dashboard SRE reads.

Eight surfaces. Each row reports current month uptime, status, and the last filed incident date. Same numbers our on-call engineer sees on the wall.

Surface                        Status        Uptime · MTD   Last incident

Workspace                      Operational   0.000%         Apr 14

Composer                       Operational   0.00%          Mar 22

Builder                        Operational   0.000%         Jan 16

Orchestrator                   Operational   0.000%         Feb 02

API                            Operational   0.000%         Mar 22

Webhook bus                    Operational   0.00%          Mar 22

Audit log streaming            Degraded      0.00%          ongoing
  Splunk shipping lag in eu-west-1; see incident below

Authentication (SAML / SCIM)   Operational   0.000%         Nov 28

7 operational · 1 degraded · signals · synthetic + real-user · 12-region probe mesh
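The footer counts above are a roll-up of the per-surface statuses. A minimal sketch of that roll-up follows, using the eight surfaces and statuses listed in this section. The `summarize` helper and the output strings are hypothetical; the page's real aggregation logic is not described.

```python
from collections import Counter
from typing import Dict, Tuple

# Statuses as listed in the service breakdown above.
SURFACES: Dict[str, str] = {
    "Workspace": "Operational",
    "Composer": "Operational",
    "Builder": "Operational",
    "Orchestrator": "Operational",
    "API": "Operational",
    "Webhook bus": "Operational",
    "Audit log streaming": "Degraded",
    "Authentication (SAML / SCIM)": "Operational",
}


def summarize(surfaces: Dict[str, str]) -> Tuple[str, str]:
    """Return (headline, footer) strings from per-surface statuses."""
    counts = Counter(status.lower() for status in surfaces.values())
    green = counts.get("operational", 0)
    headline = f"{green} of {len(surfaces)} services green"
    footer = " · ".join(f"{n} {status}" for status, n in counts.most_common())
    return headline, footer


headline, footer = summarize(SURFACES)
print(headline)  # 7 of 8 services green
print(footer)    # 7 operational · 1 degraded
```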

§ 03 / 05

Incidents · active & recent

What’s on fire now.
And what just was.

One open incident, owner-attributed. Plus the four most recent filings inside the 90-day window — every entry links to its annotated post-mortem.

Active · ongoing

Incident · INC-2026-0428-01

Audit log streaming · Splunk shipping lag (eu-west-1)

Customer-managed Splunk HEC endpoint in eu-west-1 returning backpressure. Audit events are buffering and being shipped via the S3 fallback path. No event loss; delivery delay is currently under 90 seconds.

opened · Apr 28 · 14:22 UTC · elapsed · 23m

Timeline · UTC

14:22 · detected · eu-west-1 Splunk HEC backpressure

14:38 · mitigation · switched to S3 fallback

14:45 · monitoring · primary recovering

14:45 · ongoing · next update in 15m

Subscribe to updates →

owner · sre · k.mori
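The mitigation in this incident (buffer audit events while the primary endpoint pushes back, then ship them through a fallback path without losing any) can be sketched generically as below. Everything here is an assumption for illustration: `AuditShipper`, `Backpressure`, the batch size, and the callables standing in for the Splunk HEC sender and the S3 writer are hypothetical names, not the product's actual implementation.

```python
from collections import deque
from typing import Callable, Deque, Dict, List

Event = Dict[str, str]


class Backpressure(Exception):
    """Raised by the (hypothetical) primary sender when the endpoint pushes back."""


class AuditShipper:
    def __init__(
        self,
        primary: Callable[[Event], None],
        fallback: Callable[[List[Event]], None],
        batch_size: int = 2,
    ) -> None:
        self.primary = primary
        self.fallback = fallback
        self.batch_size = batch_size
        self.buffer: Deque[Event] = deque()

    def ship(self, event: Event) -> str:
        """Try the primary path; on backpressure, buffer for the fallback."""
        try:
            self.primary(event)
            return "primary"
        except Backpressure:
            self.buffer.append(event)
            if len(self.buffer) >= self.batch_size:
                self.flush()
            return "buffered"

    def flush(self) -> None:
        """Write buffered events to the fallback; clear only after success."""
        if self.buffer:
            self.fallback(list(self.buffer))
            self.buffer.clear()


# Demo with stand-ins: a primary that always pushes back, a list as the fallback store.
fallback_store: List[Event] = []


def always_backpressured(event: Event) -> None:
    raise Backpressure("endpoint queue full")


shipper = AuditShipper(always_backpressured, fallback_store.extend, batch_size=2)
results = [shipper.ship({"id": str(i)}) for i in range(3)]
shipper.flush()  # drain the remainder
print(len(fallback_store))  # 3
```

The buffer is cleared only after the fallback write returns, so a failed fallback write leaves the events queued rather than dropping them.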

Recent incidents · last 90d · 4 filings

Apr 14 · P3

Composer router timeout (eu-west-1)

MTTR

post-mortem →

Mar 22 · P2

SCIM webhook delivery delayed

MTTR

post-mortem →

Feb 02 · P3

Prebuild cache eviction

MTTR

post-mortem →

Jan 16 · P3

Audit log Splunk lag

MTTR

post-mortem →

0 P1 · 1 P2 · 3 P3 · last 90 days

full incident archive →

§ 04 / 05

Regions · live posture

Five regions. Five realities.

Latency, prebuild warm time, and cold-start are measured per region from in-region synthetic probes — refreshed every 60s. Model provider availability rolls up Anthropic, OpenAI, and Google upstream status.

Region · 01

us-east-1

Operational

API latency · P50 · 0ms

Prebuild · warm · 0ms

Workspace · cold-start · 0ms

Models · provider availability

Anthropic

OpenAI

Google

Region · 02

us-west-2

Operational

API latency · P50 · 0ms

Prebuild · warm · 0ms

Workspace · cold-start · 0ms

Models · provider availability

Anthropic

OpenAI

Google

Region · 03

eu-west-1

Degraded

API latency · P50 · 0ms

Prebuild · warm · 0ms

Workspace · cold-start · 0ms

Models · provider availability

Anthropic

OpenAI

Google

Region · 04

eu-central-1

Operational

API latency · P50 · 0ms

Prebuild · warm · 0ms

Workspace · cold-start · 0ms

Models · provider availability

Anthropic

OpenAI

Google

Region · 05

ap-southeast-1

Operational

API latency · P50 · 0ms

Prebuild · warm · 0ms

Workspace · cold-start · 0ms

Models · provider availability

Anthropic

OpenAI

Google

4 operational · eu-west-1 · degraded · audit-log dispatch backpressure · upstream · model providers · live
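For readers who want to see how the per-region figures could be derived: P50 latency is the median of the probe samples in a refresh window, and provider availability is a roll-up over the three upstream statuses. The sketch below assumes exactly that. The sample values are made up for illustration, and the roll-up rule (any provider down marks the region's models as impaired) is an assumption, not something this page specifies.

```python
from statistics import median
from typing import Dict, List


def p50_ms(samples: List[float]) -> float:
    """Median latency of the probe samples in one refresh window."""
    return median(samples)


def provider_rollup(upstream: Dict[str, bool]) -> str:
    """Collapse per-provider up/down flags into one label (assumed rule)."""
    down = sorted(name for name, ok in upstream.items() if not ok)
    return "available" if not down else "impaired: " + ", ".join(down)


# Made-up probe samples in milliseconds, for illustration only.
probe_samples = [112.0, 98.0, 105.0, 240.0, 101.0]
print(p50_ms(probe_samples))  # 105.0

print(provider_rollup({"Anthropic": True, "OpenAI": True, "Google": True}))  # available
```

Using the median rather than the mean keeps a single slow probe (240 ms in the sample) from skewing the reported figure.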
