Case study 01 · Ready for review

OrgX

Continuity infrastructure for AI agents: persistent organizational memory, trust scoring, decision provenance, and MCP tooling across Claude Code, Cursor, and ChatGPT.

OrgX is the clearest expression of how I think about agent infrastructure. Delegate aggressively, keep provenance visible, and make the review surface strong enough that human judgment can stay precise instead of becoming a bottleneck.

Commits: 1,270+

MCP tools: 61

Tool categories: 16

Platform repos: 12

Benchmark tasks: 136+

Public essays: 8

OrgX live command center — real product surface, not a marketing illustration

Install: npx @smithery/cli install @useorgx/orgx-mcp --client claude

View on Smithery ↗

Frontend: Next.js, React, TypeScript

Workflow: Inngest durable functions

Data: Supabase + PostgreSQL

Observability: PostHog, Sentry, OpenTelemetry

01 // technical depth

The boring infrastructure choices that matter

The public narrative around MCP today is "connect a tool to a model." That's table stakes. The harder question is: how do you run an MCP server that production teams will actually trust with real organizational state?

OrgX MCP is built on Cloudflare Workers + Durable Objects because session isolation, SQLite persistence, and cross-deploy state survival matter more than easy scaling. Auth uses OAuth 2.1 with dynamic client registration — no API keys, no shared secrets — because every serious MCP client is going to need this in six months anyway.

OAuth 2.1 + PKCE + DCR: credentials never touch environment variables.
Durable Object SQLite: session and workspace state survive deploys.
Ed25519-signed registry: domain verification, automated release pipeline.
Context pointers, not embeds: entities reference URLs and other entities without payload bloat.
Inngest over Temporal: workflow durability without a separate orchestration cluster.
Supabase over custom backend: real-time subscriptions, auth, velocity — one system.
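
The PKCE half of that auth choice is small enough to sketch. The snippet below is a minimal illustration of the S256 method from RFC 7636 using Node's built-in crypto module. The function names are hypothetical and are not taken from the orgx-mcp codebase.

```typescript
import { createHash, randomBytes } from "node:crypto";

// Hypothetical helper names; only the S256 transform itself comes from RFC 7636.

/** Create a high-entropy code verifier: 32 random bytes encode to 43 base64url characters. */
function createCodeVerifier(): string {
  return randomBytes(32).toString("base64url");
}

/** Derive the S256 code challenge: BASE64URL(SHA-256(ASCII(verifier))), without padding. */
function codeChallengeS256(verifier: string): string {
  return createHash("sha256").update(verifier, "ascii").digest().toString("base64url");
}

// The client sends the challenge with the authorization request and later
// proves possession by sending the verifier with the token request.
const verifier = createCodeVerifier();
const challenge = codeChallengeS256(verifier);
console.log({ verifierLength: verifier.length, challengeLength: challenge.length });
```

Because the verifier is generated per authorization attempt and never leaves the client until the token exchange, there is no long-lived secret to place in an environment variable.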

orgx-mcp internals

auth
  OAuth 2.1 + PKCE
  dynamic client registration via POST /register
  no API keys in env — creds in Durable Object SQLite

state
  Durable Objects for session isolation
  cross-deploy persistence across Worker restarts
  context[] JSON pointers on every entity

registry
  Ed25519-signed domain verification
  automated release pipeline to Smithery

transport
  HTTP streaming + SSE fallback
  MCP Apps widget rendering for compatible hosts
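
One way to picture the "context[] JSON pointers on every entity" line is as a small tagged union of references. This is a sketch under assumptions: every type name, field name, and sample record below is hypothetical, chosen only to show entities pointing at URLs and at other entities without embedding their payloads.

```typescript
// All type and field names here are hypothetical illustrations.
type ContextPointer =
  | { kind: "url"; href: string; label?: string }
  | { kind: "entity"; entityType: string; entityId: string };

interface Entity {
  id: string;
  type: string;
  title: string;
  context: ContextPointer[]; // references only, never embedded payloads
}

/** Resolve a pointer to a short display string, looking entities up on demand. */
function describePointer(pointer: ContextPointer, store: Map<string, Entity>): string {
  if (pointer.kind === "url") {
    return pointer.label ?? pointer.href;
  }
  const target = store.get(pointer.entityId);
  return target ? `${target.type}: ${target.title}` : `${pointer.entityType}: (unresolved)`;
}

// Sample data, invented for the example.
const decision: Entity = { id: "dec_1", type: "decision", title: "Adopt Inngest", context: [] };
const task: Entity = {
  id: "task_1",
  type: "task",
  title: "Migrate workflows",
  context: [
    { kind: "entity", entityType: "decision", entityId: "dec_1" },
    { kind: "url", href: "https://example.com/spec", label: "Migration spec" },
  ],
};
const store = new Map<string, Entity>([[decision.id, decision], [task.id, task]]);
console.log(task.context.map((p) => describePointer(p, store)));
```

The task record stays a few hundred bytes regardless of how large the referenced decision or document grows, which is the point of pointing rather than embedding.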

02 // system frame

What the platform actually does

OrgX treats the organization, not the single prompt, as the primitive. Tasks turn into workflows. Agents become operating actors with tool access, trust boundaries, cost footprints, and quality history.

The value is not just that the system can delegate. The value is that delegation stays legible: you can see what was routed, which tools were used, what quality bar was applied, and where a human stepped in.

Spawn and route specialist agents without losing context.
Gate autonomy through trust tiers instead of hard-coded fear or blind freedom.
Carry decisions, learnings, and costs forward through org memory and outcome attribution.

trust tiers

strict   -> read, analyze, propose; no mutation without approval
balanced -> low-risk mutation allowed; medium risk escalates
open     -> full autonomy inside tool and budget guardrails

spawn guards
  max_agents_per_task: enforced
  budget_per_run: enforced
  human escalation: required for destructive actions
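
The tier table and spawn guards above can be read as a small decision function. The following is a sketch, not OrgX's implementation: the tier names and guard names come from the text, while the types, the function names, and the choice to escalate high-risk actions in the balanced tier are assumptions.

```typescript
// Tier names come from the text; all other names and shapes are hypothetical.
type TrustTier = "strict" | "balanced" | "open";
type Risk = "low" | "medium" | "high";
type Verdict = "allow" | "escalate" | "deny";

interface ActionRequest {
  mutates: boolean; // false for read / analyze / propose
  risk: Risk;
  destructive: boolean;
  estimatedCost: number; // same unit as budgetPerRun
}

interface RunState {
  activeAgents: number;
  spent: number;
}

interface Guards {
  maxAgentsPerTask: number;
  budgetPerRun: number;
}

/** Decide whether an action runs, goes to a human, or is refused outright. */
function gateAction(tier: TrustTier, action: ActionRequest, run: RunState, guards: Guards): Verdict {
  // The budget guardrail applies in every tier, including "open".
  if (run.spent + action.estimatedCost > guards.budgetPerRun) return "deny";
  // Reading, analyzing, and proposing are allowed in every tier.
  if (!action.mutates) return "allow";
  // Destructive actions always require human escalation.
  if (action.destructive) return "escalate";
  switch (tier) {
    case "strict":
      return "escalate"; // no mutation without approval
    case "balanced":
      // Assumption: anything above low risk escalates (the text names medium only).
      return action.risk === "low" ? "allow" : "escalate";
    case "open":
      return "allow";
  }
}

/** Spawn guard: refuse new agents once the per-task cap is reached. */
function canSpawn(run: RunState, guards: Guards): boolean {
  return run.activeAgents < guards.maxAgentsPerTask;
}
```

Keeping the verdict a three-way value rather than a boolean is what lets the review surface distinguish "needs a human" from "not permitted at all".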

03 // architecture

Three layers keep the platform honest

OrgX architecture

Events drive durable execution, governance decides what can happen next, and the intelligence layer records quality, cost, and memory.

04 // visual proof

The product surface is part of the system

OrgX command center dashboard — Primary operator view: the command center that makes active work, review state, and orchestration legible.

OrgX live: the full agents grid and next-up queue with named specialists, activity timeline, and queued decisions. Production density: 6 named specialist agents (Mark, OrgX, Eli, Holt, Claude Code, Kimi), an activity timeline with 20+ blocked items and 1 decision, and a next-up queue with plugin packaging, ads campaigns, and account management. The product surface earns the “live” label.

OrgX MCP agent status screen — MCP proof surface showing agent status and system activity rather than abstract platform claims.

05 // widgets

Six production MCP Apps widgets — UI that renders inside any MCP host

Every OrgX widget ships typed data contracts, demo / loading / empty states, and SSR-safe rendering so an MCP host can embed them in a chat surface or dashboard without custom code. These are real demos, served live at mcp.useorgx.com/widgets ↗.

Initiative Pulse ↗

Health, ROI, and blockers in a single conversational card. 78 health score, +431% ROI, 1 blocker surfaced with resolution path.

Agent Status ↗

Live status of every specialist agent. Current focus, tasks in flight, blocked/review signals, and the “needs you” prompts that drive human review.

Morning Brief ↗

Daily executive summary. Top priorities by domain, decisions made, receipts, and what needs attention — generated and citable.

Pending Decisions ↗

Interactive approval surface with approve / revise actions, cost estimates, and countdown-to-auto-approve. The UI that closes the human-in-the-loop cycle.

Scaffolded Initiative ↗

Work-breakdown structure rendered from a typed scaffold event. Workstreams, milestones, and tasks with live status badges and owners.

Search Results ↗

Unified retrieval across artifacts, decisions, and initiatives with relevance scores. The shape of memory, visible.
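
To make "typed data contracts" and "demo / loading / empty states" concrete, here is a hedged sketch of what a contract for a widget like Pending Decisions could look like. The interface, field names, and helper functions are hypothetical; the real widget contracts may be shaped differently. The functions are pure and touch no browser globals, which is one simple way to stay SSR-safe.

```typescript
// Hypothetical contract for a Pending Decisions style widget.
interface PendingDecision {
  id: string;
  title: string;
  estimatedCostUsd: number;
  autoApproveAt: string; // ISO 8601 timestamp
}

// Every state the host may need to render is an explicit variant.
type WidgetState<T> =
  | { status: "loading" }
  | { status: "empty"; message: string }
  | { status: "ready"; data: T; demo: boolean };

/** Whole seconds left before auto-approval, clamped at zero. */
function secondsUntilAutoApprove(decision: PendingDecision, now: Date): number {
  const remainingMs = Date.parse(decision.autoApproveAt) - now.getTime();
  return Math.max(0, Math.ceil(remainingMs / 1000));
}

/** Pure text rendering: takes the clock as an argument instead of reading it. */
function renderSummary(state: WidgetState<PendingDecision[]>, now: Date): string {
  switch (state.status) {
    case "loading":
      return "Loading decisions...";
    case "empty":
      return state.message;
    case "ready":
      return state.data
        .map((d) => `${d.title}: auto-approves in ${secondsUntilAutoApprove(d, now)}s`)
        .join("\n");
  }
}
```

Passing `now` in explicitly keeps server and client renders consistent and makes the countdown trivially testable.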

06 // benchmark

Single-shot benchmarks hide what agents can't fake

Most agent benchmarks are single-shot: one prompt, one model, one task. That structurally hides the thing that actually breaks in production — cascading context across sessions, tools, and time.

The OrgX benchmark evaluates cross-session continuity, memory handoff, and decision provenance across real initiatives. Results, judgments, token costs, and failure cases are published openly.

Read the methodology + published runs ↗

autonomous-initiative-benchmark

tasks               136+
domains             7 (product, eng, ops, design, ...)
execution modes     3 (agent, api, cli)
judges              independent LLM panels
artifacts           task, judgment, cost, token logs
failure cases       published with human review notes

methodology         useorgx.com/blog
next run            more inclusive — more models,
                    more domains, broader substrate

07 // ecosystem

12 repos, one coherent platform

orgx-mcp ↗ — 61 MCP tools · Cloudflare Workers + Durable Objects · OAuth 2.1 + DCR

orgx-gateway-sdk ↗ — Gateway Protocol v1 client SDK

openclaw-plugin ↗ — 30-tool MCP + browser mission control

orgx-claude-code-plugin ↗ — Claude Code runtime integration

cursor-plugin ↗ — MCP, rules, skills, hooks, commands

orgx-codex-plugin ↗ — Codex + initiative-aware skills

orgx-opencode-plugin ↗ — OpenCode peer driver

orgx-ui-kit ↗ — React components + tokens

orgx-data ↗ — typed contracts + React hooks

orgx-local-shell ↗ — Tauri 2 desktop app

skills ↗ — reusable agent skill library

autonomous-initiative-benchmark ↗ — public methodology + catalog

08 // try it

Install OrgX MCP in a compatible client

The MCP server exposes 61 tools for org memory, planning, decisions, scoring, and workspace management. It works in Claude Desktop, Claude Code, Cursor, and any compatible MCP client.

Dynamic client registration handles auth. Durable Objects handle state survival across deploys. You get working context in the first session.
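
For readers unfamiliar with dynamic client registration, the sketch below builds the kind of request a client would send. The metadata fields are standard RFC 7591 client metadata for a public PKCE client. The function name, the client name, the redirect URI, and the assumption that the registration endpoint sits at `/register` directly under the server's base URL are illustrative; in practice a client would discover the endpoint from the authorization server's metadata.

```typescript
// RFC 7591 client metadata for a public client (PKCE, no client secret).
interface ClientRegistration {
  client_name: string;
  redirect_uris: string[];
  grant_types: string[];
  response_types: string[];
  token_endpoint_auth_method: "none";
}

interface RegistrationRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

/** Build (but do not send) a registration request. Hypothetical helper. */
function buildRegistrationRequest(baseUrl: string, clientName: string, redirectUri: string): RegistrationRequest {
  const metadata: ClientRegistration = {
    client_name: clientName,
    redirect_uris: [redirectUri],
    grant_types: ["authorization_code", "refresh_token"],
    response_types: ["code"],
    token_endpoint_auth_method: "none",
  };
  return {
    // Assumption: the endpoint is /register under the base URL.
    url: `${baseUrl.replace(/\/+$/, "")}/register`,
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(metadata),
  };
}

const request = buildRegistrationRequest(
  "https://mcp.useorgx.com",
  "example-client", // hypothetical client name
  "http://127.0.0.1:8765/callback", // hypothetical loopback redirect
);
console.log(request.method, request.url);
```

The server's response would carry a `client_id` for the subsequent authorization-code flow, so nothing needs to be provisioned by hand before the first session.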

mcp.useorgx.com ↗

View on Smithery ↗

install

# Smithery (recommended)
npx @smithery/cli install @useorgx/orgx-mcp --client claude

# Direct MCP config
{
  "mcpServers": {
    "orgx": {
      "url": "https://mcp.useorgx.com"
    }
  }
}

OrgX is the system I’d build again from scratch tomorrow.

If you're hiring for agent infrastructure, agent platforms, AI developer productivity, or MCP tooling — this is a preview of the work I'll ship on your team from day one.
