The Claim Graph: Why Typed Provenance Beats RAG for Agent Systems
RAG retrieves text. A claim graph retrieves decisions, their justifications, and who approved them. For agent systems, the second one is what actually matters.
Durable Objects for Agent Memory: A Cloudflare Pattern
If your MCP server's session state disappears on deploy, you haven't finished building it. Here's the pattern we use in OrgX to keep it alive.
We Ran 136 Agent Tasks. Here's What Single-Shot Benchmarks Hide.
Most agent benchmarks are one prompt, one model, one score. That structurally conceals the thing that breaks in production: cascading context across sessions.
Trust Models for Agent Autonomy: When to Let Agents Act Alone
A practical framework for strict, balanced, and open autonomy, with trust earned through evidence instead of granted by prompt.
Why Most Agent Frameworks Solve the Wrong Problem
Most agent frameworks optimize the prompt loop. Production agent infrastructure has to optimize governance, durability, memory, and review.
What I Learned Running MCP in Production
What broke, what patterns held up, and what I would do differently after shipping the OrgX MCP server — 61 tools across 16 categories — plus integrations with ~20 external MCP services.
Building a Homebrew-Installable Dev Tool in Rust: The PerfPulse Story
Why Rust for CLI tools, cross-platform M1/Intel builds, Homebrew tap distribution, and integrating Claude API for AI-powered recommendations.