
Scrum for AI Agent Teams: A Field Report from a Live Operating Model

A keynote preview ahead of the Agile Forum Costa Rica 2026 — and a direct answer to the question every executive is now asking: what does our work system look like when the machine can think?

On May 7, 2026, Dr. Jeff Sutherland, co-creator of Scrum, will deliver an online keynote to the Agile Forum Costa Rica 2026 — Rethinking Organizational Transformation. The talk, “When the Machine Can Think,” lays out a working blueprint for Scrum for AI agent teams — the same operating model that runs his lab today. In other words, this is not theory. It is a field report. Above all, it answers a practical question: how does Scrum for AI agent teams actually run, day to day, when the agents themselves do most of the work? This post is the written companion to that keynote.

When the machine can think: from copilots to AI-accelerated delivery systems.

Four Years to Here

The honest way to talk about AI agents in 2026 is to back up and walk forward.

First, in 2023, the team had six members: one human and five AI agents working through JetBrains and GitHub Copilot. The AI ran at roughly IQ 100. It was slow and painful, yet it was already 30 times faster than a team of humans. As a result, all human coding stopped that year.

Next, in 2024, Claude became the lead programmer. AI moved to roughly IQ 130. Velocity rose another 5×. Consequently, the test framework was torn out, and the team switched to acceptance-test-driven development. After all, when your reviewer can read intent, you stop spending engineering effort proving syntax.

Then in 2025, AI crossed roughly IQ 150. Bugs were almost all requirement bugs. Prompts disappeared. Instead, the AI behaved like a Ph.D. colleague — describe the problem, and it asks the questions a senior engineer would ask. Thus, the work moved upstream, into specification.

Finally, in 2026, Nature reported that AGI is here. AI is past IQ 160. Meanwhile, OpenClaw is the fastest-growing software project in history. Autonomous agents have taken over the work, and the operating model — not the model — is now the bottleneck.

The goal that pulled the lab through those four years is the same one the keynote defends on stage: 1000× velocity on a single Mac Studio, at less than 10% of the token cost per story point of any enterprise AI system, with 10× the quality. Scrum and Scrum@Scale are how you get there.

The Pace Changed

A year ago, model updates landed every couple of months. As a result, teams adapted by quarter — pilots, governance reviews, slow rollouts. By contrast, this week, Hermes, OpenClaw, Claude, and GPT-5.5 are shipping changes daily. Consequently, transformation is no longer a project. Instead, it is a morning routine.

Therefore, the bottleneck is no longer access to intelligence. Rather, the bottleneck is your work system. Whoever upgrades, verifies, and redeploys fastest learns fastest. Meanwhile, everyone else is funding pilots that are obsolete the day they launch.

What Scrum for AI Agent Teams Actually Looks Like

People constantly ask what an “AI operating model” looks like in practice. Mine has four layers, and the simplest way to read them is from the bottom up.

  • Models — Claude, GPT-5.5, DeepSeek, Grok. Interchangeable workers and reviewers. Engines, not the operating model.
  • OpenClaw — the multi-agent execution layer. Routes work to the right agent and bridges Slack and Mission Control.
  • Hermes — the personal/operator agent. Coordinates, audits, retrieves memory, and patches health issues.
  • Mission Control — the Scrum board for agents. Backlog, WIP limits, review gates, leaderboard.

Models are engines. Mission Control, Hermes, and OpenClaw are the car, the dashboard, the brakes, and the pit crew. Switch the model, and the car keeps driving. However, switch the operating model, and the car runs into a wall.

The Daily Scrum, for Machines

In Scrum for AI agent teams, the daily loop runs every morning, in this order:

  1. Upgrade all systems.
  2. Health-check gateways, Slack, agents, tokens, launchd, Docker.
  3. Fix breakages immediately.
  4. Pull top-priority work from Mission Control.
  5. Enforce WIP = 1 per agent.
  6. Require deliverables and evidence before review.
  7. Run Grok or Product Owner review gates before “done.”
  8. Publish leaderboard and value report at end of day.

In short, this is DevOps, Scrum, and agent governance fused into a single rhythm. The order matters: you cannot pull work into a system you have not first verified is healthy.
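The ordering of that loop can be sketched in a few lines of Python. This is a minimal illustration, not Mission Control's actual code: the `Story`, `Board`, and `morning_loop` names, the health-check callables, and the agent names are all hypothetical. It covers only two of the rules from the list: no work is pulled while any health check fails, and each agent holds at most one story.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Story:
    title: str
    value: int  # hypothetical value score; higher is pulled first


@dataclass
class Board:
    backlog: List[Story] = field(default_factory=list)
    in_progress: Dict[str, Story] = field(default_factory=dict)

    def pull(self, agent: str) -> Optional[Story]:
        """Give `agent` the highest-value story, enforcing WIP = 1."""
        if agent in self.in_progress or not self.backlog:
            return None
        top = max(self.backlog, key=lambda s: s.value)
        self.backlog.remove(top)
        self.in_progress[agent] = top
        return top


def morning_loop(board: Board, agents: List[str],
                 health_checks: Dict[str, Callable[[], bool]]) -> dict:
    """Verify health first, then let each agent pull one story."""
    failed = [name for name, check in health_checks.items() if not check()]
    if failed:
        # Breakages are impediments; nothing is pulled until they are fixed.
        return {"blocked_by": failed, "pulled": {}}
    pulled = {}
    for agent in agents:
        story = board.pull(agent)
        if story is not None:
            pulled[agent] = story.title
    return {"blocked_by": [], "pulled": pulled}


board = Board(backlog=[Story("fix gateway", 8),
                       Story("write docs", 3),
                       Story("ship API", 5)])
# Stand-ins for real probes of Slack, Docker, tokens, and so on.
checks = {"slack": lambda: True, "docker": lambda: True}
result = morning_loop(board, ["agent_a", "agent_b"], checks)
```

With both checks passing, each agent receives one story in value order and the lowest-value story stays in the backlog. If any check returns `False`, the loop returns the failing check names and pulls nothing.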

Field Example: Bringing #henry Back Online

Yesterday, the Slack route to my agent #henry failed. As a result, Henry went silent. The diagnosis was a gateway version mismatch and token drift. Then came the repair: rebuild the Docker gateway, upgrade Henry, and fix the LaunchDaemon so the agent persists across reboots without a VNC/GUI login. The whole repair took twenty minutes, end to end.

The lesson is the one every executive should take home: anything that threatens agent uptime is a Scrum impediment. Tokens, gateways, persistence, and login state therefore belong on the impediment list, not in a separate ticket queue. If your governance model still treats them as IT plumbing, your agent team will spend its day blocked.

Velocity With Verification

Today’s numbers, taken straight from Mission Control:

  • 58 stories completed today
  • 121 story points completed
  • 65.33 stories per day on the three-day moving average
  • 10 stories closed by recovering from AI-review rejection — pure rework

Notably, value is not only new feature work. It also includes rework recovered after AI review, security and compliance fixes, infrastructure reliability, revenue and marketplace enablement, and content and distribution. All of it ships through the same review gate.

The rule that holds the whole thing together: velocity without verification is hallucination. Done means evidence.
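As a quick check on the figures above, the sketch below assumes the three-day moving average is a plain unweighted mean that includes today's 58 stories. The post does not say how Mission Control computes it, so treat that as an assumption; the function names are illustrative.

```python
from typing import List


def moving_average(daily_counts: List[int], window: int = 3) -> float:
    """Unweighted mean of the most recent `window` daily counts."""
    if len(daily_counts) < window:
        raise ValueError("not enough days for the window")
    return sum(daily_counts[-window:]) / window


def implied_window_total(average: float, window: int = 3) -> int:
    """Whole-story total implied by a reported moving average."""
    return round(average * window)


reported_average = 65.33  # three-day moving average quoted above
today = 58                # stories completed today, quoted above

window_total = implied_window_total(reported_average)  # stories across the three days
earlier_two_days = window_total - today                # stories on the two prior days
```

Under that assumption the window holds 196 stories, so the two earlier days account for 138 between them. Any split of 138 reproduces the reported figure; for example, `moving_average([69, 69, 58])` rounds to 65.33. The even split is used only as a check, not as reported data.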

What Changes in Scrum When Agents Join the Team

The framework does not bend. However, six things shift in Scrum for AI agent teams:

  • The Product Owner still sets value and order — now human plus agent-assisted.
  • The Scrum Master becomes a flow debugger and a system-health optimizer.
  • Developers include human developers and specialized AI agents working as one team.
  • The Definition of Done must include evidence, tests, artifact paths, logs, and review gates.
  • The sprint board becomes an execution control system, not a status report.
  • Impediments include token mismatches, stale gateways, model regressions, context loss, and failed Slack routes.

Read that list to a leadership team and watch what happens. The transformation is not “replacing teams with agents.” Rather, it is redesigning the operating model so humans and agents produce measurable outcomes together.
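The evidence-based Definition of Done from the list above could be expressed as a gate that refuses to close a story while anything is missing. This is a sketch under assumptions: the `Submission` fields and the `missing_evidence` helper are hypothetical names chosen to mirror the items in the list, not an actual Mission Control schema.

```python
from dataclasses import dataclass, field
from typing import List

# Evidence kinds mirror the Definition of Done item; field names are illustrative.
REQUIRED_EVIDENCE = ("tests", "artifact_paths", "logs")


@dataclass
class Submission:
    story: str
    tests: List[str] = field(default_factory=list)
    artifact_paths: List[str] = field(default_factory=list)
    logs: List[str] = field(default_factory=list)
    review_passed: bool = False  # set by the review gate, AI or Product Owner


def missing_evidence(sub: Submission) -> List[str]:
    """Return the names of every evidence item the submission lacks."""
    missing = [name for name in REQUIRED_EVIDENCE if not getattr(sub, name)]
    if not sub.review_passed:
        missing.append("review_gate")
    return missing


def is_done(sub: Submission) -> bool:
    """Done means evidence: nothing may be missing."""
    return not missing_evidence(sub)


draft = Submission(story="add login", tests=["test_login.py"])
complete = Submission(
    story="add login",
    tests=["test_login.py"],
    artifact_paths=["build/login.tar.gz"],
    logs=["ci/run-1.log"],
    review_passed=True,
)
```

Here `draft` is rejected with a list of what is missing, which gives the agent a concrete rework target, while `complete` passes the gate.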

Three Executive Lessons

First, strategy must be executable by agents. Vague work stalls or hallucinates. Therefore, backlog items become machine-executable contracts — explicit, testable, ordered by value.

Second, governance must be real-time. Quarterly steering committees are too slow. Instead, daily health-checks, audit logs, WIP limits, and review gates are required.

Third, advantage moves to learning rate. Whoever upgrades, verifies, and redeploys fastest learns fastest. As a result, Scrum becomes the learning loop for human-plus-AI systems.

Executives who fund only models will be outpaced by executives who fund the operating model.
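One way to read "machine-executable contract" in the first lesson: a backlog item is eligible for an agent only when it states a goal and at least one testable acceptance criterion, and eligible items are ordered by value. The sketch below is a hypothetical illustration; `BacklogItem`, its fields, and `ready_queue` are assumed names, not part of any tool named in this post.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BacklogItem:
    title: str
    goal: str
    value: int  # hypothetical value score set by the Product Owner
    acceptance_criteria: List[str] = field(default_factory=list)

    def is_executable(self) -> bool:
        """Explicit (has a goal) and testable (has acceptance criteria)."""
        return bool(self.goal.strip()) and len(self.acceptance_criteria) > 0


def ready_queue(items: List[BacklogItem]) -> List[BacklogItem]:
    """Keep only executable items, highest value first."""
    eligible = [item for item in items if item.is_executable()]
    return sorted(eligible, key=lambda item: item.value, reverse=True)


items = [
    BacklogItem("Improve onboarding", goal="", value=9),  # vague: no goal, no tests
    BacklogItem("Rate-limit API", goal="Cap requests per client", value=5,
                acceptance_criteria=["Excess requests receive HTTP 429"]),
    BacklogItem("Export CSV", goal="Download report as CSV", value=7,
                acceptance_criteria=["File opens in a spreadsheet",
                                     "Header row matches report columns"]),
]
queue = [item.title for item in ready_queue(items)]
```

The vague item carries the highest value score but never reaches an agent until someone makes it explicit and testable, which is the point of the lesson.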

The Pattern Others Can Copy

If you want to start tomorrow, here is the checklist for Scrum for AI agent teams:

  • Backlog ordered by value
  • WIP limit enforced — start at 1
  • Agents self-select top eligible work
  • Evidence-based Definition of Done
  • Automated review gate (Grok / PO)
  • Health checks and recovery scripts
  • Daily value report
  • Audit trail end-to-end

Start with one agent team and one visible board. Next, set WIP = 1. Then require evidence for “done.” After that, add automated review and put health checks ahead of any work. Track value daily. Above all, improve the system every morning.

The Closing Line

In conclusion, AI agents do not eliminate Scrum. On the contrary, they make disciplined Scrum more important.

When the machine can think, leaders must design the system that tells it what value means, how work flows, and how “done” is proven. That system has a name. Indeed, it has been carrying complex work since 1993, and it scales just as cleanly across humans, agents, and the next thing after agents. After all, Scrum was always a protocol for friction-reduction in complex adaptive systems. The complex adaptive system simply acquired a new kind of participant — and it is here to stay.


📥 Download the Keynote Deck

The full slide deck — “When the Machine Can Think: Scrum for AI Agent Teams” — is available here:

Download the presentation

🇨🇷 Join the Live Keynote

Dr. Jeff Sutherland will deliver this keynote online to the Agile Forum Costa Rica 2026 — Rethinking Organizational Transformation.