Slide 1 — Operating a Design Agent Team
Welcome to the final module of Orchestrating Design Agent Teams. The previous five modules gave you the patterns: when to split work, how to run fan-outs and pipelines, how to chain tools, how to share a canvas, how to build a pipeline to production. This module is about keeping all of that running once it belongs to a team rather than to you. That means cost you can see and budget, permissions and audit trails for changes agents make, review capacity that scales with the volume agents produce, and a way to onboard people without a six-week ramp. None of it is glamorous. All of it decides whether the capability lasts.
Slide 2 — Cost: three metering models, one decision
Start with cost, because it is the first thing leadership asks about and the last thing most teams can answer. As of June 2026, the major vendors have converged on a similar ladder: roughly twenty dollars for entry, around a hundred for daily use, around two hundred for all-day use — those numbers will drift, so verify them on the official pages. What matters more is how usage is metered. Subscription windows are predictable but stall on heavy days. Credit metering never stalls but surprises you on the bill. Bring-your-own-key tracks spend exactly, as long as someone owns the tracking. Orchestration multiplies all of this — fan-outs and critique agents are exactly what the bigger tiers exist for. Pick the failure mode you can manage.
Slide 3 — From one opaque bill to per-workflow budgets
Here is the practical move: budget per workflow, not per subscription. The monthly design system audit, the concept explorations, the visual QA sweeps, the ops report — each one gets a rough budget, an owner, and an honest comparison against the manual work it replaces. That comparison is the keep-or-cut signal. If the audit's findings are still sitting unactioned next cycle, or the report nobody reads costs real money and review time, cut it or redesign it — that is operations working, not the approach failing. And keep a capped pool for exploratory runs, with one rule: anything you repeat twice becomes a named workflow with a budget of its own.
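If it helps to see the keep-or-cut comparison as arithmetic, here is a minimal sketch. Everything in it is an assumption for illustration: the `Workflow` fields, the hourly rate, the sample figures, and the three signal names are hypothetical, not a prescribed format. It compares agent spend plus review time against the manual work replaced, and flags output nobody acted on.

```python
from dataclasses import dataclass


@dataclass
class Workflow:
    # All field names are hypothetical; adapt them to what you actually track.
    name: str
    owner: str
    budget: float            # rough budget per cycle, in your billing currency
    spend: float             # actual agent spend this cycle
    review_hours: float      # human review time the workflow consumed
    manual_hours: float      # hours the manual equivalent would have taken
    output_actioned: bool    # did anyone act on the output this cycle?


def keep_or_cut(wf: Workflow, hourly_rate: float) -> str:
    """Return a rough signal for one workflow: keep, over-budget, or cut-or-redesign."""
    agent_cost = wf.spend + wf.review_hours * hourly_rate
    manual_cost = wf.manual_hours * hourly_rate
    # Output nobody acts on costs money and review time for nothing.
    if not wf.output_actioned:
        return "cut-or-redesign"
    # No saving over the manual work it replaces.
    if agent_cost >= manual_cost:
        return "cut-or-redesign"
    # Worth keeping, but the budget or the run needs a second look.
    if wf.spend > wf.budget:
        return "over-budget"
    return "keep"


audit = Workflow("design system audit", "platform-design", 50.0, 40.0, 2.0, 8.0, True)
unread = Workflow("ops report variant", "design-ops", 30.0, 25.0, 1.0, 4.0, False)
print(keep_or_cut(audit, hourly_rate=60.0))
print(keep_or_cut(unread, hourly_rate=60.0))
```

The point of the sketch is the shape of the comparison, not the thresholds. Review time counts as cost, and an unactioned output fails regardless of how cheap the run was.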
Slide 4 — Permissions: who and what may change which surfaces
Next, permissions. The question is not whether you trust the agent — it is which surfaces have a high blast radius, and who or what may change them. Token files, shared components, anything that publishes to users: those propagate everywhere, so they default to deny and go through gates. Scratch branches and draft canvases stay open, because that is where speed lives. Scope MCP credentials to the least access that works — read-only is enough more often than you would think. And write the map down, versioned next to your harness. A permission scheme that lives in someone's head is not an operating model; it is a single point of failure.
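A written-down permission map can be as small as a lookup table. The sketch below is one possible shape, not a standard. The path prefixes, the `gated` and `open` labels, and the rule that unmapped surfaces are denied are all assumptions made for the example.

```python
# Hypothetical permission map: surface prefix -> rule.
# "gated" surfaces default to deny and need a named approval; "open" surfaces do not.
PERMISSION_MAP = {
    "tokens/": "gated",
    "components/shared/": "gated",
    "publish/": "gated",
    "scratch/": "open",
    "drafts/": "open",
}


def decide(path: str, has_approval: bool = False) -> str:
    """Return 'allow' or 'deny' for an agent-driven change to `path`."""
    matches = [prefix for prefix in PERMISSION_MAP if path.startswith(prefix)]
    if not matches:
        # Assumption: a surface nobody has mapped yet is treated as high blast radius.
        return "deny"
    rule = PERMISSION_MAP[max(matches, key=len)]  # longest matching prefix wins
    if rule == "open":
        return "allow"
    return "allow" if has_approval else "deny"


print(decide("scratch/hero-variants.json"))
print(decide("tokens/spacing.json"))
print(decide("tokens/spacing.json", has_approval=True))
```

Because the map is plain data, it can sit in the same repository as the harness and change through the same review as everything else.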
Slide 5 — Audit trails: what changed, why, and on whose approval
Audit trails answer one question: where did this change come from? With agent volume, you cannot answer that from memory. So every agent-driven change lands as a diff — a branch, a pull request, a canvas snapshot — never a silent edit. The change references the brief or the workflow that produced it, so a reviewer judges it against intent. The approver is named, because checks can be automated but accountability cannot. And reports follow the same rule: every number links to the script or export that computed it. Months later, when someone asks why the spacing token changed, you have a boring, checkable answer. Boring is the goal.
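As a rough illustration of those rules, an audit record needs only three things to answer where a change came from. The `ChangeRecord` type and its field names below are hypothetical, and the sample references are placeholders rather than real identifiers.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ChangeRecord:
    # Hypothetical fields; the values are whatever your tooling uses as references.
    diff_ref: Optional[str]     # branch, pull request, or canvas snapshot
    source_ref: Optional[str]   # brief or workflow that produced the change
    approver: Optional[str]     # named human who approved it


def audit_gaps(record: ChangeRecord) -> List[str]:
    """List what is missing before the change can be called explainable."""
    gaps = []
    if not record.diff_ref:
        gaps.append("no reviewable diff (silent edit)")
    if not record.source_ref:
        gaps.append("no brief or workflow reference")
    if not record.approver:
        gaps.append("no named approver")
    return gaps


complete = ChangeRecord("pull-request-placeholder", "spacing-audit-brief", "a.reviewer")
silent = ChangeRecord(None, "spacing-audit-brief", None)
print(audit_gaps(complete))
print(audit_gaps(silent))
```

A check like this could run as a gate before merge, so that a change with gaps never lands in the first place.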
Slide 6 — The operating model in one picture
Here is the whole operating model in one picture. Four layers. Harness ownership: the instruction files, skills, and saved workflows, versioned and owned by a named person. Budgets and cost: tracked per workflow, owned by whoever signs the invoice. Review gates: scoped permissions, audit trails, and a named approver on every change. Reporting: agents assemble the evidence, every figure links to its source, and a human signs it. Underneath sits the cadence — weekly spend checks, per-change gates, a signed report each cycle, and a quarterly review of the harness and the policies themselves. Two tests for whether this is real: every layer has a named owner, and every layer has a slot in the cadence. Miss either and it is decoration.
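The two tests are mechanical enough to write down. This is a sketch under assumptions: the dictionary layout and the owner labels are invented for the example, and the pairing of each layer with one cadence slot is one reading of the picture, not the only one.

```python
from typing import Dict, List

# Hypothetical owners; replace with real names. The cadence slots come from the model above.
OPERATING_MODEL: Dict[str, Dict[str, str]] = {
    "harness ownership": {"owner": "harness-owner", "cadence": "quarterly harness and policy review"},
    "budgets and cost": {"owner": "invoice-signer", "cadence": "weekly spend check"},
    "review gates": {"owner": "named-approver", "cadence": "per-change gate"},
    "reporting": {"owner": "report-signer", "cadence": "signed report each cycle"},
}


def decoration_layers(model: Dict[str, Dict[str, str]]) -> List[str]:
    """Return the layers that fail either test: no named owner, or no cadence slot."""
    return [
        name
        for name, layer in model.items()
        if not layer.get("owner", "").strip() or not layer.get("cadence", "").strip()
    ]


print(decoration_layers(OPERATING_MODEL))
```

An empty list means every layer passes both tests. Any name that comes back is a layer that currently exists only on paper.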
Slide 7 — Review capacity: the bottleneck moves to human attention
Now the constraint that sneaks up on every team: review. Orchestration multiplies what gets produced, but it does not multiply the people qualified to judge it. So plan review like a resource. Estimate the load when you plan the run — a four-worker fan-out is four reviews and a merge, not one. Tier it: automated checks on everything, critic agents against explicit criteria, and human judgment only on what survives. Name the reviewer before the run starts. And watch for the two failure modes: rubber-stamping when the queue grows, and adding more agents when the real constraint is the reviewer. Unreviewed output is not capacity — it is risk you have postponed.
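The load estimate is simple arithmetic, which is exactly why it is worth doing before the run. A minimal sketch follows, with assumed figures: the minutes per review, the merge time, and the reviewer's available time are placeholders you would replace with your own.

```python
def review_items(workers: int) -> int:
    """A fan-out of N workers produces N reviews plus one merge."""
    return workers + 1


def review_minutes(workers: int, minutes_per_review: float, merge_minutes: float) -> float:
    """Estimated human attention for one fan-out run."""
    return workers * minutes_per_review + merge_minutes


def reviewer_is_saturated(planned_minutes: float, available_minutes: float) -> bool:
    """True when the planned runs need more attention than the reviewer has."""
    return planned_minutes > available_minutes


# Assumed figures: 15 minutes per review, 20 minutes for the merge, 60 minutes available.
load = review_minutes(workers=4, minutes_per_review=15, merge_minutes=20)
print(review_items(4), load, reviewer_is_saturated(load, available_minutes=60))
```

If the saturation check comes back true, the fix is fewer or smaller runs, or stricter automated tiers in front of the human, not more agents.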
Slide 8 — Onboarding without the six-week ramp
Onboarding is where you find out whether you built a system or a pile of personal setups. If the harness carries the team's knowledge — instruction files, skills, saved workflows, conventions — a new designer gets most of it by cloning the repo and reading three things: the operating agreement, the permission map, and one traced run. Start them on low-blast-radius work: audits, drafts, scratch branches. Review their briefs before their output, because briefing quality is the leading indicator. And expand their access surface by surface, based on demonstrated judgment, not tenure. If onboarding still takes six weeks, the problem is not the new person — it is that the knowledge never made it into the harness.
Slide 9 — Governance that enables vs governance that strangles
Governance gets a bad name because most of it is written to prevent things rather than to enable them safely. The pattern that works: govern surfaces and workflows, not individual runs. Pre-approved workflows run freely; new tools and new surfaces need a decision once. Scoped access with gates beats blanket bans, because bans just create workarounds you cannot see. Per-workflow budgets with a keep-or-cut review beat spending freezes. And one line worth writing into both the policy and the agents themselves: reporting describes team-level distributions, never individuals. The moment the numbers become a surveillance tool, people optimise for the report instead of the work — and the whole system loses the trust it runs on.
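The team-level rule can be enforced in the reporting script itself rather than left to good intentions. The sketch below assumes a hypothetical row format with `squad`, `person`, and `value` keys, and adds one assumption of its own: groups smaller than `min_group` are suppressed, since a distribution over one or two people still identifies them.

```python
from collections import defaultdict
from statistics import median
from typing import Dict, Iterable, Optional


def team_distribution(rows: Iterable[dict], min_group: int = 3) -> Dict[str, Optional[dict]]:
    """Aggregate per-person rows into per-squad summaries; person ids never leave this function."""
    by_squad = defaultdict(list)
    for row in rows:
        by_squad[row["squad"]].append(row["value"])  # the "person" key is deliberately ignored
    report: Dict[str, Optional[dict]] = {}
    for squad, values in by_squad.items():
        if len(values) < min_group:
            report[squad] = None  # too few people to report without pointing at someone
        else:
            report[squad] = {"n": len(values), "median": median(values)}
    return report


rows = [
    {"squad": "checkout", "person": "p1", "value": 4},
    {"squad": "checkout", "person": "p2", "value": 7},
    {"squad": "checkout", "person": "p3", "value": 5},
    {"squad": "search", "person": "p4", "value": 9},
]
print(team_distribution(rows))
```

Putting the aggregation at the boundary means the report cannot drift into per-person numbers without someone changing the code, and that change shows up in review.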
Slide 10 — Failure review: when an agent-driven change goes wrong
Eventually something gets through: a token change that breaks a product surface, a page published before it should have been. What matters is the response. Contain first: revert the diff, roll back the snapshot. That is why everything lands as a reviewable change. Then reconstruct from the audit trail: which brief, which run, who approved it. Then diagnose properly. "The agent made a mistake" is a description, not a diagnosis. The real question is which layer let it through: the brief, the harness, a missing check, or a saturated reviewer. Fix that layer, encode the fix in the harness, and keep the workflow running unless the failure tells you the surface should never have been automated. Both answers are allowed. Choose deliberately.
Slide 11 — Worked example: an operations review of a running team
Let's see the operating model working, using the documented case from the design ops reporting workflow: five squads, six repos, a platform-design pair, and a quarterly review. The reporting layer showed token coverage rising from sixty-one to seventy-eight percent over two quarters, and because every figure linked to its counting script, the leads argued about priorities, not about the numbers. The cost layer was boring in the best way: forty-minute runs, a day of manual assembly down to two hours, one unread report variant cut. The interesting finding was review capacity — the audit was producing findings faster than squads could action them, so the team slowed it down rather than speeding it up. And the backlog signal, nine stale component requests from the lowest-coverage squads, was raised as a question. The follow-up conversation found the answer. That is operations doing its job.
Slide 12 — Exercise: draft your one-page operating agreement
Your turn. Draft the one-page operating agreement for your team. List the recurring workflows, each with an owner and a rough budget. Name the gated surfaces — tokens, shared components, anything that publishes — and who approves changes to each. For every workflow, name the reviewer and the checks that run first. Write down what gets reported, on what cadence, including the line that says no per-person metrics. And finish with the failure steps and the date of your first quarterly review. Keep it to one page, write it with the people who will live under it, and be honest about what you will actually do — a modest agreement that is followed beats an impressive one that is ignored.
Slide 13 — Summary, and where to go next
Let's close the course. Operating a design agent team comes down to four layers and a cadence. Cost managed per workflow, with budgets, owners, and the honesty to cut runs that are not worth it. Permissions and audit trails that make every agent-driven change explainable. Review capacity treated as the real constraint, because it is. And a harness, an operating agreement, and a reporting cycle that are written down, owned, and revisited quarterly — since the tools and prices underneath all of this keep moving. The patterns from the earlier modules are what make orchestration powerful. The operating layer is what makes it durable. Thanks for taking the course — now go put the first quarterly review in the calendar.