Agentic Design School
Module 1 of 6
35–45 minutes

Orchestrating Design Agent Teams

When One Agent Is Not Enough

The decision this whole course hinges on: multi-agent setups carry a real coordination tax, and most tasks do not repay it. This module covers the signals that a task genuinely splits, the costs people underestimate, and the single-agent alternatives to try first.

Duration: 35–45 minutes

Slides: 12 slides with notes and narration

Learning objectives

  • Name the coordination costs of multi-agent work: merge conflicts, divergent assumptions, review load.
  • Identify the task properties that justify splitting: independent zones, parallel deadlines, distinct expertise.
  • Exhaust the single-agent alternatives — better harness, longer runs, sequenced sessions — before splitting.
Slide deck

Work through the module

Each slide is shown in its 16:9 frame, exactly as it appears in the video version. Open the notes under any slide for the longer explanation, and the narration if you prefer to read along.

Slide 1 of 12 · 16:9

When One Agent Is Not Enough

Orchestrating Design Agent Teams · Module 1 of 6

  • More agents is a cost, not a feature
  • The coordination tax: merging, drift, and multiplied review
  • The signals that a task genuinely splits — and the signals it does not
  • Single-agent alternatives to exhaust first, and a decision map you can reuse

This module exists to stop you splitting work that should stay together. Everything after it assumes the split was worth making.

Slide notes

This is an advanced course, and the audience already runs single-agent workflows comfortably. The temptation at that point is to treat multiple agents as the obvious next level — more agents, more output, more leverage. The first module pushes back on that framing on purpose: the rest of the course teaches orchestration patterns, MCP chains, concurrent canvases, and pipelines, and all of it is wasted effort if the underlying task should have stayed with one agent.

Set the expectation that this module is a decision module, not a tooling module. There are no agent-team configurations here, no task lists, no worktrees — those arrive in Module 2 and beyond. What this module gives you is the test: three questions, an honest cost ledger drawn from runs executed for the school's own articles, and a worked example where the same task is evaluated both ways.

It helps to say the conclusion up front so nobody spends the module waiting for the reveal: staying with one agent is the default, escalation is cheap because the artifacts are just files, and every exit on the single-agent side of the decision map is a good outcome. The course is not selling multi-agent work. It is teaching you to run it well on the minority of tasks that earn it.

Narration for this slide

Welcome to Orchestrating Design Agent Teams. Before we talk about patterns, pipelines, or shared canvases, we need to settle the question the whole course hinges on: when is one agent genuinely not enough? The honest answer is — less often than you would think. Multi-agent setups carry a real coordination tax, and most design tasks do not repay it. So this first module is a decision module. We will name the costs people underestimate, identify the signals that a task actually splits, and walk the single-agent alternatives you should exhaust first. If you finish this module deciding to keep most of your work with one agent, that is the module working.

Slide 2 of 12 · 16:9

The coordination tax

Splitting work across agents does not just multiply the output. It multiplies the work around the output.

  • Orchestration setup: briefs, shared constraints, and gates written before any work starts
  • Duplicated context: every worker re-reads the design system and the relevant files
  • Divergent assumptions: parallel workers cannot see each other's decisions
  • Multiplied review: a merge step that cannot be skipped, plus debugging across separate histories

The bill for coordination arrives whether or not the split helped. That is what makes splitting too early so expensive.

Slide notes

Walk the four bullets as a ledger, because each one is a cost that exists only because the work was split. Orchestration setup is the most visible: in the five-surface run executed for the school's orchestration article, writing the parent contract, the shared constraints, and five worker briefs took about 22 minutes before any worker produced anything. Duplicated context is quieter — every worker reads the design system, the constraints, and its slice of the project from scratch, which is most of why token use multiplies.

Divergent assumptions are the failure class that defines multi-agent work. Two workers each make a locally reasonable decision — one marks recently reviewed content with a green pill, the other with a yellow badge — and the outputs conflict globally. Cognition's widely cited 'Don't Build Multi-Agents' essay makes exactly this argument for write-heavy work: parallel workers that cannot see each other's reasoning will make conflicting assumptions. The school's own traced runs reproduced it on the first attempt, deliberately and then not so deliberately.

Multiplied review is the cost teams underestimate most. The merge is a real design review — reading every output, classifying conflicts, resolving them against the design system, recording what was rejected — and when something is wrong in the merged result, the cause can live in any worker's history, in the brief that shaped it, or in the constraint nobody wrote. None of this is an argument against splitting. It is the price list you check before you do.

Narration for this slide

Let's start with the costs, because they are the part people skip. When you split work across agents, you do not just multiply the output — you multiply the work around the output. You write briefs and constraints before anything starts. Every worker re-reads the design system and the project from scratch, which is where the token bill comes from. Workers that cannot see each other's decisions make conflicting assumptions — that is not a risk, it is the default. And at the end there is a merge review you cannot skip, plus debugging that now spans separate histories. Here is the uncomfortable part: that bill arrives whether or not the split helped. So before you pay it, you check the price list.

Slide 3 of 12 · 16:9

Signals a task genuinely splits

The decision is not based on how big the task sounds. It is based on boundary quality.

  • Independent zones: pages, bands, dashboard panels, canvas regions with no shared files
  • Independent stages: research, spec, implementation, QA — each output feeds the next
  • Distinct expertise: a builder, a read-only reviewer, a QA pass with its own focus
  • Ownership you can state in one sentence per worker, with explicit exclusions
  • Parallel deadlines where wall-clock time on separable surfaces actually matters

The clearer the boundary, the safer the delegation. The blurrier the boundary, the more the merge destroys the benefit.

Slide notes

The test that does the most work here is the one-sentence boundary. If you cannot describe what a worker owns — and what it must not touch — in one sentence, you do not have a decomposition; you have a wish. A full dashboard with navigation, a queue table, filters, charts, and mobile QA usually passes the test, because each of those is a place or a stage with its own files. A landing-page hero refinement usually fails it, because everything in it depends on everything else.

The three shapes of split worth naming now, ahead of Module 2, are spatial, by stage, and by expertise. Spatial splits — pages, bands, regions — parallelise well and fail at the seams. Stage splits — spec, implementation, QA — barely parallelise but merge almost for free, because each output feeds the next. Expertise splits work best when most of the specialist roles are read-only reviewers rather than writers; a visual-QA worker that reads everything and edits nothing is the cheapest worker you will ever add.

The parallel-deadline signal needs the honest caveat: parallelism only buys wall-clock time when the workers genuinely run at the same time and the surfaces are genuinely separable. A split that runs sequentially because the environment or the dependencies force it to has all of the coordination cost and none of the speed.

Narration for this slide

So when does a task genuinely split? Not when it sounds big — when it has real boundaries. Independent zones: pages, bands, dashboard panels, canvas regions that do not share files. Independent stages: research feeding a spec, a spec feeding implementation, implementation feeding QA. Distinct expertise: a builder, plus a read-only reviewer that checks the work without touching it. The test that settles most cases is simple: can you state each worker's ownership in one sentence, including what it must not touch? If you can, delegation is safe. If you cannot, the merge will eat whatever time the split saved. Boundary quality is the whole decision.

Slide 4 of 12 · 16:9

Signals it does not split

Each of these looks like a candidate for more agents and is actually an argument for one.

  • Tightly coupled taste: a polish pass where spacing, copy tone, and hierarchy are tuned against each other
  • One shared file or component: two writers on the same surface turn the merge into a rescue
  • Unclear ownership: if you cannot name who decides, splitting just multiplies the people not deciding
  • Small tasks: the orchestration setup alone can exceed the whole single-agent budget
  • Sequential dependencies: stages that wait on each other gain coordination cost without parallel speed

Polish is the canonical case: everything overlaps with everything, so there is no clean boundary to give a worker.

Slide notes

The polish pass deserves the most time because it is the most tempting and the worst candidate. A product detail page needing final refinement before a stakeholder review is mostly hierarchy, spacing rhythm, copy tone, and a few mobile adjustments — decisions that only make sense relative to each other. Split that across three workers and the merged result frequently feels less coherent than the starting point, because each worker optimised local details and weakened the whole rhythm. One agent should hold that taste loop, full stop.

The shared-file signal is mechanical rather than aesthetic: two agents editing the same component, the same token file, or the same frame produce a merge that is harder than the original task. The platform vendors say the same thing in their own documentation — as of June 2026, Anthropic's Agent Teams guidance explicitly recommends a single session or subagents for sequential and same-file work, and file-level ownership per teammate when you do split. When the vendor selling the multi-agent surface tells you when not to use it, believe them.

The small-task signal is just arithmetic. In the school's traced runs, briefs and constraints took nine minutes at two-worker scale and twenty-two at five-worker scale; for a single-surface change, that setup is most of the total budget. And unclear ownership is the organisational version of the shared-file problem: splitting a task nobody owns does not create owners, it creates more outputs nobody owns.
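The arithmetic can be sketched in a few lines. This is a minimal illustration, not a measurement tool: the two setup figures are the ones quoted above, while the function names and the 12-minute single-agent budget are hypothetical values chosen only to show the shape of the calculation.

```python
# Setup minutes observed in the school's two traced runs (figures from the notes above).
SETUP_MINUTES = {2: 9, 5: 22}  # number of workers -> minutes of briefs and constraints


def setup_per_worker(workers: int, minutes: float) -> float:
    """Average setup minutes spent per worker."""
    return minutes / workers


def setup_share(setup_minutes: float, single_agent_budget: float) -> float:
    """Setup time as a fraction of what one agent would need for the whole task."""
    return setup_minutes / single_agent_budget


for workers, minutes in SETUP_MINUTES.items():
    print(f"{workers} workers: {setup_per_worker(workers, minutes):.1f} min of setup per worker")

# Hypothetical small task: assume one agent could finish it in 12 minutes.
print(f"setup alone is {setup_share(9, 12):.0%} of the single-agent budget")
```

On these two data points the setup cost comes out at roughly four and a half minutes per worker, which is why a small task rarely recovers it.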

Narration for this slide

Now the other side — the signals that a task should stay together, even when more agents feel tempting. Tightly coupled taste work is the big one: a polish pass where spacing, tone, and hierarchy are tuned against each other has no clean boundary to hand a worker, and a split version usually comes back less coherent than it started. One shared file or component means two writers on the same surface, and that merge becomes a rescue. If you cannot say who owns a decision, more agents just means more outputs nobody owns. Small tasks cannot repay the setup time. And stages that wait on each other pay the coordination cost without getting any parallel speed back.

Slide 5 of 12 · 16:9

Single-agent alternatives to exhaust first

Most of what people reach for multiple agents to fix is fixable inside one session.

  • A better harness: design rules, component names, and anti-patterns the agent reads every run
  • Longer, planned runs: a reviewed plan up front lets one agent work a large surface in stages
  • Sequenced sessions: one agent, several sittings, each picking up the previous session's artifacts
  • A read-only critique pass: the same agent (or a cheap second one) reviewing against written criteria
  • Escalate later: the briefs, constraints, and logs you would need are just files — nothing is locked in

Staying with one agent is the default, not a consolation prize. Escalation stays cheap precisely because the artifacts are files.

Slide notes

Most of the dissatisfaction that drives people to multi-agent setups is actually harness debt. Inconsistent output, invented colours, ignored conventions — none of those are fixed by adding workers, because every new worker inherits the same missing rules. Tightening the harness file, naming the components to reuse, and writing the anti-pattern list raises the floor for one agent and for any future team equally, so it is never wasted work.

Longer runs and sequenced sessions cover the 'the task is too big for one sitting' worry. A single agent with an approved plan can work a large surface in stages, and a sequence of sessions, each starting from the previous session's artifacts and notes, keeps one continuous line of design decisions without any merge step at all. The single-agent baseline in the school's five-surface run covered all five surfaces in about 28 minutes in one pass; what it lacked was not coverage but an independent reviewer and a written decision record.

That last gap is the honest one, and the read-only critique pass addresses most of it: a second pass over the output against written criteria, by the same agent or a cheap separate one, is not orchestration — it is a crit. Keep the escalation point explicit: if the task later grows real boundaries, the parent brief, worker briefs, and merge log you would need are ordinary files. Starting single-agent locks you out of nothing.

Narration for this slide

Before you split anything, exhaust the single-agent alternatives, because most of them fix the actual problem. If the output is inconsistent, that is harness debt — tighten the design rules and the anti-pattern list, and every future run improves. If the task feels too big, run it longer with a reviewed plan, or run it across sequenced sessions where each sitting picks up the last one's artifacts. If what you are missing is a second opinion, add a read-only critique pass against written criteria — that is a crit, not an orchestra. And remember the escape hatch: if the task later grows real boundaries, escalating is cheap, because everything you would need is just files.

Slide 6 of 12 · 16:9

The decision map

Three questions, asked before any worker exists. Any failed answer routes to the single-agent default.

Decision diagram with a central gate panel listing three questions: can each worker's ownership boundary be stated in one sentence, do the slices avoid sharing files, components, or one taste loop, and is the task valuable enough to pay the coordination and token cost. Any answer of no leads left to the default of one well-harnessed agent, suited to coupled taste work, single surfaces, and small tasks, at the cost of longer runs, sequenced sessions, and no independent reviewer. All three answers of yes lead right to splitting across agents, suited to independent zones, parallel deadlines, and distinct expertise, at the cost of orchestration setup, duplicated context, multiplied tokens, and a mandatory merge review.

Both exits are good outcomes, and both have costs. The single-agent path trades parallelism and independent review for simplicity; the multi-agent path trades setup, tokens, and a mandatory merge for separable speed and surfaced conflicts.

Most design tasks fail at least one of the three questions — and that is the map working, not the map being pessimistic.

Slide notes

Walk the gate panel first, because the order of the questions matters less than the fact that they are asked before any worker exists. Boundaries: one sentence of ownership per worker, including exclusions. Separation: no shared files, no shared components, and no single taste loop stretched across workers. Value: the task has to be worth the coordination and token cost plus a merge review that cannot be skipped. Gates invented after the outputs arrive have a way of bending around whatever came back, which is why the map insists on asking first.
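For readers who think in code, the gate can be written down as a small function. This is a sketch of the routing rule only; the class, field, and function names are invented for illustration, and answering the three questions honestly is still the human part.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateAnswers:
    """Answers to the three questions, recorded before any worker exists."""

    one_sentence_boundaries: bool  # each worker's ownership fits one sentence, with exclusions
    slices_are_separate: bool      # no shared files, components, or single taste loop
    worth_the_cost: bool           # value covers coordination, tokens, and the merge review


def route(answers: GateAnswers) -> str:
    """Return 'split' only when all three answers are yes; otherwise the default."""
    if (
        answers.one_sentence_boundaries
        and answers.slices_are_separate
        and answers.worth_the_cost
    ):
        return "split"
    return "single-agent"


# A polish pass has no clean boundary and one shared taste loop, so it stays with one agent.
polish_pass = GateAnswers(
    one_sentence_boundaries=False, slices_are_separate=False, worth_the_cost=True
)
print(route(polish_pass))

# A multi-surface redesign that passes all three questions routes to a split.
redesign = GateAnswers(
    one_sentence_boundaries=True, slices_are_separate=True, worth_the_cost=True
)
print(route(redesign))
```

Writing the answers into a frozen record before the run is one way to keep the gates from bending around whatever comes back.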

Then make the symmetry explicit: both exits are outcomes with costs, not a good path and a bad path. The single-agent side gives up parallelism, gives up an independent reviewer, and asks one context to carry every decision — which is fine for most tasks and exactly wrong for a handful. The multi-agent side buys separable speed, isolation, and surfaced conflicts, and pays in setup, duplicated context, multiplied tokens, and the merge.

The line under the gate panel — most design tasks fail at least one question — is the empirical claim worth defending out loud. Component tweaks, single-page refinements, token cleanups, copy passes, and polish rounds make up most of a working week, and none of them pass all three. The tasks that do pass — design-system audits at scale, multi-surface redesigns, multi-direction concept exploration — are exactly the workflows the rest of this course is built around.

Narration for this slide

Here is the whole module on one diagram. Three questions sit in the middle, and you ask them before any worker exists. Can you state each worker's ownership in one sentence? Do the slices avoid sharing files, components, or one taste loop? And is the task valuable enough to pay the coordination cost plus a merge review you cannot skip? Any no routes you left, to one well-harnessed agent — the default. All three yeses route you right, to a genuine split. Notice that both sides list costs. The single-agent path gives up parallelism and an independent reviewer. The multi-agent path pays in setup, tokens, and the merge. Most tasks fail at least one question, and that is the map doing its job.

Slide 7 of 12 · 16:9

Honest cost accounting from real runs

Two runs executed for this school's articles, plus the published figures — labelled for what they are, as of June 2026.

| | Two-worker spec run | Five-surface orchestrated run | Single-agent baseline (same brief) |
|---|---|---|---|
| Wall clock | ~25 min | ~77 min sequential (~46 min if parallel, estimate) | ~28 min, one pass |
| Setup before work | ~9 min of briefs | ~22 min of contract, constraints, briefs | ~3 min |
| Conflicts and gaps caught | 1 conflict, 1 unowned gap | 2 conflicts, 2 unowned gaps, 5 QA findings | 0 (nothing surfaced, nothing recorded) |
| Token cost | ~2x the text written (estimate) | ~3.5–4x baseline (estimate) | ~1x |
| Speed verdict | Did not pay for itself as a one-off | Slower as run; faster only if truly parallel | Fastest, and least defensible afterwards |

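At small scale the split buys evidence, decision records, and surfaced conflicts, not speed. Anthropic reports roughly 15x the tokens of a single chat interaction for its multi-agent research system; that is a research workload, so treat it as an order-of-magnitude warning rather than a design benchmark.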

Slide notes

These numbers come from runs executed for the school's two multi-agent articles, and their honesty matters more than their precision. The two-worker run produced two specification documents for a field-notes section: about 25 minutes of wall clock, nine of them spent on briefs, one deliberate conflict surfaced and resolved at merge, and roughly twice the text written compared with what one agent would have needed. As a one-off it did not pay for itself in speed; what it bought was the conflict arriving as two comparable proposals, a written decision record, and reusable briefs.

The five-surface run makes the same point at larger scale. Orchestrated: about 77 minutes as executed, with the worker passes running sequentially because the environment could not spawn parallel workers — roughly 46 minutes is the estimate had they run in parallel. The single-agent baseline on the same brief: about 28 minutes, one pass, perfectly usable, no conflicts surfaced because one context decided everything implicitly, and no record of why. The tax bought two resolved conflicts, an independent QA pass, explicit state coverage, and a merge log.
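The size of the tax in wall-clock terms follows directly from those three figures. The sketch below only restates the arithmetic; the function name is hypothetical, and the 46-minute value is the school's own estimate rather than a measured run.

```python
BASELINE_MIN = 28  # single-agent baseline on the same brief


def tax(run_minutes: float, baseline_minutes: float = BASELINE_MIN) -> tuple[float, float]:
    """Return (extra minutes, slowdown ratio) of a run relative to the baseline."""
    return run_minutes - baseline_minutes, run_minutes / baseline_minutes


runs = [
    ("orchestrated, as executed", 77),
    ("orchestrated, parallel (estimate)", 46),
]
for label, minutes in runs:
    extra, ratio = tax(minutes)
    print(f"{label}: +{extra} min, {ratio:.2f}x the baseline")
```

As executed, the orchestrated run cost 49 extra minutes; even on the parallel estimate it stays 18 minutes behind the baseline, so the case for it rests on what the merge surfaced, not on speed.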

The external figures point the same direction more sharply. Anthropic's engineering write-up on its multi-agent research system reports roughly fifteen times the tokens of a single chat interaction; that is a research workload, not a design benchmark, so quote it for scale only. Claude Code's cost guidance adds that team token use scales with team size and idle teammates keep consuming until shut down. Date-stamp all of this as June 2026 figures and re-check them before quoting them in a finance conversation.

Narration for this slide

Let's put numbers on the tax, from runs executed for this school's own articles. A two-worker spec run took about twenty-five minutes, nine of them writing briefs, and roughly doubled the text written — as a one-off, it did not pay for itself in speed. A five-surface orchestrated run took about seventy-seven minutes as executed, against a twenty-eight minute single-agent baseline on the same brief. So what did the extra cost buy? Not volume — the baseline covered everything. It bought two conflicts surfaced and resolved before implementation, an independent QA pass, and a written record of what was rejected and why. And the published figures point the same way: Anthropic reports multi-agent research systems using around fifteen times the tokens of a single chat. The tax is real. Pay it on purpose or not at all.

Slide 8 of 12 · 16:9

What the platforms themselves recommend

The vendors selling multi-agent surfaces publish the same caution this module teaches. As verified June 2026.

  • Claude Code: subagents are stable bounded workers; Agent Teams are experimental and opt-in
  • Anthropic's own guidance: sequential and same-file work belongs in a single session or subagents
  • File-level ownership per teammate; sizing guidance of 3–5 focused workers, not a swarm
  • Codex, OpenCode, and OpenPencil express the same operating model: bounded workers, owned outputs, a merge
  • The published counter-position (Cognition): write-heavy, coupled work does better single-threaded

When the vendor of the multi-agent surface tells you when not to use it, that is not marketing caution — it is the same separability test.

Slide notes

The point of this slide is not a tool survey — Module 2 and the platform article carry that — it is that the decision framework in this module is not contrarian. As of June 2026, Anthropic ships two distinct surfaces in Claude Code: subagents, which are stable bounded workers with their own context that report back to the calling session, and Agent Teams, which are experimental, disabled by default, and switched on with an environment flag. The Agent Teams documentation itself recommends a single session or subagents for sequential and same-file work, file-level ownership per teammate, and a team size of three to five workers — with the memorable line that three focused teammates often outperform five scattered ones.

The other platforms express the same operating model with different machinery. Codex offers subagent workflows, worktree-isolated parallel sessions, and hosted cloud tasks. OpenCode makes agent roles project configuration, with per-agent permissions for editing, shell access, and which subagents may be invoked. OpenPencil applies the idea to a shared canvas, with concurrent agents owning regions. In every case the durable parts are the same: a shared contract, bounded workers, owned outputs, and a merge step nobody automates away.

Name the counter-position too, because students will encounter it: Cognition's 'Don't Build Multi-Agents' essay argues that for write-heavy work where every decision depends on its neighbours, parallel workers make conflicting assumptions and a single-threaded agent with full context simply does better. LangChain's follow-up reconciles the positions around the same separability test this module teaches. Date-stamp all of it: these surfaces are moving quickly, and the experimental ones change shape between quarters.

Narration for this slide

Here is something worth noticing: the vendors selling multi-agent surfaces publish the same caution this module does. As of June 2026, Anthropic's own Agent Teams documentation says sequential and same-file work belongs in a single session or with subagents, recommends file-level ownership per teammate, and suggests three to five focused workers rather than a swarm. Codex, OpenCode, and OpenPencil all express the same operating model — bounded workers, owned outputs, and a merge. And the strongest published counter-position, from Cognition, argues that write-heavy coupled work does better with one agent holding the full context. The decision map you just saw is not contrarian. It is the consensus, written down for design work.

Slide 9 of 12 · 16:9

Worked example: one task, evaluated both ways

A real task from the school's own runs: specify a field-notes section — a repeated card plus the section frame around it.

| | Kept with one agent | Split across two workers |
|---|---|---|
| Boundaries | One context holds card and frame together | Card spec and frame spec, one sentence of ownership each |
| Shared surfaces | None to worry about: one writer | One deliberate overlap: how a recently reviewed note is marked |
| What came back | One coherent spec, freshness rule decided implicitly | Two specs; a yellow badge vs a green stripe (a real conflict) |
| Cost | ~10–15 min, one pass | ~25 min, ~2x the text, including an 8-minute merge review |
| What the extra bought | Nothing recorded; nothing surfaced | The conflict on the record, a merge log, reusable briefs |

Both evaluations are defensible. The split was slower and produced evidence; the single run was faster and produced no record. The decision map tells you which trade this task deserves.

Slide notes

This is the two-worker trace from the school's article on when to use multiple design agents, and its value is precisely that it sits on the boundary of the decision map. The task — specify a field-note card and the section frame that holds it — passes the boundary question, just barely passes the separation question, and arguably fails the value question for a one-off. Running it both ways makes the trade visible instead of theoretical.

Kept with one agent, the work takes ten to fifteen minutes and comes back coherent, because one context holds both surfaces and decides the freshness treatment implicitly along the way. Split across two workers, the briefs took nine minutes, each worker produced its spec in about four, and the merge review took eight more. The deliberate overlap did exactly what it was designed to do: Worker 1 marked recently reviewed notes with a school-yellow badge in the card's meta row; Worker 2 switched the card's accent stripe to green and added an 'Updated this month' line — reaching across the ownership boundary into the other worker's surface. The merge resolved it against the design system, which reserves green for workflow cues, and recorded the rejected option.
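The timing above can be checked with a simple wall-clock model. This is a sketch under a stated assumption: the quoted figures add up to about 25 minutes only if the two worker passes are counted one after the other, and whether they actually ran back to back is not stated in the trace. The function name and the parallel figure are illustrative, not measured.

```python
def wall_clock(setup: float, workers: list[float], merge: float, parallel: bool) -> float:
    """Setup, then worker passes (longest one if parallel, their sum if not), then merge."""
    worker_time = max(workers) if parallel else sum(workers)
    return setup + worker_time + merge


# Figures from the two-worker trace: 9 min of briefs, ~4 min per worker, 8 min of merge.
as_run = wall_clock(setup=9, workers=[4, 4], merge=8, parallel=False)
if_parallel = wall_clock(setup=9, workers=[4, 4], merge=8, parallel=True)

print(f"worker passes back to back: {as_run} min")
print(f"worker passes in parallel (hypothetical): {if_parallel} min")
```

Even the hypothetical parallel figure stays above the ten to fifteen minutes of the single run, because setup and merge do not shrink when the workers overlap.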

The lesson to land is not that one evaluation is right. It is that the split's value was the evidence — the conflict surfaced as two comparable proposals, the decision recorded, the briefs reusable next time these surfaces change — and the single run's value was speed. For a throwaway spec, take the speed. For a surface several people will touch next, the record may be worth twenty-five minutes. The decision map is how you make that call before the work, not after.

Narration for this slide

Let's evaluate one real task both ways. The task: specify a field-notes section, a repeated card plus the frame around it. Kept with one agent, it takes ten to fifteen minutes and comes back coherent, because one context decides everything, including how a recently reviewed note is marked. Split across two workers, it took about twenty-five minutes, eight of them in the merge review, and roughly twice the text. The deliberate overlap produced a real conflict: a yellow badge from one worker, a green stripe from the other. The merge resolved it against the design system and wrote the reason down. Both evaluations are defensible. The split bought evidence and a decision record; the single run bought speed. The map tells you which trade this task deserves.

Slide 10 of 12 · 16:9

If you do split: the artifacts that make it safe

Escalation is cheap because everything a split needs is a plain file. These are the files.

  • An orchestration brief: the user job, one-sentence ownership per worker, named overlaps, merge gates
  • A shared constraints file every brief points at — and no brief restates
  • One brief and one owned output file per worker, with explicit exclusions
  • A merge log: what was accepted, what conflicted, what was rejected and why, what nobody owned
  • A single-agent baseline or prior estimate, so the split can be judged against something

These artifacts are portable across every platform — and they are exactly what you hand back to one agent if the split stops paying.

Slide notes

This slide previews Module 2 deliberately, because the artifacts are also part of the decision: if your decomposition cannot be expressed as one owner per file with a one-sentence boundary, that is the map telling you the task is not separable yet. The orchestration brief is the parent contract — user job stated once, ownership boundaries, the overlaps you expect the merge to resolve, and the gates written before any output exists. The shared constraints file matters at team scale because pasting constraints into each brief is how drift starts: one brief gets edited, the others do not, and workers follow different rules without anyone deciding that.
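One way to picture the artifacts is as a scaffold of plain files. The sketch below is an assumption-heavy illustration: every file name, folder name, and heading is hypothetical, no platform requires this layout, and the two ownership sentences are example wording rather than the briefs used in the school's runs. The point it shows is that each worker brief points at the shared constraints file instead of restating it.

```python
import tempfile
from pathlib import Path


def scaffold(root: Path, workers: dict[str, str]) -> list[Path]:
    """Create a plain-file layout for a split; workers maps name -> one-sentence boundary."""
    files = {
        "orchestration-brief.md": "User job:\nNamed overlaps:\nMerge gates:\n",
        "shared-constraints.md": "Constraints every brief points at.\n",
        "merge-log.md": "Accepted:\nConflicts:\nRejected and why:\nUnowned gaps:\n",
    }
    for name, boundary in workers.items():
        # Each brief references the shared constraints rather than copying them.
        files[f"briefs/{name}.md"] = (
            f"Ownership: {boundary}\nConstraints: see shared-constraints.md\n"
        )
        files[f"outputs/{name}.md"] = ""  # one owned output file per worker
    created = []
    for relative, text in files.items():
        path = root / relative
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(text, encoding="utf-8")
        created.append(path)
    return created


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    created = scaffold(root, {
        "card": "Owns the field-note card spec; does not touch the section frame.",
        "frame": "Owns the section frame spec; does not touch the card internals.",
    })
    for path in sorted(created):
        print(path.relative_to(root).as_posix())
```

Because the result is only a folder of text files, the same scaffold can be handed to one agent if the split stops paying.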

The merge log is the artifact people skip and the one with the longest life. It records not just what was accepted but what was rejected and why — the next agent or human who wonders why recently reviewed notes are not green reads the answer instead of relitigating it. The gaps section matters too: decisions no brief assigned are decisions nobody made, and the honest move is to log them as open questions rather than have the orchestrator invent answers during the merge.
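A merge log can be as simple as three lists, as long as a rejection cannot be recorded without its reason. The structure below is a minimal sketch with hypothetical names; the accepted and rejected entries paraphrase the field-notes conflict described in this module, and the open question is an invented placeholder, since the notes do not say what the unowned gap was.

```python
from dataclasses import dataclass, field


@dataclass
class MergeLog:
    accepted: list[str] = field(default_factory=list)
    rejected: list[tuple[str, str]] = field(default_factory=list)  # (option, reason)
    open_questions: list[str] = field(default_factory=list)        # decisions no brief assigned

    def accept(self, option: str) -> None:
        self.accepted.append(option)

    def reject(self, option: str, reason: str) -> None:
        # A rejection without a reason is what gets relitigated later.
        if not reason.strip():
            raise ValueError("a rejection needs a recorded reason")
        self.rejected.append((option, reason))

    def log_gap(self, question: str) -> None:
        # Record the gap as open; the orchestrator does not invent an answer here.
        self.open_questions.append(question)


log = MergeLog()
log.accept("Yellow badge in the card's meta row for recently reviewed notes")
log.reject(
    "Green accent stripe for recently reviewed notes",
    "The design system reserves green for workflow cues",
)
log.log_gap("Placeholder: a decision that no brief assigned")

for option, reason in log.rejected:
    print(f"Rejected: {option}. Reason: {reason}.")
```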

The baseline is the quiet discipline: without a single-agent run or at least a prior estimate, nobody can say afterwards whether the orchestration bought anything, and the benefit gets asserted instead of measured. And the fallback path runs through the same files — if workers keep colliding mid-run, hand one agent the brief, the constraints, the surviving outputs, and the merge log so far, and let it finish in one context. The artifacts make both directions cheap.

Narration for this slide

If the map does route you to a split, the thing that makes it safe is a small set of plain files. An orchestration brief that states the user job once, gives each worker a one-sentence boundary, names the overlaps you expect, and writes the merge gates before any work exists. A shared constraints file every brief points at. One brief and one owned output file per worker. A merge log that records what was accepted, what conflicted, what was rejected and why, and what nobody owned. And a baseline, so you can tell afterwards whether the split bought anything. These files are portable across every platform — and if the split stops paying mid-run, they are exactly what you hand back to one agent to finish the job.

Slide 11 of 12 · 16:9

Exercise: classify three of your current tasks

Take three real tasks from your current backlog and run each one through the decision map. On paper — do not run any agents yet.

  • Pick three tasks of different sizes: one small, one medium, one that feels too big for a single sitting
  • For each, answer the three questions: one-sentence boundaries, no shared files or taste loop, worth the cost
  • For any task that fails, name the single-agent alternative you would try first
  • For any task that passes, write the one-sentence ownership boundary for each worker you would create
  • Note the deliberate overlap you would name in advance, and the conflict you expect it to produce

Keep the page. Module 2 turns the task that passed into a real decomposition with a pattern, briefs, and merge rules.

Slide notes

The exercise is deliberately analogue: three real tasks, one page, no agents. The spread across sizes matters, because the most common discovery is that the small and medium tasks fail the value question immediately, and the big one fails the boundary question on the first attempt — the boundaries people write for it are wishes, not ownership statements. Rewriting them until each is genuinely one sentence with exclusions is most of the learning.

The single-agent alternative line keeps the exercise honest. For tasks that fail the map, the answer should be specific: tighten the harness, run it longer with a reviewed plan, sequence the sessions, or add a read-only critique pass — not just 'use one agent'. For the task that passes, the deliberate-overlap question is the one that separates people who have absorbed the module from people who have skimmed it: naming where two workers will both need a position, and predicting the conflict, is what makes the eventual merge an expected comparison rather than a surprise.

If running this live, collect the boundary sentences and read a few aloud. The difference between 'Worker 1 owns the dashboard' and 'Worker 1 owns the queue table's fields, density, and states, and does not touch navigation, filters, or tokens' is the difference between a decomposition and a wish, and hearing both versions back to back lands it faster than any slide.
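For anyone who prefers to record the exercise in a structured form, the three questions reduce to a small routing function. This is an illustrative sketch only: the type, the field names, and the sample tasks are hypothetical, and the yes or no answers still come from human judgment on paper.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class TaskAnswers:
    name: str
    one_sentence_boundaries: bool        # can each worker's ownership be stated in one sentence?
    no_shared_files_or_taste_loop: bool  # do the slices avoid shared files and one taste loop?
    worth_coordination_cost: bool        # does the value repay setup plus the merge review?


def route(task: TaskAnswers) -> Tuple[str, List[str]]:
    """Return the route and the list of questions the task failed."""
    checks = {
        "boundaries": task.one_sentence_boundaries,
        "separation": task.no_shared_files_or_taste_loop,
        "value": task.worth_coordination_cost,
    }
    failed = [question for question, passed in checks.items() if not passed]
    # Any single "no" routes to one well-harnessed agent, the default.
    return ("split" if not failed else "single agent", failed)


# Hypothetical backlog entries.
small = TaskAnswers("rename button labels", True, True, False)
big = TaskAnswers("design-system audit", True, True, True)

print(route(small))  # ('single agent', ['value'])
print(route(big))    # ('split', [])
```

Returning the failed questions alongside the route keeps the exercise honest, because each failure points at a specific single-agent alternative to try first.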

Narration for this slide

Time to apply the map to your own backlog. Pick three real tasks: one small, one medium, and one that feels too big for a single sitting. Run each through the three questions — boundaries in one sentence, no shared files or taste loop, worth the coordination cost. For the ones that fail, and most will, name the single-agent alternative you would try first: a better harness, a longer planned run, sequenced sessions, or a critique pass. For the one that passes, write the one-sentence ownership boundary for each worker, and name the overlap you expect to cause a conflict. Keep the page. In Module 2 we turn that passing task into a real decomposition.

Slide 12 of 12 · 16:9

Summary, and the bridge to orchestration patterns

  • More agents is a cost: setup, duplicated context, divergent assumptions, multiplied review
  • Tasks split when boundaries are real — independent zones, stages, or expertise, one sentence each
  • They do not split for coupled taste, shared files, unclear ownership, or small tasks
  • Exhaust the single-agent alternatives first: harness, longer runs, sequenced sessions, a critique pass
  • At small scale a split buys evidence and decision records, not speed — pay the tax on purpose

Module 2 assumes the split passed the map, and covers the three patterns that carry most real cases: fan-out, pipelines, and critic pairs.

Slide notes

Recap the module as a single decision discipline rather than five separate facts. The coordination tax explains why the default is one agent; the split signals and the do-not-split signals are the two halves of the boundary test; the single-agent alternatives are what you exhaust before paying the tax; and the cost accounting is what keeps the decision honest afterwards. If participants remember only the three questions on the decision map, the module has done its job.

Be explicit that nothing in this module argued against the rest of the course. The workflows the school publishes — design-system audits at scale, multi-direction concept exploration, canvas-to-production pipelines — are exactly the tasks that pass the map, and they are why the orchestration skill is worth building. The point of the gate is to protect those cases from being diluted by a habit of splitting everything.

Preview Module 2 concretely: it assumes the decision has been made and covers the three patterns that carry most real cases — fan-out for independent parallel zones with one merge point, pipelines for staged work where each output feeds the next, and critic pairs where one agent produces and another reviews against explicit criteria. It also introduces the orchestrator role properly: the brief, the boundaries, the merge log, and the stop conditions. The exercise page from this module is the input — the task that passed the map is the one participants will sketch a pattern for next.

Narration for this slide

Let's close the module. More agents is a cost: setup, duplicated context, divergent assumptions, and a merge review you cannot skip — and the bill arrives whether or not the split helped. Tasks split when the boundaries are real: independent zones, stages, or expertise, each ownable in one sentence. They do not split for coupled taste, shared files, unclear ownership, or small tasks. Exhaust the single-agent alternatives first, and when you do split, expect to buy evidence and decision records rather than speed — at least at small scale. In Module 2 we assume the split passed the map, and cover the three patterns that carry most real cases: fan-out, pipelines, and critic pairs. See you there.

Module transcript
Module 1, narrated slide by slide

Slide 1 · When One Agent Is Not Enough

Welcome to Orchestrating Design Agent Teams. Before we talk about patterns, pipelines, or shared canvases, we need to settle the question the whole course hinges on: when is one agent genuinely not enough? The honest answer is — less often than you would think. Multi-agent setups carry a real coordination tax, and most design tasks do not repay it. So this first module is a decision module. We will name the costs people underestimate, identify the signals that a task actually splits, and walk the single-agent alternatives you should exhaust first. If you finish this module deciding to keep most of your work with one agent, that is the module working.

Slide 2 · The coordination tax

Let's start with the costs, because they are the part people skip. When you split work across agents, you do not just multiply the output — you multiply the work around the output. You write briefs and constraints before anything starts. Every worker re-reads the design system and the project from scratch, which is where the token bill comes from. Workers that cannot see each other's decisions make conflicting assumptions — that is not a risk, it is the default. And at the end there is a merge review you cannot skip, plus debugging that now spans separate histories. Here is the uncomfortable part: that bill arrives whether or not the split helped. So before you pay it, you check the price list.

Slide 3 · Signals a task genuinely splits

So when does a task genuinely split? Not when it sounds big — when it has real boundaries. Independent zones: pages, bands, dashboard panels, canvas regions that do not share files. Independent stages: research feeding a spec, a spec feeding implementation, implementation feeding QA. Distinct expertise: a builder, plus a read-only reviewer that checks the work without touching it. The test that settles most cases is simple: can you state each worker's ownership in one sentence, including what it must not touch? If you can, delegation is safe. If you cannot, the merge will eat whatever time the split saved. Boundary quality is the whole decision.

Slide 4 · Signals it does not split

Now the other side — the signals that a task should stay together, even when more agents feel tempting. Tightly coupled taste work is the big one: a polish pass where spacing, tone, and hierarchy are tuned against each other has no clean boundary to hand a worker, and a split version usually comes back less coherent than it started. One shared file or component means two writers on the same surface, and that merge becomes a rescue. If you cannot say who owns a decision, more agents just means more outputs nobody owns. Small tasks cannot repay the setup time. And stages that wait on each other pay the coordination cost without getting any parallel speed back.

Slide 5 · Single-agent alternatives to exhaust first

Before you split anything, exhaust the single-agent alternatives, because most of them fix the actual problem. If the output is inconsistent, that is harness debt — tighten the design rules and the anti-pattern list, and every future run improves. If the task feels too big, run it longer with a reviewed plan, or run it across sequenced sessions where each sitting picks up the last one's artifacts. If what you are missing is a second opinion, add a read-only critique pass against written criteria — that is a crit, not an orchestra. And remember the escape hatch: if the task later grows real boundaries, escalating is cheap, because everything you would need is just files.

Slide 6 · The decision map

Here is the whole module on one diagram. Three questions sit in the middle, and you ask them before any worker exists. Can you state each worker's ownership in one sentence? Do the slices avoid sharing files, components, or one taste loop? And is the task valuable enough to pay the coordination cost plus a merge review you cannot skip? Any no routes you left, to one well-harnessed agent — the default. All three yeses route you right, to a genuine split. Notice that both sides list costs. The single-agent path gives up parallelism and an independent reviewer. The multi-agent path pays in setup, tokens, and the merge. Most tasks fail at least one question, and that is the map doing its job.

Slide 7 · Honest cost accounting from real runs

Let's put numbers on the tax, from runs executed for this school's own articles. A two-worker spec run took about twenty-five minutes, nine of them writing briefs, and roughly doubled the text written — as a one-off, it did not pay for itself in speed. A five-surface orchestrated run took about seventy-seven minutes as executed, against a twenty-eight minute single-agent baseline on the same brief. So what did the extra cost buy? Not volume — the baseline covered everything. It bought two conflicts surfaced and resolved before implementation, an independent QA pass, and a written record of what was rejected and why. And the published figures point the same way: Anthropic reports multi-agent research systems using around fifteen times the tokens of a single chat. The tax is real. Pay it on purpose or not at all.

Slide 8 · What the platforms themselves recommend

Here is something worth noticing: the vendors selling multi-agent surfaces publish the same caution this module does. As of June 2026, Anthropic's own Agent Teams documentation says sequential and same-file work belongs in a single session or with subagents, recommends file-level ownership per teammate, and suggests three to five focused workers rather than a swarm. Codex, OpenCode, and OpenPencil all express the same operating model — bounded workers, owned outputs, and a merge. And the strongest published counter-position, from Cognition, argues that write-heavy coupled work does better with one agent holding the full context. The decision map you just saw is not contrarian. It is the consensus, written down for design work.

Slide 9 · Worked example: one task, evaluated both ways

Let's evaluate one real task both ways. The task: specify a field-notes section — a repeated card, plus the frame around it. Kept with one agent, it takes ten to fifteen minutes and comes back coherent, because one context decides everything, including how a recently reviewed note is marked. Split across two workers, it took about twenty-five minutes, twice the text, and an eight-minute merge — and the deliberate overlap produced a real conflict: a yellow badge from one worker, a green stripe from the other. The merge resolved it against the design system and wrote the reason down. Both evaluations are defensible. The split bought evidence and a decision record; the single run bought speed. The map tells you which trade this task deserves.

Slide 10 · If you do split: the artifacts that make it safe

If the map does route you to a split, the thing that makes it safe is a small set of plain files. An orchestration brief that states the user job once, gives each worker a one-sentence boundary, names the overlaps you expect, and writes the merge gates before any work exists. A shared constraints file every brief points at. One brief and one owned output file per worker. A merge log that records what was accepted, what conflicted, what was rejected and why, and what nobody owned. And a baseline, so you can tell afterwards whether the split bought anything. These files are portable across every platform — and if the split stops paying mid-run, they are exactly what you hand back to one agent to finish the job.

Slide 11 · Exercise: classify three of your current tasks

Time to apply the map to your own backlog. Pick three real tasks: one small, one medium, and one that feels too big for a single sitting. Run each through the three questions — boundaries in one sentence, no shared files or taste loop, worth the coordination cost. For the ones that fail, and most will, name the single-agent alternative you would try first: a better harness, a longer planned run, sequenced sessions, or a critique pass. For the one that passes, write the one-sentence ownership boundary for each worker, and name the overlap you expect to cause a conflict. Keep the page. In Module 2 we turn that passing task into a real decomposition.

Slide 12 · Summary, and the bridge to orchestration patterns

Let's close the module. More agents is a cost: setup, duplicated context, divergent assumptions, and a merge review you cannot skip — and the bill arrives whether or not the split helped. Tasks split when the boundaries are real: independent zones, stages, or expertise, each ownable in one sentence. They do not split for coupled taste, shared files, unclear ownership, or small tasks. Exhaust the single-agent alternatives first, and when you do split, expect to buy evidence and decision records rather than speed — at least at small scale. In Module 2 we assume the split passed the map, and cover the three patterns that carry most real cases: fan-out, pipelines, and critic pairs. See you there.