Agentic Design School
Module 2 of 6
40–50 minutes

Agentic Design Fundamentals

The Designer–Agent Loop

The loop from Module 1 slowed down to working speed: what each step looks like in a real session, how to treat the agent as a junior design partner rather than a vending machine, and where the loop most often breaks.

Duration: 40–50 minutes

Slides: 13 slides with notes and narration

Learning objectives

  • Walk each loop step — brief, generate, critique, revise, ship — as a concrete session activity.
  • Use plan mode and review gates so corrections happen before code exists.
  • Apply the show-early, iterate-small habit to keep agent runs reviewable.
  • Recognise the three most common loop failures and name the fix for each.
Slide deck

Work through the module

Each slide is shown in its 16:9 frame, exactly as it appears in the video version. Open the notes under any slide for the longer explanation, and the narration if you prefer to read along.

Slide 1 of 13 · 16:9

The Designer–Agent Loop

Agentic Design Fundamentals · Module 2 of 6

  • The loop as a daily working rhythm, not a diagram on a slide
  • What each step actually produces in a real session
  • Plan mode and review gates as the platform-enforced version of Module 1's gates
  • Where the loop breaks, and the fix for each break

Module 1 named the loop. This module slows it down to the speed of an actual session — minutes, files, and decisions.

Slide notes

Module 1 ended with the loop diagram: brief, plan, review gate, generate, critique, revise, ship. This module takes that diagram and runs it at working speed. The risk with a diagram is that it stays a diagram — people nod at it and then go back to typing one-line requests. The goal here is to make each box concrete enough that participants can picture what they would actually be doing at 10:14 on a Tuesday.

Set the frame early: this is a rhythm, not a ceremony. A full loop on a bounded task is twenty to forty minutes, and during the agent-run stretches the designer can do something else. The two gates, the plan review and the critique, are minutes each. If the loop in someone's head feels like a process document with sign-offs, they have the wrong picture, and they will skip it the first time they are busy.

Flag what this module does not cover. It does not teach how to write the brief in detail — that is Module 3 — and it does not cover the harness that makes the loop repeatable across sessions, which is Module 4. Here the brief and harness are treated as inputs that exist; the focus is on what happens between them and the ship decision.

Narration for this slide

Welcome back. In Module 1 we drew the loop: brief, plan, review gate, generate, critique, revise, ship. In this module we slow it down to working speed. Not the diagram — the actual session. What you type, what comes back, how long each step takes, where you have to pay attention and where you can walk away. We will look at each step as a concrete activity, see how plan modes enforce the gates across the four platforms, trace one real session end to end with timestamps, and then look honestly at the three places the loop most often breaks. By the end, the loop should feel like a rhythm you could start tomorrow.

Slide 2 of 13 · 16:9

The agent as a junior designer

The most useful mental model: a fast, literal design partner with no taste memory.

  • Fast — it can inspect files, generate variants, and run checks in minutes
  • Literal — it does what the words say, not what you meant by them
  • No taste memory — yesterday's corrections are gone unless they live in a file
  • Eager — it would rather produce something plausible than ask a question

You would not hand a junior a one-line request and judge them on the result. The agent deserves the same working agreement — and needs it more.

Slide notes

This mental model comes straight from the school's briefing article: treat the agent like a fast design partner with weak taste memory. Each property on the slide has a practical consequence. Fast means the cost of a wrong run is low — minutes, not days — which is why showing early and iterating small works. Literal means the gap between what you said and what you meant is the agent's whole world; it cannot read the product's politics, the reason the screen exists, or the level of polish a stakeholder review needs. No taste memory is the one designers underestimate most: the correction you made yesterday does not exist today unless it was written into the harness or the brief, so the same generic pattern will come back.

Eager is the property that makes the gates necessary rather than nice. Given a vague request, the agent fills the gaps with the most common pattern it has seen — oversized hero, rounded cards, marketing copy — and presents it confidently. It will rarely stop and ask whether that is what you wanted unless the brief explicitly makes the questions part of the deliverable.

The junior-designer framing also sets the right tone for the relationship. You do not abdicate to a junior, and you do not micromanage every keystroke either. You brief them, you review their plan, you critique their work against criteria you stated up front, and you decide what ships. That is the loop, described as management rather than as process.

Narration for this slide

Before we walk the steps, fix the mental model. The agent is best treated as a junior design partner: fast, literal, eager, and with no taste memory. Fast means a wrong run costs minutes, not days. Literal means it does what your words say, not what you meant — it cannot infer the politics or the polish level. No taste memory means yesterday's correction is gone unless it lives in a file. And eager means it would rather hand you something plausible than ask a clarifying question. You would never give a junior a one-line request and then judge them on the output. The agent needs the same working agreement, and the loop is that agreement.

Slide 3 of 13 · 16:9

Step 1 — Brief: what the step produces

The brief is a short written artifact, not a long chat message. It carries what is specific to this task and points at the harness for everything durable.

  • Facts the agent needs: situation, user job, audience, files and components to reuse
  • Standards it must judge itself against: direction, constraints, review criteria
  • The output shape: prototype, component, audit, or production code — and where it lands
  • The closing instruction: plan first, restate the task, do not build until approved

A brief has two jobs: give the agent the facts to work inside the project, and the standards to judge its own output.

Slide notes

Keep this slide at the level of what the step produces, because Module 3 spends a full session on how to write it well. The brief is a written artifact — usually under a page — that carries everything specific to this task: the user job, the audience, the direction for this surface, the constraints that bite here, the output shape, and the review criteria. It explicitly does not restate the design system; durable rules live in the harness, and the brief points at them.

The two-jobs framing from the briefing article is worth saying out loud: facts and standards. Facts are what the agent needs to work inside the project — which page, which components, which tokens, which files. Standards are what it must use to judge its own output — what counts as a bad answer, what the review will check, which anti-patterns are off the table. Most weak briefs supply some facts and no standards, which is why the output looks plausible and fails review.

The last bullet is the bridge to the next two slides. Ending the brief with plan first and a request to restate the task is what turns the brief into the start of a loop rather than a fire-and-forget request. The restatement alone catches a surprising share of misunderstandings, because the agent has to show you what it thinks the job is before it spends any time on it.
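The facts-and-standards split can be made visible with a small typed sketch. Everything here is illustrative: the `Brief` shape, its field names, the `missingParts` helper, and the sample values are assumptions for this example, not a schema any platform defines.

```typescript
// Hypothetical shape for a task brief. Field names are illustrative only.
interface Brief {
  // Facts: what the agent needs to work inside the project.
  facts: {
    situation: string;
    userJob: string;
    audience: string;
    reuse: string[]; // files and components to reuse
  };
  // Standards: what the agent must judge its own output against.
  standards: {
    direction: string;
    constraints: string[];
    reviewCriteria: string[];
  };
  outputShape: "prototype" | "component" | "audit" | "production code";
  landsIn: string; // where the output should be written
  closingInstruction: string;
}

// Lists the parts left empty, so a facts-only brief is visible before it is sent.
function missingParts(brief: Brief): string[] {
  const missing: string[] = [];
  if (brief.facts.userJob.trim() === "") missing.push("user job");
  if (brief.facts.reuse.length === 0) missing.push("components to reuse");
  if (brief.standards.direction.trim() === "") missing.push("direction");
  if (brief.standards.constraints.length === 0) missing.push("constraints");
  if (brief.standards.reviewCriteria.length === 0) missing.push("review criteria");
  if (brief.closingInstruction.trim() === "") missing.push("closing instruction");
  return missing;
}

// A typical weak brief: some facts, no standards, no closing instruction.
// All values are placeholders invented for the example.
const weakBrief: Brief = {
  facts: {
    situation: "The field-notes page needs an anatomy strip",
    userJob: "Understand the parts of a field note at a glance",
    audience: "First-time readers",
    reuse: ["SectionBand", "SectionHeading"],
  },
  standards: { direction: "", constraints: [], reviewCriteria: [] },
  outputShape: "component",
  landsIn: "scratch/",
  closingInstruction: "",
};

const gaps = missingParts(weakBrief);
```

Here `gaps` lists the direction, the constraints, the review criteria, and the closing instruction: the brief supplies the facts and none of the standards.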

Narration for this slide

Step one: the brief. In session terms it is a short written artifact, not a long chat message — usually under a page, often seven lines. It does two jobs. It gives the agent the facts it needs to work inside your project: the page, the user job, the audience, the components and tokens to reuse. And it gives the agent the standards to judge its own output: the direction, the constraints, and the review criteria you will hold it to. It does not restate your design system — that lives in the harness. And it ends with one instruction that changes everything downstream: plan first, restate the task, do not build until I approve.

Slide 4 of 13 · 16:9

The plan gate: how the four platforms enforce it

Plan-before-build used to be a prompting habit. It is now a documented, enforceable mode in each of the four CLI agents covered here.

Platform | How the gate works | What the designer does
Claude Code | Plan mode is a read-only permission mode; the plan is presented for approval before execution | Switch to plan mode, paste the brief, review and approve
Codex CLI | Plan mode gathers context and asks clarifying questions; PLANS.md keeps a living plan beside long work | Review the plan or the PLANS.md file before letting it implement
OpenCode | A built-in read-only Plan agent runs alongside the Build agent | Run the brief through the Plan agent, then hand the approved plan to Build
Gemini CLI | Plan Mode keeps the session read-only during analysis, then requires explicit approval | Approve the finished plan before anything is implemented

The platform — not your prompt — keeps the agent read-only until you approve the approach. Let the tool enforce the gate.

Slide notes

The point of this slide is that the review gate from Module 1 is not a discipline you have to maintain by willpower. Every major CLI agent now ships a mode that keeps the session read-only until the plan is approved: Claude Code's plan mode is a permission mode, Codex CLI documents a plan mode plus the PLANS.md pattern of keeping the plan as a reviewable file, OpenCode makes planning a separate read-only agent role, and Gemini CLI's Plan Mode requires explicit approval of the finished plan. The capability is stable even though entry points and shortcuts change; the school's briefing article keeps the documentation links current.

What the designer reviews at this gate is also worth restating, because it is the cheapest correction point in the whole loop. A good plan restates the user job in a sentence, names the sections and hierarchy, names the components it will reuse, lists the states and checks, and surfaces assumptions and contradictions. You are not reviewing code — there is none. You are reviewing the agent's judgment about structure and intent before that judgment becomes work.

Give the honest caveat early: a plan previews judgment, it does not bind the implementation. In the field-notes run from Module 1, the approved plan named the right components and the implementation still invented a prop the component does not have. The type check caught it, not the plan. That is why the loop has a second gate.
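A minimal sketch of why the type check, rather than the plan, is the thing that catches an invented prop. The `SectionHeadingProps` shape and the `describeHeading` function are hypothetical stand-ins; the source does not show the real SectionHeading API, only that it has a `label` prop and no `badge` prop.

```typescript
// Hypothetical props for a heading component. Only `label` (exists) and
// `badge` (does not exist) come from the course material; the rest is assumed.
interface SectionHeadingProps {
  text: string;
  label?: string;
}

// Stand-in for rendering: returns a string so the example needs no UI library.
function describeHeading(props: SectionHeadingProps): string {
  return props.label ? `[${props.label}] ${props.text}` : props.text;
}

// Uses the prop that exists, so it type-checks.
const labelled = describeHeading({ text: "Anatomy of a field note", label: "New" });

// @ts-expect-error `badge` is not a declared prop, so the compiler rejects this call.
const rejected = describeHeading({ text: "Anatomy of a field note", badge: "New" });
```

The `@ts-expect-error` comment is there only so the example compiles while still showing the rejected call. A plan can name SectionHeading correctly and the implementation can still pass it `badge`; nothing in prose review surfaces that, while the compiler does so on the first check.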

Narration for this slide

The plan gate used to be a habit you had to remember. Now it is a feature. Claude Code has plan mode as a read-only permission mode. Codex CLI has plan mode plus the PLANS.md pattern, where the plan lives as a file beside the work. OpenCode gives you a separate read-only Plan agent. Gemini CLI's Plan Mode keeps the session read-only and asks for explicit approval. In every case, the platform — not your prompt — stops the agent from building until you approve. What you review is the agent's judgment: did it restate the right job, the right structure, the right components? One caution: an approved plan does not guarantee the artifact. That is what the second gate is for.

Slide 5 of 13 · 16:9

Step 2 — Generate: scoping the run

Generation is the step the agent runs mostly unattended. Your work happens before it starts: deciding the scope and where the output lands.

  • One run, one bounded outcome — a section, a component, an audit, not a redesign
  • Output lands on a scratch branch or scratch folder, never directly on main
  • The output shape was named in the brief: code, prototype, report, or comparison
  • The agent reuses existing components and tokens before inventing anything new

If you cannot say in one sentence what the run should produce, the run is too big. Split it before you start it.

Slide notes

Generation is the step people picture when they imagine agentic work, and it is the step where the designer does the least. The skill is in the setup. The first decision is scope: one run should produce one bounded outcome you could name in a sentence, such as the anatomy strip for the field-notes page, the empty state for the upload flow, or an audit of the settings forms. Runs scoped as "redesign the dashboard" produce sprawling changes that nobody can review honestly, and the review gate quietly dies because reading the whole diff is too expensive.

The second decision is where the output lands. Scratch branches and scratch folders are not bureaucracy; they are what makes the rest of the loop honest. If the work lands somewhere disposable, critique can be ruthless and a bad run costs nothing but the minutes it took. In the field-notes case study, both runs were written to a scratch folder and deleted afterwards — whether the strip ever ships is an editorial decision, not a by-product of running the loop.

The last bullet, reusing existing components, names the single highest-leverage constraint you can put on a generation run, and it usually lives in the harness rather than the brief. An agent told nothing will build new primitives; an agent pointed at SectionBand, SectionHeading, and the token file will mostly stay inside them. When it does not, the audit catches it, and the audit is part of the next step.

Narration for this slide

Step two: generate. This is the part the agent does mostly on its own, so your work happens before it starts. Two decisions matter. First, scope: one run, one bounded outcome — a section, a component, an audit. If you cannot say in one sentence what the run should produce, split it. Second, where the output lands: a scratch branch or a scratch folder, never directly on main. That is what lets you critique honestly and throw away a bad run without ceremony. The brief already named the output shape and the components to reuse, so while the agent works, you do not need to watch every keystroke. You need to be ready to look hard at what comes back.

Slide 6 of 13 · 16:9

Step 3 — Critique: criteria, checks, and judgment

Critique is not a vibe check. It compares the artifact against the criteria the brief stated before generation.

  • Start from the review criteria written in the brief — not from general taste language
  • Run the executable checks first: token audit, type check, verify script
  • Use screenshot evidence at desktop and mobile widths, with real content lengths
  • Findings get severity — blocker, important, polish, question — not a flat list
  • Keep critique read-only: findings first, fixes only after you approve them

The agent owns inspection and consistency checking. You own judgment about hierarchy, tone, and whether the thing serves the user job.

Slide notes

This step gets a full module later in the course, so keep it to the working shape. Critique starts from the review criteria the brief stated before anything was generated — that is the critique contract. Asking an agent for general feedback gets you general taste language: clean, modern, needs polish. Asking it to inspect the artifact against a named user job, with screenshots and the design files as evidence, gets you findings you can act on.

The order matters. Executable checks run first because they are free and unambiguous: the token audit, the type check, the project's verify script. In the field-notes runs, those checks did real work — the audit caught ten hardcoded colours in the weak run, the type check caught the invented prop in the briefed run. Then comes the evidence-based review: screenshots at desktop and mobile widths, with content lengths that are realistic rather than convenient. Then the human judgment that no check covers: does the hierarchy read in the right order, does the copy hold the product's voice, does the section actually serve the job the brief named.
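As a rough illustration of what a token audit does, the sketch below scans source text for hardcoded hex colours. It is not the school's audit script: the function name, the regular expression, and the `--color-text-primary` token are assumptions made for the example, and a real audit would cover more than colour literals.

```typescript
// One flagged colour literal and the line it was found on.
interface HexFinding {
  line: number;
  value: string;
}

// Matches #rgb, #rgba, #rrggbb, and #rrggbbaa. It can also match an id selector
// such as "#fade", so treat hits as candidates for review rather than verdicts.
const hexPattern = /#(?:[0-9a-fA-F]{8}|[0-9a-fA-F]{6}|[0-9a-fA-F]{3,4})\b/g;

function findHardcodedHex(source: string): HexFinding[] {
  const findings: HexFinding[] = [];
  source.split("\n").forEach((text, index) => {
    const matches: string[] = text.match(hexPattern) || [];
    matches.forEach((value) => findings.push({ line: index + 1, value }));
  });
  return findings;
}

// Sample stylesheet invented for the example; the token name is hypothetical.
const sampleCss = [
  ".strip { color: #1a2b3c; }",
  ".strip__title { color: var(--color-text-primary); }",
  ".strip__rule { border-color: #fff; }",
].join("\n");

const hexFindings = findHardcodedHex(sampleCss);
```

On the sample, `hexFindings` holds two entries, `#1a2b3c` on line 1 and `#fff` on line 3, while the token reference on line 2 passes. The check is cheap and unambiguous, which is the reason to run it before any judgment-based review.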

Two working rules keep critique useful. Findings carry severity — blocker, important, polish, question — so you triage instead of wading through an undifferentiated list. And critique stays read-only: the agent reports findings, you decide which ones matter, and only the approved findings go into the revision pass. An agent that critiques and fixes in the same breath takes the decision away from you.
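The two rules can be sketched as data plus two small functions. This is an assumed representation, not a format any of the tools emit: the `Finding` shape and the sample findings are invented, and treating the slide's order (blocker, important, polish, question) as a priority ranking is itself an assumption.

```typescript
type Severity = "blocker" | "important" | "polish" | "question";

interface Finding {
  summary: string;
  severity: Severity;
  approved: boolean; // set by the designer during review, never by the agent
}

// Assumed priority: the order in which the slide lists the severities.
const severityOrder: Severity[] = ["blocker", "important", "polish", "question"];

// Returns a sorted copy so the original critique report stays untouched.
function triage(findings: Finding[]): Finding[] {
  return findings
    .slice()
    .sort((a, b) => severityOrder.indexOf(a.severity) - severityOrder.indexOf(b.severity));
}

// The revision pass sees approved findings only.
function revisionScope(findings: Finding[]): Finding[] {
  return triage(findings).filter((finding) => finding.approved);
}

// Illustrative findings, deliberately out of order.
const report: Finding[] = [
  { summary: "Consider a bolder accent colour", severity: "polish", approved: false },
  { summary: "Step copy runs to three lines on mobile", severity: "important", approved: true },
  { summary: "Should the strip link out for depth?", severity: "question", approved: false },
  { summary: "badge prop does not exist on SectionHeading", severity: "blocker", approved: true },
];

const nextRevision = revisionScope(report);
```

`nextRevision` contains the blocker followed by the important finding. The unapproved polish item and the open question stay in the report but never reach the agent, which keeps the decision with the designer.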

Narration for this slide

Step three: critique, and it is not a vibe check. You compare the artifact against the criteria the brief wrote down before anything existed. Run the executable checks first — the token audit, the type check, the verify script — because they are free and they catch real problems. Then look at screenshot evidence: desktop, mobile, real content lengths. Then apply the judgment no check covers: hierarchy, tone, whether the thing serves the user job. Two rules keep this useful. Every finding gets a severity — blocker, important, polish, or question. And critique stays read-only: findings first, fixes only after you approve them. The agent inspects. You judge.

Slide 7 of 13 · 16:9

Step 4 — Revise: what the agent can act on

Revision quality tracks feedback quality. Some feedback is executable; some needs a human decision before the agent can do anything useful with it.

Feedback the agent can act on | Feedback that needs a human decision first
"Move the plan summary above the payment form on mobile" | "This doesn't feel trustworthy enough"
"Replace the hardcoded hex values with the semantic tokens the audit flagged" | "Maybe we should rethink the whole checkout flow"
"Use the label prop on SectionHeading; the badge prop does not exist" | "Legal might not like how we describe the renewal"
"Keep each step to one line; link out for depth" | "Which of these two layouts is more on-brand?"
"Fix only the approved findings; do not introduce a new direction" | "Is this even the right feature to build?"

Translate judgment into instruction before handing it back. The agent executes decisions; it does not make them for you.

Slide notes

Revision is where the loop either converges or thrashes, and the difference is almost always the feedback. The left column is feedback the agent can execute directly: it names a location, a change, and implicitly a way to verify it. The right column is real feedback too — often the more important kind — but it contains an unmade decision. Handing it to the agent unprocessed produces one of two failure modes: the agent guesses at the decision and you get a new direction you did not ask for, or it makes a token gesture at the words and nothing meaningful changes.

The working habit is translation. When your reaction is "this does not feel trustworthy", the design work is figuring out why (the price is hidden below the fold, the renewal terms are vague, the error state is a dead end) and then handing the agent the specific, checkable version. That translation is not overhead; it is the same skill as writing good critique for a human team, and it is exactly the judgment work that stays with the designer.

The other discipline at this step is scope. The revision pass acts only on approved findings, does not address rejected ones, and does not introduce a new visual direction. The critique article's revision prompt makes this explicit, and it is worth borrowing verbatim: list what changed and what still needs review, then recapture the same screenshots used in critique so the next look is a comparison rather than a fresh impression. Recurring fixes — the kind you have now typed three sessions in a row — are a signal that a rule belongs in the harness, which is where Module 4 picks up.

Narration for this slide

Step four: revise. The quality of this step is set entirely by the quality of your feedback. Look at the two columns. On the left, feedback the agent can act on: move this above that, replace these hex values with tokens, use the prop that actually exists. On the right, feedback that hides an unmade decision: it doesn't feel trustworthy, maybe rethink the flow, is this on-brand? The agent cannot make those decisions — if you hand them over raw, it will guess. So translate: figure out why it does not feel trustworthy, then hand over the specific change. Keep the revision scoped to approved findings only. And when you notice the same fix recurring, that is a rule asking to live in the harness.

Slide 8 of 13 · 16:9

Step 5 — Ship: the decision and the record

Ship is a human decision, made against the brief, with the evidence in front of you. It is also the moment to record what the loop learned.

  • Passing checks are necessary, not sufficient — the decision is yours, not the audit's
  • Decide: ship it, hold it, or send it back for one more bounded round
  • Record what shipped, what was cut, and why — a sentence or two beside the work
  • Move recurring feedback into the harness so the next loop starts smarter

The loop does not end at merge. It ends when what you learned is written somewhere the next session can read.

Slide notes

Ship is deliberately the shortest step and the one with the clearest owner. Passing checks are the entry condition, not the decision. The decision is whether this artifact, with this evidence, against this brief, should go in front of users or stakeholders with your name attached. Sometimes the honest answer is hold: the work is fine but the timing or the surrounding product is not. Sometimes it is one more bounded round, with a specific finding, not a general wish for better.

The recording habit is what separates teams that get compounding value from the loop from teams that get a series of one-off wins. Two things are worth writing down, and neither takes more than a few minutes. First, a short note beside the work: what shipped, what was cut, and why — the kind of trail the critique loop naturally produces if you kept findings and severities. Second, and more important, the recurring feedback gets promoted into the harness: the anti-pattern you have rejected three times, the component the agent keeps forgetting exists, the tone rule you keep restating. That promotion is the dashed line on the loop diagram, and it is the mechanism by which next month's first drafts get better than this month's.
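One way to spot promotion candidates is to count how many separate sessions needed the same correction. The sketch below assumes you keep a simple log of corrections; the `Correction` shape, the session labels, the rule wording, and the threshold of three are all assumptions for illustration.

```typescript
// One correction typed during a session. Both fields are free text.
interface Correction {
  session: string;
  rule: string;
}

// Returns the rules that were needed in at least `threshold` distinct sessions.
function rulesToPromote(log: Correction[], threshold: number): string[] {
  // Prototype-free object so rule text like "constructor" cannot collide with built-ins.
  const sessionsByRule: Record<string, string[]> = Object.create(null);
  log.forEach((entry) => {
    const seen = sessionsByRule[entry.rule] || [];
    // Count each session once, however often the rule was repeated inside it.
    if (seen.indexOf(entry.session) === -1) seen.push(entry.session);
    sessionsByRule[entry.rule] = seen;
  });
  return Object.keys(sessionsByRule).filter(
    (rule) => (sessionsByRule[rule] || []).length >= threshold
  );
}

// Illustrative log: one anti-pattern rejected in three sessions, one reminder given once.
const correctionLog: Correction[] = [
  { session: "mon", rule: "No oversized hero on documentation pages" },
  { session: "tue", rule: "No oversized hero on documentation pages" },
  { session: "tue", rule: "Reuse SectionHeading instead of writing a new heading" },
  { session: "wed", rule: "No oversized hero on documentation pages" },
  { session: "wed", rule: "No oversized hero on documentation pages" },
];

const promote = rulesToPromote(correctionLog, 3);
```

With a threshold of three, `promote` contains only the hero rule. Counting distinct sessions rather than raw repetitions matters: a rule restated twice in one session is a sign of a noisy run, while a rule restated across sessions is a sign it belongs in the harness.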

If nothing else from this slide survives, keep the accountability point. The agent generated the work, the checks verified parts of it, and the decision to ship it is still a person's. That was true in Module 1 as a principle; here it is true as the literal last step of every session.

Narration for this slide

Step five: ship. The checks passing gets you to the decision; it is not the decision. You look at the artifact, the evidence, and the brief, and you choose: ship it, hold it, or send it back for one more bounded round. Then two minutes of recording. Write a short note beside the work — what shipped, what was cut, why. And promote the recurring feedback into the harness: the anti-pattern you rejected for the third time, the component the agent keeps forgetting. That is the dashed line on the loop diagram, and it is why next month's first drafts are better than this month's. The loop ends when the learning is written down, not when the code merges.

Slide 9 of 13 · 16:9

Show early, iterate small

Long unsupervised runs drift. Short runs with frequent checkpoints stay reviewable — and stay yours.

  • Drift compounds: a small early misread becomes a large late rework
  • Review effort is not linear: a 40-file change is not 4 times harder to review than a 10-file change; it is effectively unreviewable
  • Checkpoints are cheap for the agent: ask for the riskiest part first, or one section before the rest
  • Small iterations keep your taste in the work; long runs replace it with the agent's defaults

The question is not how much can the agent do in one run. It is how much can you honestly review at the next gate.

Slide notes

This slide names the habit that keeps the loop healthy at session scale: show early, iterate small. The failure it prevents is drift. An agent that runs unsupervised for a long stretch makes a chain of small judgment calls — a layout choice here, an invented label there, a component it decided to write rather than reuse — and each one is individually reasonable while the sum walks away from your intent. By the time you look, the rework is large enough that the temptation is to accept it as is, which is how generic work ships.

The second bullet is the structural reason this matters: review capacity, not generation capacity, is the bottleneck of the whole loop. The agent can produce a forty-file change faster than you can read it honestly. A change you cannot review is a change you either rubber-stamp or discard, and both outcomes waste the run. Sizing runs to what the next gate can actually absorb is the practical version of holding the gates.

The tactic is simple and the agent does not mind it: ask for the riskiest or most uncertain part first — the layout approach, one representative card, the empty state — look at it, then let the rest follow the corrected pattern. This is the same instinct as showing a stakeholder a direction before polishing all five screens. It costs a few minutes of latency per session and saves the forty-minute correction loops the Module 1 case study put numbers on.

Narration for this slide

Here is the habit that keeps the loop healthy: show early, iterate small. Long unsupervised runs drift — a chain of small, individually reasonable judgment calls that adds up to something that is not what you meant. And the bottleneck is not the agent's speed, it is your review capacity. A forty-file change is not four times harder to review than a ten-file change; it is effectively unreviewable, and unreviewable work either gets rubber-stamped or thrown away. So size the run to what you can honestly look at. Ask for the riskiest part first, correct it, and let the rest follow the corrected pattern. The agent does not mind. Your taste stays in the work.

Slide 10 of 13 · 16:9

Where the loop breaks

Three failures account for most of the bad sessions. Each has a specific fix.

  • Vague briefs — the agent fills the gaps with generic patterns; fix: write the user job and review criteria before you prompt
  • Skipped gates — fast, plausible output goes straight to stakeholders; fix: let plan mode and a scratch branch enforce the gates for you
  • Feedback that never reaches the harness — the same correction every session; fix: promote recurring fixes into rules at the ship step
  • The tell for all three: you are repeating yourself, round after round

Every break has the same shape: design work that should have happened once, early, paid for repeatedly, late.

Slide notes

These three failures are the ones the school sees most often in teams adopting agents, and each one maps to a specific point in the loop. Vague briefs break the loop at the start: the agent fills missing intent with the most common pattern on the public web — the marketing band, the gradient, the invented call-to-action from the Module 1 case study — and the session becomes a series of corrections that re-supply, one round at a time, the context the brief never carried. The fix is not a longer prompt; it is doing the design thinking — user job, direction, review criteria — before generation rather than during it.

Skipped gates break the loop in the middle. The output is fast and plausible, which is exactly what makes it dangerous unreviewed; the problems surface in front of stakeholders, or in production, where they cost the most. The fix is to make the gates structural rather than willpower-based: plan mode keeps the agent read-only until you approve, and a scratch branch means nothing reaches main without a deliberate decision. People skip rituals when busy; they skip enforced modes much less.

Feedback that never reaches the harness breaks the loop across sessions. Each individual session looks fine, but the same corrections recur because they live only in scrolled-away conversations — the agent has no taste memory. The fix is the recording habit from the ship step: when a correction shows up for the second or third time, it gets promoted into the harness or a skill. The shared tell across all three failures is repetition: if you notice you are typing the same thing again, the loop is leaking somewhere upstream of where you are standing.

Narration for this slide

Three failures account for most bad sessions. First, vague briefs: the agent fills the gaps with generic patterns, and you spend the session re-supplying context one correction at a time. The fix is writing the user job and review criteria before you prompt. Second, skipped gates: fast, plausible output goes straight to stakeholders, and the problems surface where they are most expensive. The fix is structural — plan mode and a scratch branch enforce the gates even when you are busy. Third, feedback that never reaches the harness: the same correction, every session, because the agent has no memory of yesterday. The fix is promoting recurring fixes into rules at the ship step. The tell for all three is the same: you are repeating yourself.

Slide 11 of 13 · 16:9

Worked trace: one loop session, end to end

The field-notes task from Module 1, traced as a session: roughly thirty minutes, two gates, one fix round.

  • 0:00–0:10 — brief written; most of the time is deciding what the section is for
  • 0:10–0:14 — plan reviewed; the agent flags a contradiction, one sentence resolves it
  • 0:14–0:21 — generation run on a scratch folder, reusing the named components
  • 0:21–0:26 — screenshot critique; audit and type check run, the type check finds an invented prop
  • 0:26–0:31 — scoped revision, then the ship decision and a note for the harness
Figure description: Timeline trace of one designer-agent loop session across six stages. The designer writes the brief in the first ten minutes, then reviews the agent's plan at a human gate where a contradiction in the brief is flagged and resolved. The agent runs generation in a scratch folder, the result passes through a screenshot critique at a second human gate where the audit and type check run and an invented component prop is caught, the agent runs a scoped revision, and the designer makes the ship decision and records what feeds back into the harness. Each stage is annotated with what the human does and what the agent does, with approximate timestamps from a roughly thirty-minute session.
Brief, the plan review, the screenshot critique, and the ship decision are human-led; generation and revision are agent-run. The timestamps come from the briefed field-notes run described in Module 1 — a real-shaped session, not a benchmark.

Of the thirty-one minutes, the agent worked for about ten. The rest was design thinking, review, and a decision — the parts that were always the job.

Slide notes

Walk the trace left to right and call out the ownership at each stage, because the diagram is really a time-and-attention budget. The brief takes the first ten minutes, and most of that is not typing — it is deciding what the section is for, which is design work the one-line request from the weak run simply skipped. The plan review is four minutes: the agent restated the user job, named the components it would reuse, and flagged a real contradiction in the brief; resolving it cost one written sentence, before any code existed.

Generation is about seven minutes of agent time on a scratch folder, during which the designer is not watching every keystroke. The screenshot critique is the second gate: the agent runs the token audit and the type check and captures desktop and mobile screenshots; the human judges hierarchy, tone, and states against the criteria the brief wrote down. This is where the invented component prop surfaced — a TS2322 the plan had no way to prevent — which is the concrete argument for keeping both gates rather than trusting the plan. The revision is three minutes and scoped to that one finding, and the ship decision plus the harness note closes the session at roughly half an hour.

Make the honest caveats explicit. These timestamps describe one bounded task on a small, well-harnessed site; a heavier task stretches every stage, especially the brief and the critique. And the numbers come from the school's traced run, not a benchmark — the proportions are the lesson, not the absolute minutes. Of the thirty-one minutes, agent time is about ten; the other twenty are thinking, reviewing, and deciding, which is a fair preview of what the job feels like when the loop is working.
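The proportions can be checked with a few lines of arithmetic. This sketch uses the stage durations from the narrated trace. The two-minute ship stage is an assumption: the trace gives no figure for it, so it is taken as the remainder of the thirty-one-minute total. Stage ownership follows the figure caption, so the critique counts as human-led even though the agent runs the checks inside it.

```python
# Stage durations in minutes, as narrated in the trace.
# Assumption: the ship stage is 2 minutes (31 total minus the 29 narrated).
STAGES = [
    ("brief", 10, "human"),
    ("plan review", 4, "human"),
    ("generation", 7, "agent"),
    ("screenshot critique", 5, "human"),
    ("revision", 3, "agent"),
    ("ship decision and harness note", 2, "human"),
]


def minutes_by_owner(stages):
    """Sum the minutes spent in stages led by each owner."""
    totals = {}
    for _name, minutes, owner in stages:
        totals[owner] = totals.get(owner, 0) + minutes
    return totals


totals = minutes_by_owner(STAGES)
total = sum(totals.values())
agent_share = totals["agent"] / total

print(totals)   # minutes per owner
print(total)    # session length in minutes
print(f"agent share: {agent_share:.0%}")
```

On these figures the agent accounts for roughly a third of the session, which matches the "about ten of thirty-one" framing on the slide.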

Narration for this slide

Let's trace one real-shaped session, the field-notes task from Module 1, with timestamps. Ten minutes writing the brief — most of it deciding what the section is for. Four minutes reviewing the plan, where the agent flagged a contradiction and one sentence resolved it. Seven minutes of generation on a scratch folder while you do something else. Five minutes of screenshot critique, where the audit and the type check ran and the type check caught an invented prop the plan never could have. Three minutes of scoped revision, then the ship decision and a note for the harness. About thirty minutes total — and the agent worked for ten of them. The rest was design thinking, review, and a decision. That was always the job.

Slide 12 of 13 · 16:9

Exercise: run your task through a plan-only session

Take the brief you sketched in Module 1 and run it through one plan-only session. Nothing gets built — the agent stays read-only the whole time.

  • Open your platform's plan mode — Claude Code plan mode, Codex plan mode, OpenCode's Plan agent, or Gemini CLI Plan Mode
  • Paste your Module 1 brief and ask for a plan plus a restatement of the user job
  • Ask the agent to list assumptions, contradictions, and underspecified states in the brief
  • Mark each plan item: would you approve it, correct it, or send the brief back for revision?
  • Note one thing the plan got wrong that your brief caused, and one thing it got wrong on its own

Treat a plan that raises no questions with suspicion. Real briefs have gaps — an agent that finds none is agreeing, not reading.

Slide notes

This exercise is the first time in the course participants touch an actual agent, and it is deliberately constrained to plan mode so the cost of getting it wrong is zero: nothing is built, nothing is edited, and the whole session is reading and judging. The input is the one-page brief from the Module 1 exercise; if someone skipped that, ten minutes writing it now is part of the exercise rather than a blocker.

The two questions at the end carry the learning. One thing the plan got wrong that your brief caused surfaces the gap between what people wrote and what they meant — usually a missing user job, an unstated constraint, or review criteria that exist only in their head. One thing it got wrong on its own surfaces the agent's defaults: the generic pattern it reached for, the state it ignored, the component it planned to invent. Distinguishing those two failure sources is the skill the rest of the course sharpens — the first is fixed in the brief, the second is fixed in the harness or caught at a gate.
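One way to keep the two failure sources apart during the exercise is to record each plan item with a verdict and a cause. The sketch below is a hypothetical note-taking structure, not part of any platform's plan mode. The `PlanItem` fields and the example items are invented for illustration; the verdicts and fix locations follow the exercise and the notes above.

```python
from dataclasses import dataclass
from typing import Optional

# Where each failure source gets fixed, following the notes above.
FIX_LOCATION = {
    "brief": "revise the brief",
    "agent": "add a harness rule or catch it at a gate",
}


@dataclass
class PlanItem:
    summary: str
    verdict: str                 # "approve", "correct", or "send back"
    cause: Optional[str] = None  # "brief" or "agent"; None when approved


def next_actions(items):
    """List the follow-up for every plan item that was not approved."""
    actions = []
    for item in items:
        if item.verdict == "approve":
            continue
        actions.append(f"{item.summary}: {FIX_LOCATION[item.cause]}")
    return actions


# Hypothetical review of one plan.
review = [
    PlanItem("Reuses the existing card component", "approve"),
    PlanItem("No empty state planned", "send back", cause="brief"),
    PlanItem("Plans a new carousel component", "correct", cause="agent"),
]

for action in next_actions(review):
    print(action)
```

A spreadsheet or a paper list does the same job. What matters is that every rejected item is assigned to exactly one of the two causes.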

If running this live, have a few people read out the contradictions or assumptions their agent flagged. The variety makes the point better than any slide: the same exercise produces different gaps for every brief, and a plan that flagged nothing usually means the brief was so loose the agent had nothing to push against. Keep the page and the plan; Module 3 rewrites this brief properly and runs it for real.

Narration for this slide

Now you run the loop — but only the front half. Take the brief you sketched in Module 1, open your platform's plan mode, and paste it in. Ask for a plan, a restatement of the user job, and a list of assumptions, contradictions, and underspecified states. The agent stays read-only the whole time, so nothing can go wrong that costs more than a few minutes. Then judge the plan: what would you approve, what would you correct, what sends the brief back for revision? Write down one thing the plan got wrong because of your brief, and one thing it got wrong on its own. Keep both — Module 3 fixes the first kind, and the gates handle the second.

Slide 13 of 13 · 16:9

Summary, and what comes next

  • The loop is a working rhythm: a bounded session of roughly twenty to forty minutes, not a process document
  • Treat the agent as a junior partner — fast, literal, eager, no taste memory — and brief it accordingly
  • Plan mode and scratch branches make the gates structural instead of willpower-based
  • Critique runs against criteria written before generation; revision acts only on approved, executable feedback
  • The loop breaks at vague briefs, skipped gates, and feedback that never reaches the harness — and the tell is repetition

Module 3 goes deep on the step that sets the ceiling for everything else: the brief, and how to write one that carries real design intent.

Slide notes

Recap by walking the loop one more time, but at the speed of the worked trace rather than the diagram: ten minutes of brief, a four-minute plan review, an unattended generation run, a critique built on checks and screenshots, a scoped revision, and a human ship decision with a note for the harness. The proportions are the message — most of the half hour is design thinking and judgment, and the agent's share is the production in the middle.

Reinforce the two structural ideas that make the rhythm sustainable. The gates are enforced by the platform and the branch strategy, not by discipline, which is why they survive busy weeks. And the loop compounds only if the ship step writes something down: recurring feedback promoted into the harness is what makes session fifty meaningfully better than session five.
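If participants ask how to decide what counts as recurring, a tally across session notes is enough. The sketch below assumes a threshold of three sessions, borrowed from the "rejected for the third time" example in the ship step. The threshold, the note format, and the finding labels are all hypothetical.

```python
from collections import Counter

PROMOTE_AFTER = 3  # assumption: promote a finding on its third session


def findings_to_promote(session_notes, threshold=PROMOTE_AFTER):
    """Return findings recorded in at least `threshold` sessions."""
    counts = Counter()
    for findings in session_notes:
        counts.update(set(findings))  # count each finding once per session
    return sorted(f for f, n in counts.items() if n >= threshold)


# Hypothetical ship-step notes from four sessions.
session_notes = [
    ["raw hex colour", "invented prop"],
    ["raw hex colour"],
    ["raw hex colour", "missing empty state"],
    ["invented prop"],
]

print(findings_to_promote(session_notes))
```

Here only the raw hex colour finding has recurred often enough to become a harness rule. The invented prop is one session short and stays a gate finding for now.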

Preview Module 3 concretely. Everything in this module assumed the brief existed and was good; the next module is about making that true — the seven-line canvas, vague versus specific prompts with the costs measured, turning taste adjectives into behavioural constraints, briefing states rather than just the happy path, and the brief review prompt that makes the agent find the gaps before you do. Participants who did the exercise already have the raw material: the plan-only session showed them exactly where their current brief leaks.

Narration for this slide

Let's close. The loop is a rhythm, not a ceremony — a bounded session of twenty to forty minutes where the agent runs production and you hold the gates. Treat the agent as a junior partner: fast, literal, eager, and with no taste memory, which is why the brief and the harness matter so much. Let plan mode and scratch branches enforce the gates so they survive your busiest week. Critique against criteria you wrote before generation, revise only on approved findings, and write down what the loop learned before you call it shipped. The loop breaks at vague briefs, skipped gates, and feedback that never reaches the harness. Module 3 attacks the first of those head on: how to write a brief that carries real design intent. See you there.

Module transcript
Module 2, narrated slide by slide

Slide 1 · The Designer–Agent Loop

Welcome back. In Module 1 we drew the loop: brief, plan, review gate, generate, critique, revise, ship. In this module we slow it down to working speed. Not the diagram — the actual session. What you type, what comes back, how long each step takes, where you have to pay attention and where you can walk away. We will look at each step as a concrete activity, see how plan modes enforce the gates across the four platforms, trace one real session end to end with timestamps, and then look honestly at the three places the loop most often breaks. By the end, the loop should feel like a rhythm you could start tomorrow.

Slide 2 · The agent as a junior designer

Before we walk the steps, fix the mental model. The agent is best treated as a junior design partner: fast, literal, eager, and with no taste memory. Fast means a wrong run costs minutes, not days. Literal means it does what your words say, not what you meant — it cannot infer the politics or the polish level. No taste memory means yesterday's correction is gone unless it lives in a file. And eager means it would rather hand you something plausible than ask a clarifying question. You would never give a junior a one-line request and then judge them on the output. The agent needs the same working agreement, and the loop is that agreement.

Slide 3 · Step 1 — Brief: what the step produces

Step one: the brief. In session terms it is a short written artifact, not a long chat message — usually under a page, often seven lines. It does two jobs. It gives the agent the facts it needs to work inside your project: the page, the user job, the audience, the components and tokens to reuse. And it gives the agent the standards to judge its own output: the direction, the constraints, and the review criteria you will hold it to. It does not restate your design system — that lives in the harness. And it ends with one instruction that changes everything downstream: plan first, restate the task, do not build until I approve.

Slide 4 · The plan gate: how the four platforms enforce it

The plan gate used to be a habit you had to remember. Now it is a feature. Claude Code has plan mode as a read-only permission mode. Codex CLI has plan mode plus the PLANS.md pattern, where the plan lives as a file beside the work. OpenCode gives you a separate read-only Plan agent. Gemini CLI's Plan Mode keeps the session read-only and asks for explicit approval. In every case, the platform — not your prompt — stops the agent from building until you approve. What you review is the agent's judgment: did it restate the right job, the right structure, the right components? One caution: an approved plan does not guarantee the artifact. That is what the second gate is for.

Slide 5 · Step 2 — Generate: scoping the run

Step two: generate. This is the part the agent does mostly on its own, so your work happens before it starts. Two decisions matter. First, scope: one run, one bounded outcome — a section, a component, an audit. If you cannot say in one sentence what the run should produce, split it. Second, where the output lands: a scratch branch or a scratch folder, never directly on main. That is what lets you critique honestly and throw away a bad run without ceremony. The brief already named the output shape and the components to reuse, so while the agent works, you do not need to watch every keystroke. You need to be ready to look hard at what comes back.

Slide 6 · Step 3 — Critique: criteria, checks, and judgment

Step three: critique, and it is not a vibe check. You compare the artifact against the criteria the brief wrote down before anything existed. Run the executable checks first — the token audit, the type check, the verify script — because they are free and they catch real problems. Then look at screenshot evidence: desktop, mobile, real content lengths. Then apply the judgment no check covers: hierarchy, tone, whether the thing serves the user job. Two rules keep this useful. Every finding gets a severity — blocker, important, polish, or question. And critique stays read-only: findings first, fixes only after you approve them. The agent inspects. You judge.

Slide 7 · Step 4 — Revise: what the agent can act on

Step four: revise. The quality of this step is set entirely by the quality of your feedback. Look at the two columns. On the left, feedback the agent can act on: move this above that, replace these hex values with tokens, use the prop that actually exists. On the right, feedback that hides an unmade decision: it doesn't feel trustworthy, maybe rethink the flow, is this on-brand? The agent cannot make those decisions — if you hand them over raw, it will guess. So translate: figure out why it does not feel trustworthy, then hand over the specific change. Keep the revision scoped to approved findings only. And when you notice the same fix recurring, that is a rule asking to live in the harness.

Slide 8 · Step 5 — Ship: the decision and the record

Step five: ship. The checks passing gets you to the decision; it is not the decision. You look at the artifact, the evidence, and the brief, and you choose: ship it, hold it, or send it back for one more bounded round. Then two minutes of recording. Write a short note beside the work — what shipped, what was cut, why. And promote the recurring feedback into the harness: the anti-pattern you rejected for the third time, the component the agent keeps forgetting. That is the dashed line on the loop diagram, and it is why next month's first drafts are better than this month's. The loop ends when the learning is written down, not when the code merges.

Slide 9 · Show early, iterate small

Here is the habit that keeps the loop healthy: show early, iterate small. Long unsupervised runs drift — a chain of small, individually reasonable judgment calls that adds up to something that is not what you meant. And the bottleneck is not the agent's speed, it is your review capacity. A forty-file change is not four times harder to review than a ten-file change; it is effectively unreviewable, and unreviewable work either gets rubber-stamped or thrown away. So size the run to what you can honestly look at. Ask for the riskiest part first, correct it, and let the rest follow the corrected pattern. The agent does not mind. Your taste stays in the work.

Slide 10 · Where the loop breaks

Three failures account for most bad sessions. First, vague briefs: the agent fills the gaps with generic patterns, and you spend the session re-supplying context one correction at a time. The fix is writing the user job and review criteria before you prompt. Second, skipped gates: fast, plausible output goes straight to stakeholders, and the problems surface where they are most expensive. The fix is structural — plan mode and a scratch branch enforce the gates even when you are busy. Third, feedback that never reaches the harness: the same correction, every session, because the agent has no memory of yesterday. The fix is promoting recurring fixes into rules at the ship step. The tell for all three is the same: you are repeating yourself.

Slide 11 · Worked trace: one loop session, end to end

Let's trace one real-shaped session, the field-notes task from Module 1, with timestamps. Ten minutes writing the brief — most of it deciding what the section is for. Four minutes reviewing the plan, where the agent flagged a contradiction and one sentence resolved it. Seven minutes of generation on a scratch folder while you do something else. Five minutes of screenshot critique, where the audit and the type check ran and the type check caught an invented prop the plan never could have. Three minutes of scoped revision, then the ship decision and a note for the harness. About thirty minutes total — and the agent worked for ten of them. The rest was design thinking, review, and a decision. That was always the job.

Slide 12 · Exercise: run your task through a plan-only session

Now you run the loop — but only the front half. Take the brief you sketched in Module 1, open your platform's plan mode, and paste it in. Ask for a plan, a restatement of the user job, and a list of assumptions, contradictions, and underspecified states. The agent stays read-only the whole time, so nothing can go wrong that costs more than a few minutes. Then judge the plan: what would you approve, what would you correct, what sends the brief back for revision? Write down one thing the plan got wrong because of your brief, and one thing it got wrong on its own. Keep both — Module 3 fixes the first kind, and the gates handle the second.

Slide 13 · Summary, and what comes next

Let's close. The loop is a rhythm, not a ceremony — a bounded session of twenty to forty minutes where the agent runs production and you hold the gates. Treat the agent as a junior partner: fast, literal, eager, and with no taste memory, which is why the brief and the harness matter so much. Let plan mode and scratch branches enforce the gates so they survive your busiest week. Critique against criteria you wrote before generation, revise only on approved findings, and write down what the loop learned before you call it shipped. The loop breaks at vague briefs, skipped gates, and feedback that never reaches the harness. Module 3 attacks the first of those head on: how to write a brief that carries real design intent. See you there.