Agentic Design School
Module 3 of 6
45–55 minutes

Agentic Design Fundamentals

Briefing and Design Intent

The brief is where design intent becomes executable. This module covers the anatomy of a design brief, the difference between vague and specific prompts in measurable terms, and how research packets feed briefs that an agent can actually act on.

Duration: 45–55 minutes

Slides: 13 slides with notes and narration

Learning objectives

  • Diagnose why vague prompts produce generic output, using before-and-after evidence.
  • Write a seven-line design brief covering situation, user job, audience, direction, constraints, output shape, and review criteria.
  • Translate taste and visual direction into behavioural constraints an agent can follow.
  • Assemble a research packet as briefing input for larger tasks.
Slide deck

Work through the module

Each slide is shown in its 16:9 frame, exactly as it appears in the video version. Open the notes under any slide for the longer explanation, and the narration if you prefer to read along.

Slide 1 of 13 · 16:9

Briefing and Design Intent

Agentic Design Fundamentals · Module 3 of 6

  • Why vague prompts buy generic output — with the receipts
  • The seven-line briefing canvas, line by line
  • Turning taste into constraints an agent can follow
  • Research packets, brief reviews, and what belongs in the harness instead

The brief is the design surface. Everything you do not decide in it, the agent decides for you — with the most common answer it knows.

Slide notes

This module is where the course stops describing the loop and starts equipping the first step of it. Module 1 introduced the loop, Module 2 walked it at working speed; this module goes deep on the single highest-leverage artifact in it: the brief. The framing to establish up front is that briefing is not paperwork before the design work — it is the design work, moved earlier, where it is cheap to change.

Set the evidence expectation. The claims in this module are backed by two traced tasks from the school's published articles, each run twice: the field-notes strip on this site, run with a one-line request and again with a full brief, and a pricing-page task, run with a vague prompt and again with a specific one. The numbers quoted are from those specific runs, not benchmarks, and the slides say so. That honesty matters: participants who have been burned by AI hype will trust measured small claims over sweeping ones.

Also name what this module will not do. It will not teach prompt tricks or magic phrasings, and it will not pretend a good brief substitutes for product judgment. The brief encodes decisions you have already made; making those decisions is still your job.

Narration for this slide

Welcome back. Module 3 is about the brief — the place where design intent becomes something an agent can actually execute. We are going to look at real evidence of what vague prompts cost, then build the seven-line briefing canvas line by line: what each line carries, and what the agent does with it. We will cover how to brief taste without resorting to adjectives, how to brief states and review criteria, when a research packet earns its place, and what should never be in the brief at all because it belongs in the harness. The honest premise of the whole module: anything you do not decide, the agent decides for you.

Slide 2 of 13 · 16:9

Vague vs specific: the same task, run both ways

A pricing page for a design workflow product, generated twice on an identical scratch target. The prompt was the only variable.

| | Vague prompt | Specific prompt |
|---|---|---|
| The request | 26 words: "a modern pricing page... a nice hero, feature cards, and a call to action" | 87 words: buyer segments, the decision the page supports, hierarchy, density, banned patterns, criteria first |
| First output | Gradient hero, three interchangeable cards, invented $9/$29/$99 prices, generic AI copy | Comparison-first layout, recommended plan with a stated reason, real limits in a semantic table |
| Token audit | 33 hardcoded-colour violations | No violations |
| Iterations and time | 3 rounds, roughly 25 minutes, with the decision problem still untouched | 2 rounds, roughly 12 minutes, plus 2–3 minutes writing the prompt |
| Shared miss | No mobile comparison behaviour; neither prompt asked for it | Same. Specificity only buys the decisions you actually put in |

The specific prompt is sixty-one words longer. It does not add detail so much as change the design problem from a page genre to a decision tool.

Slide notes

These numbers come from the school's vague-versus-specific article, where the two prompts were executed verbatim against the same scratch Next.js target with no harness loaded, so the prompt carried everything. Be precise about what is real: the generated code, the audit output, and the iteration counts are artifacts from the run; the elapsed-time figures are estimates of the equivalent interactive session; and the numbers describe these runs, not a benchmark.

Walk the failure on the left first. The vague prompt asked for a pricing page as a genre, and the agent delivered the genre average: gradient hero, three identical cards, fabricated prices, copy that could belong to any AI product. The dangerous part is that it looked confident — polished enough to screenshot, useless for the buyer's actual decision. Three revision rounds improved the surface, because the surface was all the prompt ever specified.

Then the right column. The specific prompt named the buyers, the decision the page supports, the hierarchy, the density, and the banned defaults — and asked for decision criteria before any code. The first draft was already about something. The audit pass deserves a caveat: it checks token discipline, not information design, so it is a partial gate, not a verdict.

Close on the bottom row, because it is the most instructive. Neither prompt said what the comparison should do on a phone, so neither output handled it. A prompt is a vehicle for decisions, not a replacement for making them.

Narration for this slide

Let's start with the evidence. Same task — a pricing page for a design workflow tool — run twice. The vague prompt, twenty-six words, came back as the genre average: gradient hero, three interchangeable cards, invented prices, and thirty-three hardcoded-colour violations. Three rounds later it was tidier, but the buyer's decision problem was untouched. The specific prompt, eighty-seven words, named the buyers, the decision, the hierarchy, and the banned defaults. The first draft was comparison-first, passed the audit, and needed two rounds. Here is the honest part: neither prompt said what the comparison does on a phone, so neither output handled it. A prompt only buys the decisions you put in it.

Slide 3 of 13 · 16:9

Anatomy of the briefing canvas

Seven lines the designer writes, what the agent does with each one, and the durable rules that stay in the harness.

Diagram of the seven lines of a design brief — situation, user job, audience and context, direction and references, system constraints, output shape, and review criteria — each shown as a labelled row paired with a note on what the agent does with that line: locating the right files, restating the job in its plan, setting tone and density, turning adjectives into layout behaviour, reusing named components instead of inventing rules, writing output to a scratch folder at the right fidelity, and running checks on its own output before reporting done. A yellow strip carries the closing plan-first instruction, and a dark strip at the bottom lists what belongs in the harness instead: tokens, component inventory, banned patterns, and working rules in DESIGN.md, AGENTS.md or CLAUDE.md, and skills.
Each brief line maps to a behaviour: the situation scopes what the agent reads, the user job shapes hierarchy, constraints point at the harness, and review criteria become checks the agent runs on itself. Durable rules — tokens, components, anti-patterns, working rules — live in the harness so the brief stays task-shaped.

A line earns its place only if it would change what the agent builds. A line that would not change the output belongs in the harness — or nowhere.

Slide notes

Walk the rows top to bottom and keep returning to the right-hand column, because that is what makes this slide different from a generic brief template: every line exists because the agent does something specific with it. The situation tells the agent which files and surface to read before planning. The user job is what the plan should restate, and it is the line that decides hierarchy and what gets cut. Audience and context set tone, density, and how much polish the draft aims for — a stakeholder review and an internal spike do not deserve the same effort.

Direction and references are where adjectives go to be translated; the next slide covers that translation in detail. System constraints should mostly point at the harness — name the components and tokens to reuse, and let DESIGN.md carry the rest. Output shape decides where the work lands and at what fidelity; a scratch folder for explorations, a component for real work, an audit report when the deliverable is findings rather than pixels. Review criteria are the contract the agent checks itself against before reporting done.

The two strips at the bottom carry the habits. The closing plan-first instruction is what makes the review gate from Module 2 enforceable rather than hoped for. And the harness strip is the discipline that keeps briefs short: anything that would be true for every task does not belong in this one.

Narration for this slide

Here is the canvas itself. Seven lines, and the reason each one exists is on the right: what the agent does with it. The situation scopes what it reads. The user job becomes the sentence the plan restates, and it drives hierarchy. Audience sets tone and polish. Direction and references are where your adjectives get translated into behaviour. Constraints point at the harness and name the components to reuse. Output shape says where the work lands — usually a scratch folder, not main. And review criteria are written before generation, so the agent can check itself. Then the closing line: plan first, do not build until I approve. If a line would not change what gets built, it does not belong in the brief.

Slide 4 of 13 · 16:9

References, constraints, and taste: turning adjectives into behaviour

Adjectives are too elastic for layout decisions. Translate visual direction into behaviour the agent can act on and a reviewer can check.

| Weak detail | Behavioural constraint |
|---|---|
| Make it clean and modern | Match the density and tone of the existing formats grid; no new visual treatments |
| Use our design system | Reuse SectionBand, SectionHeading, and SimpleItemCard with semantic tokens before any custom markup |
| Premium feel | Fewer elements, stronger type hierarchy, more negative space, restrained colour |
| Keep it simple | Three steps maximum, one line each; link out for depth instead of explaining inline |
| Looks good on mobile | The grid stacks to one column below the md breakpoint and the heading order is preserved |
| Like this screenshot | Match its information density and table rhythm; do not copy its colours, brand, or copy |

Every row on the right is something the agent can act on and a reviewer can check. That double test is what makes a constraint real.

Slide notes

This is the translation skill the whole module turns on. Words like clean, modern, premium, elegant, and delightful are fine as a starting mood, but they cannot decide between two layouts, which means the agent decides — and it decides with the most common pattern in its training data. The fix is to name density, hierarchy, rhythm, and restraint as behaviours: what should be seen first, second, and third; whether the surface is sparse or data-dense; what is explicitly off the table.
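One way to make the adjective problem visible is a small lint over the brief text. The sketch below is a hypothetical helper, not part of the school's tooling: the function name and the idea of treating the word list as a blocklist are assumptions, while the five words themselves come from the paragraph above.

```python
import re

# Adjectives the notes call out as too elastic to decide between two layouts.
ELASTIC_ADJECTIVES = {"clean", "modern", "premium", "elegant", "delightful"}


def find_elastic_adjectives(brief_text: str) -> list[str]:
    """Return elastic adjectives found in the brief, in order of first appearance."""
    found: list[str] = []
    for word in re.findall(r"[a-z]+", brief_text.lower()):
        if word in ELASTIC_ADJECTIVES and word not in found:
            found.append(word)
    return found


weak = "Make it clean and modern, with a premium feel."
strong = "Match the density and tone of the existing formats grid; no new visual treatments."

print(find_elastic_adjectives(weak))    # ['clean', 'modern', 'premium']
print(find_elastic_adjectives(strong))  # []
```

A hit is a prompt to translate, not an error: each flagged word still needs a human to decide which density, hierarchy, rhythm, or restraint it stands for.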

The restraint row matters most in practice. Anti-patterns named for this specific surface — no numbered-circle process graphics, no decorative gradients, no marketing register — do more work than positive direction, because they block the exact defaults the agent would otherwise reach for. If the same ban shows up in every brief you write, that is a signal it belongs in the harness instead; Module 4 picks that thread up.

The screenshot row deserves a comment of its own. References are evidence, not decoration: tell the agent which traits to read — density, hierarchy, interaction pattern — and which to ignore — brand, colour, copy. Without that instruction it will happily imitate the palette when what you wanted was the table rhythm.

Narration for this slide

Now the part where taste meets the keyboard. Adjectives do not survive contact with an agent — clean, modern, premium can each mean five different layouts, so the agent picks the most common one. The translation is always the same move: turn the adjective into behaviour. Premium becomes fewer elements, stronger hierarchy, more negative space. Simple becomes three steps maximum, one line each. On-brand becomes the named components and tokens to reuse. And references need instructions too: match this screenshot's density and rhythm, ignore its colours and copy. The test for every line is double — can the agent act on it, and can a reviewer check it. If the answer to either is no, rewrite it.

Slide 5 of 13 · 16:9

Brief the states, not just the happy path

Most agent-built interfaces look acceptable because the prompt only described the default screen. Real products live in the other states.

  • Default: what the user sees before they have done anything
  • Empty and first-run: what teaches the next step when there is no data yet
  • Error: what recovery looks like, not just that something went wrong
  • Loading and pending: what the user is told while they wait
  • Success: what changed and what the next useful action is
  • Mobile and long-content: what must stay visible and in what order

States the brief never asked for will be invented as shallow placeholders — and the plan review cannot catch what the brief never mentioned.

Slide notes

This is the slide that separates briefs that survive contact with real products from briefs that only survive a demo. Static content sections get off lightly — the field-notes strip from the worked example only had to worry about mobile stacking and heading order, and one line in the brief covered it. Interactive surfaces do not get off lightly: an upload flow, a settings form, or a data table fails in its states, and an agent that was never asked about the empty list, the failed request, or the copy that runs three times longer than expected will invent placeholders that look fine and teach nothing.

The practical habit is small: write the states as a short list at the end of the brief, in the same breath as the review criteria, and require the plan to name how each one is handled. That gives the review gate something concrete to check — a plan that skips the error state is visibly incomplete, instead of quietly incomplete.
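The "plan must name each state" habit can be approximated with a keyword check. This is a minimal sketch under stated assumptions: the state names come from the slide, but the keyword lists, the function name, and the sample plan are hypothetical, and a keyword match only shows that a state was mentioned, not that it was handled well.

```python
# State names follow the slide; the keywords used to detect them are an assumption.
REQUIRED_STATES = {
    "default": ["default"],
    "empty and first-run": ["empty", "first-run", "first run"],
    "error": ["error"],
    "loading and pending": ["loading", "pending"],
    "success": ["success"],
    "mobile and long-content": ["mobile", "long-content", "long content"],
}


def missing_states(plan_text: str) -> list[str]:
    """Return the states that the plan never mentions, in slide order."""
    text = plan_text.lower()
    return [
        state
        for state, keywords in REQUIRED_STATES.items()
        if not any(keyword in text for keyword in keywords)
    ]


# Hypothetical plan for an upload flow; it says nothing about failure.
plan = """
Default: show the upload dropzone with one line of guidance.
Empty: explain what an upload produces and link to a sample file.
Loading: show progress and the file name while the upload runs.
Success: confirm what changed and offer the next action.
Mobile: stack the dropzone above the file list.
"""

print(missing_states(plan))  # ['error']
```

A non-empty result is exactly the "visibly incomplete" plan described above, caught before any code is generated.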

Connect this back to the pricing-page run from earlier in the module. The mobile gap existed not because the agent was careless but because nobody — the prompt author included — had decided what the comparison should do on a phone. State briefing is mostly the discipline of making those decisions before generation rather than discovering them in review.

Narration for this slide

Here is where most agent-built interfaces quietly fail: the states. The prompt described the happy path, so the happy path is what got designed. Real products live everywhere else — the empty list, the failed request, the loading wait, the success moment, the phone screen, the copy that runs long. The habit is simple. End every brief for an interactive surface with a short list of states, right next to the review criteria, and require the plan to say how each one is handled. That turns a missing error state from something you discover in review into something the plan visibly skipped. Remember the pricing page: the mobile gap was not the agent's failure. It was a decision nobody had made.

Slide 6 of 13 · 16:9

The critique contract: review criteria written before generation

The brief should say how the work will be judged before any of it exists. Make as much of that judgment executable as you can.

  • Executable checks: token audit, type check, verify script — the agent runs these on itself before reporting done
  • Human pass/fail questions: does it read as part of the page, does the copy hold the voice, does mobile order make sense
  • Criteria written before generation cannot be bent to fit whatever came back
  • In the traced runs, the audit caught the weak run's hardcoded colours and the type check caught the briefed run's invented props

A passing audit plus a passing plan is still not a design review. The executable checks buy you time for the judgment only you can apply.

Slide notes

The critique contract is the part of the brief most people skip, because it feels like work that belongs at the end. Writing it first does two things. It forces you to define what good means for this task while you are still neutral — criteria written after generation have a way of bending to fit whatever the agent produced. And it gives the agent a self-check it can run before you ever look: the audits, type checks, and verify scripts that catch the mechanical failures automatically.

Use the evidence from the traced runs to show both halves doing real work. The token audit caught the weak field-notes run's ten hardcoded colours and the vague pricing run's thirty-three. The type check caught the briefed run's invented component props, a case where the approved plan looked right and the implementation still got it wrong. That split is the lesson: executable checks catch what is mechanical, while the human criteria (does the strip read as part of the page, does the copy hold the site's voice, can a reader tell what the section is for in three seconds) are what your review time is actually for.
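To show what an executable check of this kind might look like, here is a minimal sketch of a hardcoded-colour audit. It is not the audit used in the traced runs, which the articles do not reproduce here: the regular expressions, the function name, and the sample markup are all assumptions, and a real audit would need to handle more cases, such as named colours or hex-like strings that are not colours.

```python
import re

# Hex colours with 3, 4, 6, or 8 digits, for example #fff or #0f172a.
HEX_COLOUR = re.compile(r"#(?:[0-9a-fA-F]{8}|[0-9a-fA-F]{6}|[0-9a-fA-F]{3,4})\b")
# Functional colour notations such as rgb(...) or hsl(...).
FUNC_COLOUR = re.compile(r"\b(?:rgb|rgba|hsl|hsla)\(")


def audit_hardcoded_colours(source: str) -> list[tuple[int, str]]:
    """Return (line number, matched text) for each hardcoded colour found."""
    violations: list[tuple[int, str]] = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        for match in HEX_COLOUR.finditer(line):
            violations.append((line_number, match.group()))
        for match in FUNC_COLOUR.finditer(line):
            violations.append((line_number, match.group()))
    return violations


# Hypothetical component markup with three hardcoded colours.
sample = '''<section style={{ background: "#0f172a", color: "#fff" }}>
  <h2 className="text-heading">Pricing</h2>
  <p style={{ color: "rgb(100, 116, 139)" }}>Pick a plan.</p>
</section>'''

for line_number, text in audit_hardcoded_colours(sample):
    print(f"line {line_number}: {text}")
```

The point of the sketch is the shape of the gate: it returns a list the agent can act on before reporting done, and an empty list says nothing about information design.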

Warn against the inverse failure too: a brief whose criteria are all vibes. If every criterion needs a human to squint at the result, the agent cannot self-correct anything, and you become the bottleneck for failures a script could have caught.

Narration for this slide

Every brief should end with the critique contract: how the work will be judged, written before the work exists. Split it in two. The executable half — the token audit, the type check, the verify script — the agent runs on its own output before it reports done. In our traced runs those checks did real work: the audit caught the hardcoded colours, the type check caught the invented component props that a perfectly sensible plan did not prevent. The human half stays yours: does this read as part of the product, does the copy hold the voice, does the mobile order make sense. Writing the criteria first keeps them honest — criteria written afterwards always bend to fit what came back.

Slide 7 of 13 · 16:9

Research packets as briefing inputs for bigger tasks

A seven-line brief carries a section or a component. A redesigned flow or a new surface needs evidence the agent can inspect, not just assert.

  • A packet is a small folder of files beside the work: brief, user job, evidence, screenshots, the approved plan, review criteria
  • Evidence means the research, tickets, or analytics behind the change — summarised, with sources and dates, not pasted wholesale
  • Separate what the source says from what you infer, and record a confidence level before it becomes design direction
  • Files beat one long message: the agent can quote them back, and the plan can be reviewed against them

The packet does not make the agent smarter. It makes the brief checkable — by the agent, by you, and by whoever reviews the work after you.

Slide notes

This slide scales the briefing canvas up without changing its nature. For the field-notes strip, the packet was minimal: the brief, a one-paragraph note on why the section exists, and a pointer to the existing page. For a redesigned onboarding flow or a new product surface, the brief alone cannot carry the weight — the user job needs the support tickets or research that motivated it, the current state needs screenshots, and the acceptance criteria deserve their own document. Keeping that material as a small folder of files beside the work means the agent can inspect and quote it, and the plan can be checked against it, instead of everything living in one long message that scrolls away.

Borrow the discipline from the school's research-packet practice: separate what the source directly says from what you infer, record where the evidence came from and when it was checked, and attach a confidence level. Low-confidence evidence should trigger a test or a follow-up, not a layout decision. This sounds heavy until the first time an agent designs around an inference you had quietly promoted to a fact.
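The evidence discipline maps naturally onto a small record type. The sketch below is illustrative only: the class, field names, and the example ticket are hypothetical, and the rule for medium and high confidence is an assumption, since the text only states what low confidence should trigger.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class Confidence(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


@dataclass(frozen=True)
class EvidenceItem:
    source: str             # where the evidence came from
    checked_on: date        # when it was last checked
    source_says: str        # what the source directly states
    inference: str          # what you infer from it, kept separate
    confidence: Confidence  # recorded before it becomes design direction


def next_step(item: EvidenceItem) -> str:
    """Low confidence triggers a test or follow-up, not a layout decision."""
    if item.confidence is Confidence.LOW:
        return "test or follow up"
    # Assumption: anything above low may inform direction, still subject to review.
    return "may inform design direction"


# Hypothetical evidence entry for an onboarding redesign.
ticket_theme = EvidenceItem(
    source="Support tickets tagged 'onboarding' (hypothetical example)",
    checked_on=date(2025, 1, 15),
    source_says="Several tickets ask where to upload the first file.",
    inference="The upload entry point is hard to find on first run.",
    confidence=Confidence.LOW,
)

print(next_step(ticket_theme))  # test or follow up
```

Keeping `source_says` and `inference` as separate fields is the whole design choice: it makes it harder to quietly promote an inference to a fact.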

Be clear about proportionality. A packet is overhead, and overhead has to earn itself. The seven lines are enough for most section-and-component work; the packet earns its folder when the task spans screens, when several people will review it, or when the evidence behind the change is genuinely contested.

Narration for this slide

When the task gets bigger than a section, the brief needs backup. That is the research packet: a small folder beside the work — the brief, the user job, the evidence behind the change, screenshots of the current state, the approved plan, and the review criteria as their own file. Two disciplines make it useful. Keep evidence honest: separate what the source actually says from what you are inferring, note where it came from, and mark how confident you are — low confidence should trigger a test, not a layout. And keep it as files, not one long message, so the agent can quote it back and the plan can be reviewed against it. For a single component, skip it. For a flow with real stakes, it earns the folder.

Slide 8 of 13 · 16:9

What belongs in the brief vs what belongs in the harness

The harness carries everything durable. The brief carries only what is specific to this task. Mixing them up is the most common briefing failure.

| | Harness (loads every session) | Brief (written per task) |
|---|---|---|
| Visual system | Tokens, type scale, spacing, component inventory in DESIGN.md | Which parts apply with extra force on this task |
| Working rules | Verify commands, scratch-branch habits, file conventions in AGENTS.md or CLAUDE.md | The output shape and where this artifact lands |
| Anti-patterns | Bans that apply to every surface: hardcoded colours, slop patterns | Bans specific to this surface: no numbered-circle process graphics here |
| Audience and intent | Who the product serves, in general terms | The user job and the decision this screen supports |
| Review | The audit and check commands that exist | The pass/fail criteria for this task, including the human ones |

If you have typed the same constraint into your last five briefs, it is a harness rule wearing a disguise. Move it, and stop typing it.

Slide notes

This boundary is what keeps briefs short enough to be read and harnesses stable enough to be trusted. The failure mode on one side is the brief that restates the design system: it adds noise, drifts out of date the moment the system changes, and trains everyone — including you — to skip reading briefs. The failure on the other side is a harness stuffed with task-specific instructions that only made sense once, which Module 4 deals with under the name harness bloat.

The diagnostic is repetition. Constraints that show up in every brief — token rules, banned gradients, component naming — are project rules in disguise and belong in DESIGN.md, AGENTS.md, or a skill. Context that is genuinely new each time — the user job, the evidence, the direction for this screen, the criteria — is what the brief is for. The field-notes brief is the model: it points at DESIGN.md and AGENTS.md by name, names three components to reuse, and spends its words on the job, the tone, and the criteria.
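The repetition diagnostic can be sketched as a count over recent briefs. This is a hypothetical helper, not something the school ships: it assumes each brief's constraints are available as a list of strings, and it only matches constraints that are worded identically after trimming and lowercasing, which real briefs often are not.

```python
from collections import Counter


def harness_candidates(briefs: list[list[str]], window: int = 5) -> list[str]:
    """Return constraints that appear in every one of the last `window` briefs."""
    recent = briefs[-window:]
    if len(recent) < window:
        return []  # not enough history to call anything a pattern
    counts = Counter()
    for constraints in recent:
        # A set per brief, so a constraint counts once per brief at most.
        counts.update({constraint.strip().lower() for constraint in constraints})
    return sorted(c for c, n in counts.items() if n == window)


# Hypothetical constraint lists from five recent briefs.
recent_briefs = [
    ["No hardcoded colours", "Three steps maximum"],
    ["no hardcoded colours", "Stack to one column below md"],
    ["No hardcoded colours ", "Match the formats grid density"],
    ["No hardcoded colours", "No numbered-circle process graphics"],
    ["No hardcoded colours", "Comparison table first"],
]

print(harness_candidates(recent_briefs))  # ['no hardcoded colours']
```

Anything this returns is a candidate for DESIGN.md, AGENTS.md, or a skill; the task-specific lines that appear once are what the brief is for.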

Flag the dependency honestly: this division assumes the harness exists. If a participant's project has no DESIGN.md and no instruction file yet, their briefs will be longer for now, and that is fine — the repetition they notice over the next few tasks is exactly the material Module 4 will teach them to move.

Narration for this slide

Here is the boundary that keeps briefs readable. The harness carries everything durable — tokens, components, working rules, the bans that apply to every surface — and it loads in every session. The brief carries only what is specific to this task: the user job, the direction for this screen, the criteria, the output shape. The diagnostic is simple: if you have typed the same constraint into your last five briefs, it is not task context, it is a project rule wearing a disguise. Move it to the harness and stop typing it. And if your project has no harness yet, your briefs will run long for now — notice what repeats, because that repetition is exactly what Module 4 will teach you to encode.

Slide 9 of 13 · 16:9

Brief review prompts: asking the agent to find the gaps

Before the agent builds, make it review the brief. In plan mode this costs nothing — the agent is already reading the project without permission to change it.

  • The review surfaces assumptions, contradictions, and underspecified states while corrections cost one sentence
  • In the field-notes run, this step caught a real contradiction: a compact strip and detailed step descriptions cannot both be true
  • Treat a plan that raises no questions with mild suspicion — real briefs have gaps
Brief review prompt
Review this brief before building.

Return:
1. What user job you think this screen serves.
2. The design decisions that are clear.
3. The design decisions that are missing or contradictory.
4. The states that are underspecified.
5. The likely failure modes.
6. The implementation plan you would follow.

Do not build until I approve or revise the brief.

An agent that finds nothing wrong with your brief is usually pattern-matching its way to agreement, not reading carefully.

Slide notes

This is the cheapest quality step in the module, and the one people skip because it feels like asking for permission to be criticised. The mechanics are simple: in plan mode, paste the brief and the review prompt; the agent reads the project, restates the job, and lists what is clear, what is missing, what is contradictory, and which states are underspecified. The platform keeps it read-only, so the only cost is the two or three minutes it takes to read the response.

The field-notes run is the proof it earns its keep. The brief asked for a compact strip matching the formats grid and, in the same breath, for step descriptions detailed enough to explain the whole workflow. The plan flagged the conflict and proposed a resolution; agreeing took one written sentence. Without the review, that contradiction would have surfaced as a built section that was either too dense or too thin, and the fix would have cost a full revision round.

The suspicion rule is worth saying twice. Real briefs have gaps — yours will too — and an agent that reports none is usually agreeing rather than reading. Making the questions part of the deliverable, as the prompt does, is what separates a genuine review from a polite echo. It does not catch everything: the same run still produced invented component props that only the type check caught. The review gate filters intent; the executable gates filter implementation.

Narration for this slide

Before the agent builds anything, make it review the brief. The prompt is on the slide: restate the job, list what is clear, what is missing or contradictory, which states are underspecified, the likely failure modes, and the plan — and do not build until I approve. In plan mode this costs nothing, because the agent is already reading the project without write access. In the field-notes run this step caught a real contradiction — a compact strip and detailed step descriptions cannot both be true — and resolving it cost one sentence instead of a revision round. One warning: a review that finds nothing wrong is suspicious. Real briefs have gaps. An agent that sees none is agreeing, not reading.

Slide 10 of 13 · 16:9

The seven-line canvas as a reusable template

One template for section- and component-level work. Heavier tasks add the packet; tiny tweaks skip the canvas entirely.

  • For a small component, each line is one sentence; for a page, lines may need examples and screenshots
  • Tiny tweaks and throwaway explorations skip the canvas: the cost of a generic answer is one regeneration
  • Bigger tasks keep the same seven lines and add the research packet as files beside the work
  • The same spine appears in spec-driven development tooling; this is the lighter, design-flavoured version
Briefing canvas template
Situation: [what product, screen, or workflow is being changed]
User job: [what the user needs to accomplish on this screen]
Audience and context: [who uses it, who approves it, where it will be seen]
Design direction: [density, hierarchy, tone, references, anti-patterns specific to this task]
System constraints: [stack, harness files, components and tokens to reuse, accessibility, mobile]
Output shape: [prototype, component, audit, or plan — and where it lands]
Review criteria: [executable checks to run, plus the human pass/fail questions]

Before building: restate the user job, list the structure, states, and checks, name your
assumptions, and flag anything contradictory or underspecified. Do not build until I approve.

Match the effort to the stakes. The canvas earns its ten minutes when a wrong first draft would cost more than the brief does.

Slide notes

This is the take-home artifact of the module, so spend a moment on how it scales rather than what it says — the lines themselves were covered on the anatomy slide. For a small component each line is one sentence and the whole brief fits on half a page. For a product page, the direction line may need a reference screenshot and the criteria may need to be a list. For a flow, the canvas stays the same and the supporting material moves into the packet from earlier in the module.

Be equally clear about when not to use it. Changing a label, swapping an icon, fixing a spacing bug in a well-harnessed repository — a one-line request is fine, and writing seven canvas lines first is procrastination. The same goes for throwaway explorations you intend to delete: the cost of being wrong is one regeneration, so spend nothing preventing it. The canvas earns its ten minutes when the task has real structure, real states, or a real audience.

The spec-driven-development note is there for participants who work near engineering. GitHub's spec-kit and similar tooling formalise the same spine — intent, constraints, examples, acceptance criteria — at feature scale, with multi-file specs and task breakdowns. The design brief is deliberately the lighter version: borrow the discipline, not the weight.
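For participants who keep briefs in a repository, the canvas can also be held as a small data structure that refuses to render with a line left blank. The sketch below is one possible shape, not a prescribed format: the class and method names are assumptions, and the example values are paraphrased from the field-notes worked example, with placeholders where the source does not give the wording.

```python
from dataclasses import dataclass, fields

CLOSING_INSTRUCTION = (
    "Before building: restate the user job, list the structure, states, and checks, "
    "name your assumptions, and flag anything contradictory or underspecified. "
    "Do not build until I approve."
)


@dataclass
class DesignBrief:
    situation: str
    user_job: str
    audience_and_context: str
    design_direction: str
    system_constraints: str
    output_shape: str
    review_criteria: str

    def empty_lines(self) -> list[str]:
        """Names of canvas lines that were left blank."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]

    def render(self) -> str:
        """Render the seven lines plus the closing plan-first instruction."""
        missing = self.empty_lines()
        if missing:
            raise ValueError(f"Brief is missing: {', '.join(missing)}")
        lines = [
            f"{f.name.replace('_', ' ').capitalize()}: {getattr(self, f.name).strip()}"
            for f in fields(self)
        ]
        return "\n".join(lines) + "\n\n" + CLOSING_INSTRUCTION


# Values paraphrased from the worked example; placeholders are marked as such.
brief = DesignBrief(
    situation="Field-notes page: add a strip explaining how field notes get made.",
    user_job="A reader deciding whether to subscribe should see that each note comes from a real experiment.",
    audience_and_context="(placeholder) Site readers; reviewed by the course team.",
    design_direction="No numbered-circle process graphics, marketing copy, or new visual treatments.",
    system_constraints="Follow DESIGN.md and AGENTS.md; reuse SectionBand, SectionHeading, SimpleItemCard.",
    output_shape="(placeholder) Component in a scratch folder.",
    review_criteria="Token audit and type check pass; reads as part of the page; copy holds the site's voice.",
)

print(brief.render())
```

The blank-line check is deliberately strict for section-level work; for tiny tweaks the advice above still holds, and the right move is to skip the canvas rather than fill it with filler.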

Narration for this slide

Here is the template you will actually reuse. Seven lines — situation, user job, audience, direction, constraints, output shape, review criteria — and the closing instruction: restate, flag the gaps, do not build until I approve. The skill is matching its weight to the task. For a component, each line is a sentence. For a page, some lines grow examples and screenshots. For a flow, the canvas stays the same and the evidence moves into a packet beside the work. And for a label change or a throwaway exploration, skip it entirely — seven lines of ceremony for a disposable answer is procrastination. The canvas earns its ten minutes when a wrong first draft would cost you more than the brief does.

Slide 11 of 13 · 16:9

Worked example: from a loose request to a brief that held up

The field-notes task from Module 1, this time with the brief itself on the table: what each line did, and what the brief still could not prevent.

  • The loose request — "add a section explaining how field notes get made" — carried the topic but none of the design task
  • The brief's user job line reframed it: a reader deciding whether to subscribe should see that each note comes from a real experiment
  • Direction banned the exact defaults this surface attracts: numbered-circle process graphics, marketing copy, new visual treatments
  • Constraints pointed at DESIGN.md and AGENTS.md and named SectionBand, SectionHeading, SimpleItemCard for reuse
  • The plan restated the job, reused the named components, and flagged the compact-vs-detailed contradiction before any code
  • The brief did not prevent everything: the implementation invented component props, and the type check caught it — not the plan

Writing the brief took ten minutes, and most of that was deciding what the section is for — design work the loose request had simply skipped.

Slide notes

Module 1 used this run as a before-and-after scoreboard; this slide opens the brief itself, because the value is in seeing ordinary sentences do specific work. Walk the lines against the canvas. The situation named the page and where the strip sits, which scoped what the agent read. The user job did the heaviest lifting: it turned "explain how field notes get made" into "a reader deciding whether to subscribe should see that each note comes from a real experiment", which is why the plan came back with three one-line steps instead of a process explainer. Direction was almost entirely bans, chosen because this exact surface attracts this exact slop: numbered circles, gradients, marketing register. Constraints pointed at the harness files rather than restating them, and named the three components to reuse. The criteria mixed two commands with three human questions.

Then give the honest accounting. Ten minutes to write the brief, most of it spent on the user job, the design decision the loose request had skipped. A few minutes reviewing the plan, which flagged a genuine contradiction and cost one sentence to resolve. One fix round after generation, because the implementation passed SectionHeading props the component does not have; the type check caught it immediately. Roughly twenty minutes end to end, against roughly forty for the unbriefed run. These figures describe this run, not a law.

The closing point is the one to leave in the room: the brief moved the corrections earlier; it did not remove the need for the gates.

Narration for this slide

Let's reopen the field-notes run, but this time read the brief itself. The loose request named a topic. The brief's user job line did the real work: a reader deciding whether to subscribe should see that every note starts as a real experiment — that single sentence is why the plan came back with three one-line steps instead of a process explainer. Direction was mostly bans, aimed at exactly the slop this surface attracts. Constraints pointed at the harness and named three components to reuse. The plan restated the job and flagged a real contradiction before any code existed. And the brief still did not prevent everything — the implementation invented component props, and the type check caught it. Ten minutes of brief, twenty minutes end to end. The corrections moved earlier. The gates still earned their keep.

Slide 12 of 13 · 16:9

Exercise: brief the Module 1 task, review it, then run it

Take the task you sketched on paper at the end of Module 1. This time, write the real brief, have the agent review it, and run the loop.

  • Fill in the seven-line canvas for your task; spend the time on the user job and the review criteria
  • Send it with the brief review prompt in plan mode, and read the response against your own sketch from Module 1
  • Expect at least one real gap or contradiction — if the agent finds none, ask it directly what it assumed
  • Approve or revise, let the agent generate into a scratch location, and run your executable criteria
  • Note which corrections the brief prevented, which the gates caught, and which line you would write differently next time

Keep the brief and your notes. Module 4 will ask which lines you would still be writing in your tenth brief — those are your first harness rules.

Slide notes

This exercise closes the loop opened in Module 1, where participants sketched the task on paper without an agent. Now they run it. The sequencing matters: write the canvas first, then the agent review in plan mode, then the comparison against their original sketch. Most people find the agent's review surfaces something their own sketch missed — usually a state or a contradiction between the direction and the constraints — and that moment is worth more than any slide in this module.

Steer the scope again: a section, a component, a bounded audit. The output should land in a scratch folder or branch, not anywhere that needs cleaning up afterwards. If a participant's project has no executable checks at all, have them write the two or three commands they wish existed — that list becomes input to Modules 4 and 6.

The debrief questions are in the last bullet, and the third is the bridge to Module 4: which corrections did the brief prevent, which did the gates catch, and which line would you write differently? If you are running this live, collect the "write differently" answers. They split into lines that were too vague (usually direction) and lines that were missing entirely (usually states or criteria), and both patterns recur so reliably that participants effectively write the next module's motivation for you.

Narration for this slide

Time to run the task you sketched back in Module 1. Write the real brief this time — seven lines, and spend your minutes on the user job and the review criteria, because those are the lines that do the work. Then, in plan mode, send it with the brief review prompt and read what comes back against your original sketch. Expect at least one genuine gap. Approve or revise, let the agent build into a scratch location, run your checks, and look at the result properly — including at mobile width. Then write three notes: what the brief prevented, what the gates caught, and which line you would write differently next time. Keep all of it. Module 4 builds directly on what you just noticed repeating.

Slide 13 of 13 · 16:9

Summary, and what comes next

  • Vague prompts buy generic output because the agent fills every unstated decision with the most common answer it knows
  • The seven-line canvas carries the task: situation, user job, audience, direction, constraints, output shape, review criteria
  • Taste becomes real when adjectives are translated into behaviour — density, hierarchy, restraint, named bans
  • States and review criteria are written before generation; the agent runs the executable checks on itself
  • Briefs stay short because the harness carries everything durable — and repeated brief lines are harness rules in disguise

Module 4 builds the harness: the instruction files, path-scoped rules, skills, and tokens that make briefed quality repeatable across sessions and people.

Slide notes

Recap by connecting the evidence to the practice rather than re-listing the slides. The vague-versus-specific run showed the cost of unstated decisions in numbers — violations, rounds, minutes — and the canvas is simply the list of decisions worth stating. The translation slide and the states slide are the two places where briefs most often stay vague; the critique contract and the brief review are the two places where the brief starts checking itself. The harness boundary keeps the whole thing sustainable.

Be honest one more time about the limits, because the module has earned it: a brief encodes decisions, it does not make them, and even a good brief plus an approved plan did not prevent the invented-props failure — the type check did. The brief moves correction earlier and makes review possible; the gates from Module 2 and the checks coming in Module 6 are still load-bearing.

Preview Module 4 concretely. Everything participants noticed repeating in this module — the token rules, the component names, the recurring bans, the verify commands — is about to get a permanent home: CLAUDE.md and AGENTS.md, path-scoped rules, SKILL.md files, and design tokens treated as agent instructions. The exercise notes they just produced are the raw material for that module's exercise, so ask them to bring the notes along.

Narration for this slide

Let's close. Vague prompts produce generic output because every decision you do not make, the agent makes for you — and we saw what that costs in violations, rounds, and minutes. The seven-line canvas is the fix: situation, user job, audience, direction, constraints, output shape, review criteria, and a plan-first closing line. Taste goes in as behaviour, not adjectives. States and criteria are written before generation, and the agent runs the executable checks on itself. And briefs stay short because the harness carries everything durable. That harness is Module 4: the instruction files, rules, skills, and tokens that make this quality repeatable — not just for you, but for everyone who works in the same project. Bring your exercise notes. See you there.

Module transcript
Module 3, narrated slide by slide

Slide 1 · Briefing and Design Intent

Welcome back. Module 3 is about the brief — the place where design intent becomes something an agent can actually execute. We are going to look at real evidence of what vague prompts cost, then build the seven-line briefing canvas line by line: what each line carries, and what the agent does with it. We will cover how to brief taste without resorting to adjectives, how to brief states and review criteria, when a research packet earns its place, and what should never be in the brief at all because it belongs in the harness. The honest premise of the whole module: anything you do not decide, the agent decides for you.

Slide 2 · Vague vs specific: the same task, run both ways

Let's start with the evidence. Same task — a pricing page for a design workflow tool — run twice. The vague prompt, twenty-six words, came back as the genre average: gradient hero, three interchangeable cards, invented prices, and thirty-three hardcoded-colour violations. Three rounds later it was tidier, but the buyer's decision problem was untouched. The specific prompt, eighty-seven words, named the buyers, the decision, the hierarchy, and the banned defaults. The first draft was comparison-first, passed the audit, and needed two rounds. Here is the honest part: neither prompt said what the comparison does on a phone, so neither output handled it. A prompt only buys the decisions you put in it.

Slide 3 · Anatomy of the briefing canvas

Here is the canvas itself. Seven lines, and the reason each one exists is on the right: what the agent does with it. The situation scopes what it reads. The user job becomes the sentence the plan restates, and it drives hierarchy. Audience sets tone and polish. Direction and references are where your adjectives get translated into behaviour. Constraints point at the harness and name the components to reuse. Output shape says where the work lands — usually a scratch folder, not main. And review criteria are written before generation, so the agent can check itself. Then the closing line: plan first, do not build until I approve. If a line would not change what gets built, it does not belong in the brief.

Slide 4 · References, constraints, and taste: turning adjectives into behaviour

Now the part where taste meets the keyboard. Adjectives do not survive contact with an agent — clean, modern, premium can each mean five different layouts, so the agent picks the most common one. The translation is always the same move: turn the adjective into behaviour. Premium becomes fewer elements, stronger hierarchy, more negative space. Simple becomes three steps maximum, one line each. On-brand becomes the named components and tokens to reuse. And references need instructions too: match this screenshot's density and rhythm, ignore its colours and copy. The test for every line is double — can the agent act on it, and can a reviewer check it. If the answer to either is no, rewrite it.
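The translation move can be rehearsed mechanically. The sketch below flags vague adjectives in a direction line and suggests the behavioural wording quoted in this module. It is only a sketch: `lint_direction`, `TRANSLATIONS`, and `UNTRANSLATED` are hypothetical names, and a real project would grow its own table.

```python
import re

# Translations taken from the module; extend the table per project.
TRANSLATIONS = {
    "premium": "fewer elements, stronger hierarchy, more negative space",
    "simple": "three steps maximum, one line each",
    "on-brand": "reuse the named components and tokens",
}
# Adjectives the module names as ambiguous without giving a translation.
UNTRANSLATED = {"clean", "modern"}


def lint_direction(direction: str) -> list[str]:
    """Return one suggestion per vague adjective found in a direction line."""
    # Keep hyphenated words such as "on-brand" together as one token.
    words = set(re.findall(r"[a-z]+(?:-[a-z]+)*", direction.lower()))
    notes = []
    for word in sorted(words & (TRANSLATIONS.keys() | UNTRANSLATED)):
        if word in TRANSLATIONS:
            notes.append(f"'{word}': consider '{TRANSLATIONS[word]}'")
        else:
            notes.append(f"'{word}': state the behaviour you mean")
    return notes
```

An empty result does not mean the line is good. It only means none of the listed adjectives appear, so the double test (can the agent act on it, can a reviewer check it) still applies.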

Slide 5 · Brief the states, not just the happy path

Here is where most agent-built interfaces quietly fail: the states. The prompt described the happy path, so the happy path is what got designed. Real products live everywhere else — the empty list, the failed request, the loading wait, the success moment, the phone screen, the copy that runs long. The habit is simple. End every brief for an interactive surface with a short list of states, right next to the review criteria, and require the plan to say how each one is handled. That turns a missing error state from something you discover in review into something the plan visibly skipped. Remember the pricing page: the mobile gap was not the agent's failure. It was a decision nobody had made.
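Requiring the plan to address each state can be checked crudely before a human reads it. This is a rough sketch under stated assumptions: `STATES` and `unaddressed_states` are hypothetical, the short state names are stand-ins for the module's phrases, and plain substring matching only shows that a state was mentioned, not that it was handled well.

```python
# Short names standing in for the module's list: the empty list, the failed
# request, the loading wait, the success moment, the phone screen, long copy.
STATES = ["empty", "error", "loading", "success", "mobile", "long copy"]


def unaddressed_states(plan: str, states: list[str] = STATES) -> list[str]:
    """Return the briefed states that the plan text never mentions."""
    text = plan.lower()
    return [state for state in states if state not in text]
```

Anything this returns is a question to send back to the agent before approving the plan, which is cheaper than discovering the gap in review.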

Slide 6 · The critique contract: review criteria written before generation

Every brief should end with the critique contract: how the work will be judged, written before the work exists. Split it in two. The executable half — the token audit, the type check, the verify script — the agent runs on its own output before it reports done. In our traced runs those checks did real work: the audit caught the hardcoded colours, the type check caught the invented component props that a perfectly sensible plan did not prevent. The human half stays yours: does this read as part of the product, does the copy hold the voice, does the mobile order make sense. Writing the criteria first keeps them honest — criteria written afterwards always bend to fit what came back.
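The two halves of the contract can be written down as data, so the executable half runs unattended and the human half comes back as a list of open questions. The sketch below assumes each executable criterion is a command that exits with status 0 on success. `Criterion`, `run_contract`, and `EXAMPLE` are hypothetical names, and the example command is a placeholder, not one of the school's real checks.

```python
import subprocess
import sys
from dataclasses import dataclass
from typing import Optional


@dataclass
class Criterion:
    question: str
    command: Optional[list[str]] = None  # None means a human judges it


def run_contract(criteria: list[Criterion]) -> tuple[dict[str, bool], list[str]]:
    """Run executable criteria; return their results and the human questions."""
    results: dict[str, bool] = {}
    human: list[str] = []
    for criterion in criteria:
        if criterion.command is None:
            human.append(criterion.question)
            continue
        completed = subprocess.run(criterion.command, capture_output=True)
        results[criterion.question] = completed.returncode == 0
    return results, human


# Placeholder command so the sketch runs anywhere; swap in your project's
# own token audit, type check, and verify script.
EXAMPLE = [
    Criterion("Interpreter starts", [sys.executable, "-c", "pass"]),
    Criterion("Does this read as part of the product?"),
]
```

Because the list exists before generation, it cannot bend to fit what came back, which is the point of writing the criteria first.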

Slide 7 · Research packets as briefing inputs for bigger tasks

When the task gets bigger than a section, the brief needs backup. That is the research packet: a small folder beside the work — the brief, the user job, the evidence behind the change, screenshots of the current state, the approved plan, and the review criteria as their own file. Two disciplines make it useful. Keep evidence honest: separate what the source actually says from what you are inferring, note where it came from, and mark how confident you are — low confidence should trigger a test, not a layout. And keep it as files, not one long message, so the agent can quote it back and the plan can be reviewed against it. For a single component, skip it. For a flow with real stakes, it earns the folder.
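The evidence discipline maps onto a small record: what the source says, what you infer, where it came from, and how confident you are. The sketch below is illustrative only. The `Evidence` fields, the three confidence levels, and `next_step` are assumptions, not a format the module prescribes.

```python
from dataclasses import dataclass
from typing import Literal


@dataclass
class Evidence:
    source_says: str  # what the source actually states
    inference: str    # what you are concluding from it
    origin: str       # where the evidence came from
    confidence: Literal["low", "medium", "high"]  # assumed three-level scale


def next_step(item: Evidence) -> str:
    """Low confidence should trigger a test, not a layout."""
    return "test the inference" if item.confidence == "low" else "brief against it"
```

Keeping `source_says` and `inference` as separate fields makes it harder to blur the two when the agent quotes the packet back.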

Slide 8 · What belongs in the brief vs what belongs in the harness

Here is the boundary that keeps briefs readable. The harness carries everything durable — tokens, components, working rules, the bans that apply to every surface — and it loads in every session. The brief carries only what is specific to this task: the user job, the direction for this screen, the criteria, the output shape. The diagnostic is simple: if you have typed the same constraint into your last five briefs, it is not task context, it is a project rule wearing a disguise. Move it to the harness and stop typing it. And if your project has no harness yet, your briefs will run long for now — notice what repeats, because that repetition is exactly what Module 4 will teach you to encode.
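The five-brief diagnostic is easy to automate if you keep your briefs. The sketch below assumes each brief is stored as a list of constraint lines and treats two lines as the same only when they match exactly after trimming and lowercasing. `harness_candidates` is a hypothetical helper, not part of any tool named in this course.

```python
from collections import Counter


def harness_candidates(briefs: list[list[str]], window: int = 5) -> list[str]:
    """Return constraint lines that appear in every one of the last `window` briefs."""
    recent = briefs[-window:]
    if len(recent) < window:
        return []  # not enough history to call anything a pattern
    counts = Counter(
        line
        for brief in recent
        for line in {c.strip().lower() for c in brief}  # dedupe within a brief
    )
    return sorted(line for line, n in counts.items() if n == window)
```

Exact matching will miss constraints that were reworded from brief to brief, so treat the output as a starting list for the harness rather than a complete one.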

Slide 9 · Brief review prompts: asking the agent to find the gaps

Before the agent builds anything, make it review the brief. The prompt is on the slide: restate the job, list what is clear, what is missing or contradictory, which states are underspecified, the likely failure modes, and the plan — and do not build until I approve. In plan mode this costs nothing, because the agent is already reading the project without write access. In the field-notes run this step caught a real contradiction — a compact strip and detailed step descriptions cannot both be true — and resolving it cost one sentence instead of a revision round. One warning: a review that finds nothing wrong is suspicious. Real briefs have gaps. An agent that sees none is agreeing, not reading.
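If you send the review request often, it can be assembled from the steps listed above instead of retyped. In this sketch the step wording is a paraphrase of the narration, not the exact prompt from the slide, and `REVIEW_STEPS`, `brief_review_prompt`, and `looks_suspicious` are hypothetical names.

```python
# Paraphrased from the narration; the slide's exact wording may differ.
REVIEW_STEPS = [
    "Restate the job in your own words.",
    "List what is clear.",
    "List what is missing or contradictory.",
    "List which states are underspecified.",
    "List the likely failure modes.",
    "Propose a plan.",
]


def brief_review_prompt(brief_text: str) -> str:
    """Wrap a brief in the review request, ending with the plan-first line."""
    numbered = [f"{i}. {step}" for i, step in enumerate(REVIEW_STEPS, start=1)]
    return "\n".join(
        ["Review this brief before building anything.", "", brief_text.strip(), ""]
        + numbered
        + ["", "Do not build until I approve."]
    )


def looks_suspicious(gaps_found: list[str]) -> bool:
    """A review that reports no gaps is a cue to ask what the agent assumed."""
    return len(gaps_found) == 0
```

`looks_suspicious` encodes the warning as a one-line check: when it returns true, the next message is a direct question about assumptions, not an approval.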

Slide 10 · The seven-line canvas as a reusable template

Here is the template you will actually reuse. Seven lines — situation, user job, audience, direction, constraints, output shape, review criteria — and the closing instruction: restate, flag the gaps, do not build until I approve. The skill is matching its weight to the task. For a component, each line is a sentence. For a page, some lines grow examples and screenshots. For a flow, the canvas stays the same and the evidence moves into a packet beside the work. And for a label change or a throwaway exploration, skip it entirely — seven lines of ceremony for a disposable answer is procrastination. The canvas earns its ten minutes when a wrong first draft would cost you more than the brief does.

Slide 11 · Worked example: from a loose request to a brief that held up

Let's reopen the field-notes run, but this time read the brief itself. The loose request named a topic. The brief's user job line did the real work: a reader deciding whether to subscribe should see that every note starts as a real experiment — that single sentence is why the plan came back with three one-line steps instead of a process explainer. Direction was mostly bans, aimed at exactly the slop this surface attracts. Constraints pointed at the harness and named three components to reuse. The plan restated the job and flagged a real contradiction before any code existed. And the brief still did not prevent everything — the implementation invented component props, and the type check caught it. Ten minutes of brief, twenty minutes end to end. The corrections moved earlier. The gates still earned their keep.

Slide 12 · Exercise: brief the Module 1 task, review it, then run it

Time to run the task you sketched back in Module 1. Write the real brief this time — seven lines, and spend your minutes on the user job and the review criteria, because those are the lines that do the work. Then, in plan mode, send it with the brief review prompt and read what comes back against your original sketch. Expect at least one genuine gap. Approve or revise, let the agent build into a scratch location, run your checks, and look at the result properly — including at mobile width. Then write three notes: what the brief prevented, what the gates caught, and which line you would write differently next time. Keep all of it. Module 4 builds directly on what you just noticed repeating.

Slide 13 · Summary, and what comes next

Let's close. Vague prompts produce generic output because every decision you do not make, the agent makes for you — and we saw what that costs in violations, rounds, and minutes. The seven-line canvas is the fix: situation, user job, audience, direction, constraints, output shape, review criteria, and a plan-first closing line. Taste goes in as behaviour, not adjectives. States and criteria are written before generation, and the agent runs the executable checks on itself. And briefs stay short because the harness carries everything durable. That harness is Module 4: the instruction files, rules, skills, and tokens that make this quality repeatable — not just for you, but for everyone who works in the same project. Bring your exercise notes. See you there.