Agentic Design School
Module 4 of 7
45–55 minutes

Claude Code for Designers

Prototyping and Building

Moving from single sections to working prototypes: multi-screen flows, realistic data, interaction states, and the discipline of keeping prototypes honest about what they are.

Duration: 45–55 minutes

Slides: 12 slides with notes and narration

Learning objectives

  • Plan a prototype as a set of bounded agent runs rather than one giant prompt.
  • Get realistic states and data shapes into prototypes without hand-faking everything.
  • Decide what is prototype-quality and what would need production hardening.
  • Share prototypes for review without implying they are shippable.

Slide deck

Work through the module

Each slide is shown in its 16:9 frame, exactly as it appears in the video version. Open the notes under any slide for the longer explanation, and the narration if you prefer to read along.

Slide 1 of 12 · 16:9

Prototype First, Production Later

Claude Code for Designers · Module 4 of 7

  • Scoping a prototype: flows, screens, and the states that matter
  • Decomposing the work into bounded agent runs
  • Realistic data, interaction, and navigation without a backend
  • Keeping prototypes honest — and keeping them out of production

A prototype exists to answer a question. Its value is the answer, not the artifact — and this module is about getting that answer fast without lying to anyone about what was built.

Slide notes

Module 3 ended with a single section converted from a mockup. This module scales that skill up to the thing designers actually want: a working, clickable prototype with multiple screens, realistic data, and real interaction — built in a working session rather than a sprint of engineering time.

The framing to set up front is the title: prototype first, production later. Agents make it cheap to produce something that looks finished, and that is exactly the danger. A polished prototype can create false confidence — it may skip edge cases, accessibility, real data behaviour, permissions, and error handling, and nobody can tell by looking at it. The discipline this module teaches is as much about labelling and boundaries as it is about building.

Set expectations on what gets built during the module: the worked example is a three-screen flow run in one session, and the exercise asks each participant to scope and run the first screen of a prototype of their own. Participants should arrive with Module 2's setup done and Module 3's review habits fresh, because both get used continuously here.

Narration for this slide

Welcome to Module 4. In Module 3 you converted a single section from a mockup into working code. Now we scale that up to the thing you actually wanted all along: a real, clickable prototype — multiple screens, realistic data, interaction you can feel — built in one working session. The title of this module is also its rule: prototype first, production later. Agents make it cheap to build something that looks finished, and that is both the opportunity and the trap. So this module is half about building fast, and half about staying honest about what you built. Let's start with scoping.

Slide 2 of 12 · 16:9

Scoping: the question, the flow, the states that matter

A prototype scope is not a feature list. It is the smallest build that lets someone encounter your riskiest assumption.

  • Name the question the prototype exists to answer, in one sentence
  • Name the riskiest assumption — the thing that, if wrong, changes the design
  • Define the happy path: entry point, three to six steps, end state
  • List what is explicitly out of scope: auth, settings, error handling, real backend
  • Decide which states matter for the question — and only those

The prototype is done when someone can encounter the assumption and react to it, not when the screens look finished.

Slide notes

The most common way prototype work fails with an agent is the same way it fails without one: prototyping the whole feature. The agent will happily build everything you ask for, which makes over-scoping cheaper to start and more expensive to finish — by the third screen of an unscoped build, nobody remembers what the prototype was supposed to prove.

The scoping discipline comes from the school's interactive prototype sprint workflow and the prototype-first article: write down the design question, then the riskiest assumption behind it. If you are testing a new onboarding flow, the question might be whether choosing a role first makes the next step clearer; the assumption is that role-first ordering reduces confusion. The happy path is the single walk a reviewer or test participant takes through that assumption — entry point, a handful of steps, an end state. Everything else is stubbed or skipped unless it is the question.

The out-of-scope list is not an afterthought; it is half the brief. Authentication, settings screens, error handling beyond what the flow needs, responsive breakpoints nobody will look at, and any real backend all go on it by default. Writing the assumption down also gives the session its stop condition, which matters when an agent can keep producing more screens for as long as you keep asking.
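
One way to make the brief concrete is to write it down as structured data before handing it to the agent. The sketch below is illustrative only: the `PrototypeBrief` shape and the `briefProblems` helper are hypothetical names, not part of Claude Code or the school's material, and the entry point and end state in the example are invented. The checks simply encode the rules from this slide: a named question and assumption, three to six happy-path steps, and a non-empty out-of-scope list.

```typescript
// Hypothetical shape for a prototype brief; every field name is an assumption for illustration.
interface PrototypeBrief {
  question: string;           // the one-sentence question the prototype answers
  riskiestAssumption: string; // the belief that, if wrong, changes the design
  entryPoint: string;
  steps: string[];            // happy-path steps between entry point and end state
  endState: string;
  outOfScope: string[];       // e.g. auth, settings, real backend
}

// Returns a list of scoping problems; an empty list means the brief is ready to run.
function briefProblems(brief: PrototypeBrief): string[] {
  const problems: string[] = [];
  if (brief.question.trim() === "") problems.push("Missing question");
  if (brief.riskiestAssumption.trim() === "") problems.push("Missing riskiest assumption");
  if (brief.steps.length < 3 || brief.steps.length > 6) {
    problems.push("Happy path should have three to six steps");
  }
  if (brief.outOfScope.length === 0) problems.push("Out-of-scope list is empty");
  return problems;
}

// Example values loosely based on the onboarding scenario; entry point and end state are invented.
const onboardingBrief: PrototypeBrief = {
  question: "Does choosing a role first make the next step clearer?",
  riskiestAssumption: "Role-first ordering reduces confusion",
  entryPoint: "Welcome screen",
  steps: ["Choose role", "Choose template", "See first dashboard"],
  endState: "Dashboard visible",
  outOfScope: ["auth", "settings", "error handling", "real backend"],
};

console.log(briefProblems(onboardingBrief)); // an empty list for this brief
```

The value of a check like this is less the code than the habit: a brief that cannot fill in every field, or that has nothing on its out-of-scope list, is not yet scoped.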

Narration for this slide

Scoping is where prototypes are won or lost, and agents make over-scoping easier, not harder. So before any prompt, write three things down. First, the question this prototype exists to answer — one sentence. Second, the riskiest assumption: the belief that, if wrong, changes the design. Third, the happy path: the entry point, three to six steps, and the end state a reviewer walks through to encounter that assumption. Then write the out-of-scope list — auth, settings, error handling, the real backend — because the agent will build whatever you do not exclude. The prototype is done when someone can hit the assumption and react. Not when it looks finished.

Slide 3 of 12 · 16:9

Decomposing into runs: one screen or flow per session

A prototype is several bounded agent runs, not one giant prompt. The boundaries are what keep quality consistent across screens.

  • One screen or one flow per session — then clear the context and start the next
  • The kitchen-sink prompt fails: by screen three, the agent has lost the design system from screen one
  • Lock decisions that must persist — tokens, spacing, component choices — in CLAUDE.md, not in the conversation
  • After two failed corrections on the same point, clear and re-brief instead of arguing
  • Each run starts from the brief plus the locked decisions, so screen four matches screen one

Context is a budget. Decisions you want to survive the session go in the instruction file; everything else can be cleared.

Slide notes

This slide carries the central working pattern of the module, and it is grounded in the official guidance reflected in the book's research: long sessions accumulate failed attempts and drift. The anti-pattern is the kitchen-sink prompt — build the homepage, the dashboard, the settings, and the profile page — which fills the context window so that by the third page the agent has effectively forgotten the design decisions it made on the first.

The fix has two halves. The first is decomposition: one screen or one flow per run, with the context cleared between runs. The second is persistence: any decision that must survive across runs (the palette, the type scale, the spacing unit, which components to use) gets written into the project's CLAUDE.md (or DESIGN.md, depending on how the project was set up in Module 2) rather than living only in the conversation. Once a screen comes out right, asking the agent to record those decisions in the instruction file is a one-line request, and every later run starts from them.

The two-corrections rule is worth stating explicitly because designers tend to keep negotiating: if the same problem survives two correction rounds, stop, clear the session, and write a better brief that incorporates what you learned. For example: "Last time the spacing was too tight; use 32px section gaps minimum." A clean session with a sharper brief almost always beats a long session full of accumulated corrections.

Narration for this slide

Here is the working pattern that makes multi-screen prototypes hold together: one screen or one flow per session. The kitchen-sink prompt — build me the whole app — fails in a predictable way: the context fills up, and by screen three the agent has forgotten the design system it used on screen one. So decompose. Run one screen, review it, and when it is right, lock the decisions that need to persist — tokens, spacing, component choices — into the project's instruction file. Then clear the session and start the next screen from that file. And if a correction fails twice, stop arguing. Clear, and write a better brief with what you just learned.

Slide 4 of 12 · 16:9

The one-session prototype arc

From brief to clickable artifact in a working session, with checkpoints where the designer looks before the agent builds more.

Six-step diagram of a single prototyping session. Step one, the designer briefs the screen: the question, the riskiest assumption, the happy path, and the out-of-scope list. Step two, the agent builds the happy path from existing tokens, components, and sample data inside a marked prototype folder. Step three is a checkpoint where the designer screenshots the result, walks the flow, and either fixes and loops or clears and re-briefs, shown with a dashed feedback line back to the build step. Step four, the agent adds states and the next screen with realistic data and hard-coded navigation. Step five is a blocker-only QA pass that fixes dead ends and logs polish without fixing it. Step six is the session output: a clickable artifact behind an index page, labelled as a prototype and not promoted to production.

Brief, the checkpoint, and the QA pass are designer-led; the build steps are agent-run. The dashed line is the loop: fix and rebuild until the flow holds, and after two failed corrections, clear the session and re-brief.

The checkpoints are not overhead. They are where the next run's brief gets sharper and where drift gets caught while it is still one screen wide.

Slide notes

Walk the diagram left to right along the top row, then back along the bottom. The brief is the scoping work from the previous two slides condensed into one document the agent reads: question, assumption, happy path, out-of-scope list, and what may be fake versus what must be realistic. The build step is agent-run and deliberately constrained — existing tokens, existing components, sample data, all inside a clearly marked prototype folder rather than next to production files.

The checkpoint deserves the most discussion time because it is the step new users skip. Take a screenshot at the width the review or test session will actually use, walk the flow yourself end to end, and only then decide: fix and loop, or — if the same issue has survived two corrections — clear the session and re-brief. This is the single-section review habit from Module 3 applied at flow scale.

The second row is where the prototype becomes useful: states and the next screen get added run by run, a blocker-only QA pass protects the eventual review without polishing anything, and the session ends with a clickable artifact behind a small index page, labelled as a prototype. The timings from the school's traced runs are realistic to quote: a happy path holding after two build-loop passes in roughly ninety minutes, and a three-screen flow with variants inside a half day — provided the harness of tokens and components already exists.

Narration for this slide

Here is the whole session on one diagram. You write the brief — question, assumption, happy path, out-of-scope. The agent builds the happy path from your existing tokens and components, inside a marked prototype folder. Then the checkpoint: screenshot it at the width your review will use, walk it yourself, and either fix and loop or clear and re-brief. Once the flow holds, the agent adds states and the next screen, run by run. A blocker-only QA pass fixes dead ends and logs the polish without touching it. And the session ends with a clickable artifact behind an index page — labelled a prototype, because that is what it is. The checkpoints are the design work. Everything else is assembly.

Slide 5 of 12 · 16:9

Realistic data: shapes, edge lengths, and empty states

Lorem ipsum hides design problems. Realistic sample data is the cheapest honesty a prototype can have — and the agent can generate it for you.

  • Ask for sample data shaped like production: real field names, plausible values, realistic counts
  • Include the edge lengths: the longest customer name, the 47-item list, the zero-item list
  • Empty, loading, and error placeholders where they affect the flow — stubs are fine, absence is not
  • Keep sample data in its own file, clearly labelled as fake
  • Hard-code what the review will not exercise; realistic sample data beats a real backend

If the prototype only ever shows the medium-length, five-item, everything-loaded case, it is testing a layout, not a design.

Slide notes

Designers already know that lorem ipsum and three tidy cards hide problems; the difference with an agent is that realistic data stops being tedious to produce. Asking for a sample dataset shaped like production (the same field names, plausible values, realistic record counts) is a one-paragraph instruction, and the agent will generate names that overflow, dates in the right format, and prices with the right number of digits. Specify the shape rather than accepting whatever the agent invents, because invented data trends towards the convenient.

Edge lengths are the high-value request: the longest plausible name, the empty list, the list with dozens of items, the description that wraps to four lines. These are the cases that break layouts in production and that static mockups quietly omit. The prototype-first article's rule applies here: fake data is allowed, but label it clearly — keep it in its own file, named so that nobody later mistakes it for a real integration.

The boundary to hold is between realistic and real. The prototype does not need a backend, an API, or permissions; it needs data that behaves like the real thing for the duration of one walkthrough. Loading, empty, and error states only need to exist where they affect the flow being tested — a plain placeholder is enough, but their complete absence means the review will react to a fiction.
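
As a rough sketch of what a labelled sample-data file might contain, the example below pairs a typical data set with the three edge lengths named on the slide: the longest plausible name, the zero-item list, and the 47-item list. The `Template` shape, the field names, and every value are assumptions made up for illustration; a real prototype would copy the field names from the product's actual data.

```typescript
// sample-data.ts: FAKE data for the prototype only; nothing here is a real integration.
// The Template shape and every value below are illustrative assumptions.
interface Template {
  id: string;
  title: string;
  usedByTeams: number;
}

// Typical case: a handful of plausible records.
const typicalTemplates: Template[] = [
  { id: "t1", title: "Weekly status report", usedByTeams: 12 },
  { id: "t2", title: "Sprint retrospective", usedByTeams: 48 },
  { id: "t3", title: "Customer interview notes", usedByTeams: 7 },
];

// Edge: the longest plausible title, to expose overflow in cards.
const longTitleTemplate: Template = {
  id: "t-long",
  title: "Quarterly cross-functional planning and dependency review for regional operations teams",
  usedByTeams: 3,
};

// Edge: the zero-item list, so the empty state has to exist.
const emptyTemplates: Template[] = [];

// Edge: the 47-item list, generated rather than hand-written.
const manyTemplates: Template[] = Array.from({ length: 47 }, (_, i) => ({
  id: `t-many-${i + 1}`,
  title: `Template ${i + 1}`,
  usedByTeams: (i * 7) % 50,
}));

// Named scenarios let a review switch data sets without touching the screens.
const sampleScenarios = {
  typical: typicalTemplates,
  longTitle: [...typicalTemplates, longTitleTemplate],
  empty: emptyTemplates,
  many: manyTemplates,
};

console.log(Object.keys(sampleScenarios).join(", "));
```

Grouping the data into named scenarios is one possible design choice: it keeps the fake data in a single labelled place and makes it obvious at the checkpoint which edge case a screenshot was taken with.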

Narration for this slide

Now, data. Lorem ipsum and three tidy cards hide design problems — you know this. What changes with an agent is that realistic data stops being expensive. Ask for sample data shaped like production: real field names, plausible values, realistic counts. Then ask for the edges — the longest customer name, the list with forty-seven items, the list with none. Ask for empty, loading, and error placeholders where they affect the flow; a stub is fine, absence is not. Keep it all in a clearly labelled sample data file, because fake data is allowed but hidden fake data is how prototypes start lying. You do not need a backend. You need data honest enough to test against.

Slide 6 of 12 · 16:9

Interaction and navigation without a backend

A prototype needs to behave, not to be wired. Hard-coded behaviour that feels real is the point, not a shortcut to apologise for.

  • Wire navigation between screens with hard-coded data passing — no API, no database
  • Build the interactions the question depends on: typing that updates a value, dragging that redraws, states that respond
  • Hover, focus, and pressed states on anything a reviewer will touch — they are part of the design
  • Stub everything else with a plain, visibly unfinished placeholder
  • Verify in a real browser: open it, click through it, resize it — or have the agent screenshot each step

Code prototypes earn their keep on questions that depend on behaviour — the price updating as you type, the chart redrawing as you drag — which static mockups can only fake.

Slide notes

The reason to prototype in code at all, rather than in a clickable mockup, is behaviour: anything where the design question depends on real interaction — typing into a configurator and watching the price respond, dragging a date range and seeing a chart redraw, feeling how navigation behaves with real content depth. Those are the questions worth this module's effort, and they are exactly what design tools simulate worst.

The technique is to be explicit with the agent about the difference between behaving and being wired. Navigation between the prototype's screens should work, with state and data passed in the simplest hard-coded way that survives the walkthrough. The interactions the question depends on get built properly. Everything else gets a plain placeholder that looks deliberately unfinished — visible seams in places the review never touches are a feature, because they keep the prototype honest about what it is.

Interaction states are the detail designers care about and agents drop by default unless asked: hover, focus, and pressed states on anything a reviewer will actually touch. The Module 3 lesson repeats here — accessibility and states you have not explicitly asked for and checked will not be there. Verification stays visual: open the prototype in a browser, click through the whole flow, resize the window. If the project is set up with browser tooling from Module 2, the agent can drive the page, take screenshots of each step, and report what a participant would see — but the designer still walks it personally before anyone else does.
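
The difference between behaving and being wired can be sketched in a few lines. The example below is a minimal illustration under stated assumptions, not a prescribed structure: the `FlowScreen` and `FlowState` shapes, the screen ids, and the `goTo` and `findDeadEnds` helpers are all hypothetical names. It shows navigation driven by a hard-coded map, state passed along in a plain object, and a small check for the kind of dead-end click a blocker-only QA walk is meant to catch.

```typescript
// Hypothetical hard-coded navigation map for a three-screen flow.
// Screen ids and the FlowScreen shape are assumptions for illustration.
interface FlowScreen {
  id: string;
  next: string[];       // ids this screen can navigate to
  isEndState?: boolean; // the end of the happy path is allowed to have no way forward
}

const flow: FlowScreen[] = [
  { id: "role", next: ["template"] },
  { id: "template", next: ["dashboard", "role"] },
  { id: "dashboard", next: [], isEndState: true },
];

// State passed between screens in the simplest way that survives a walkthrough.
interface FlowState {
  current: string;
  selectedRole?: string;
}

// Moves to the target screen only if the current screen links to it.
function goTo(state: FlowState, target: string): FlowState {
  const here = flow.find((s) => s.id === state.current);
  if (!here || !here.next.includes(target)) return state; // ignore unwired clicks
  return { ...state, current: target };
}

// Blocker check: screens with no way forward that are not the end state,
// plus links that point at screens that do not exist.
function findDeadEnds(screens: FlowScreen[]): string[] {
  const ids = new Set(screens.map((s) => s.id));
  const problems: string[] = [];
  for (const s of screens) {
    if (s.next.length === 0 && !s.isEndState) problems.push(`${s.id}: no way forward`);
    for (const target of s.next) {
      if (!ids.has(target)) problems.push(`${s.id}: links to missing screen "${target}"`);
    }
  }
  return problems;
}

let flowState: FlowState = { current: "role", selectedRole: "designer" };
flowState = goTo(flowState, "template");
flowState = goTo(flowState, "dashboard");
console.log(flowState.current, findDeadEnds(flow)); // ends on the dashboard with no dead ends reported
```

A check like this does not replace walking the flow in a browser; it only catches the structural dead ends, not the ones a reviewer finds by clicking something that looks interactive and is not.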

Narration for this slide

What makes a code prototype worth building is behaviour. The price that updates as you type. The chart that redraws as you drag. Navigation that works with real content depth. So tell the agent exactly which behaviours the question depends on, and have it build those properly — and have it wire navigation between screens with hard-coded data, no backend, no API. Everything else gets a plain placeholder that is visibly unfinished, and that is fine; visible seams keep the prototype honest. Ask for hover and focus states on anything a reviewer will touch, because the agent will not add them unprompted. Then open it in a browser and click through every step yourself.

Slide 7 of 12 · 16:9

The prototype-to-production line

The same interface can be excellent evidence and a bad production base. What is honest to claim depends on which kind of artifact it is.

Artifact | Useful for | Not safe to claim
Sketch prototype | Layout, story, direction | Anything about real data or behaviour
Interactive prototype | Flow, timing, reaction to the riskiest assumption | That states, data, and access control are real
Code spike | Technical feasibility of one mechanism | That it fits the product's architecture
Production candidate | Shipping, after review | Nothing; it must pass real components, states, tests, accessibility, and review gates

Unsafe does not mean useless. A prototype can be excellent evidence and still be a bad production base — the failure is forgetting which assumptions were fake.

Slide notes

This table is the prototype readiness matrix from the school's prototype-first article, and it is the vocabulary to give stakeholders before they ever see a prototype. The four rows are different contracts: a sketch prototype tests layout and story; an interactive prototype tests flow and reaction, with states that may be fake; a code spike tests whether one mechanism is feasible at all, and may ignore the architecture entirely; a production candidate is the only row that claims to be shippable, and it earns that claim by using real components, real states, tests, accessibility checks, and review gates.

The practical use of the matrix is in the brief: tell the agent which kind of artifact it is building, because the goals differ. A prototype maximises learning; production code maximises maintainability, correctness, accessibility, and integration. Without that contract, the agent optimises for the wrong thing — usually for looking more finished than the work justifies.

The risks designers should be able to name out loud: fake data is fine for flow and unsafe for validating real behaviour; hard-coded state is fine for a walkthrough and unsafe as evidence about real users; local components built for speed become dangerous if they quietly duplicate the design system; skipped accessibility can be acceptable in an exploration but must block any promotion to production. Each of these is acceptable only when it is named before anyone makes a decision based on the prototype.

Narration for this slide

Here is the line that keeps everyone honest. The same interface can be four different things. A sketch prototype is good for layout and story. An interactive prototype is good for flow and for testing your riskiest assumption — but its states and data may be fake. A code spike proves one mechanism is feasible and says nothing about architecture. Only a production candidate claims to be shippable, and it has to earn that with real components, real states, tests, and accessibility. The point of the table is what is honest to claim. A prototype can be excellent evidence and still be a bad production base. The failure is not the fakeness — it is forgetting which parts were fake.

Slide 8 of 12 · 16:9

Reviewing with stakeholders: what to show, what to caveat

A prototype review should put the question in front of people, not the polish. The caveats are part of the presentation, not an apology at the end.

  • Open with the question and the riskiest assumption — what this prototype exists to find out
  • Walk the happy path live, or let stakeholders click through it themselves
  • Say plainly what is fake: the data, the states, the integrations, the screens that are stubs
  • Capture reactions against the assumption, not against the visual polish
  • Close with what happens next: what gets kept as a pattern, what gets rebuilt, what gets discarded
  • Never leave the room with anyone believing they saw a nearly finished product

The most persuasive move in a prototype review is letting a stakeholder drive it themselves — and the most dangerous, if nobody has said what is fake.

Slide notes

Prototypes built with an agent look better than prototypes used to look, and that raises the stakes of the review. The strongest pattern from the school's case studies is also the riskiest: handing the laptop across the table and letting stakeholders or clients walk the flow themselves, with their own categories and realistic data. It is far more persuasive than a slide of static screens — and precisely because it is persuasive, the caveats have to be delivered before the walkthrough, not after.

Structure the review around the learning goal. Open with the question and the assumption, so reactions land on the decision rather than the colour palette. State plainly what is fake: this data is sample data, these two screens are stubs, there is no real backend, accessibility has not been done. Capture what people say about the assumption — that is the evidence the prototype was built to produce.

Close with the promotion language from the next slide so expectations are set in the same meeting: which patterns the team intends to keep, what would be rebuilt properly, and what gets discarded. The failure mode to avoid is a stakeholder leaving the room believing they saw a nearly finished product; that single misunderstanding converts a half-day prototype into a delivery commitment nobody scoped.

Narration for this slide

Sharing a prototype is its own skill, because agent-built prototypes look better than prototypes used to look. So structure the review. Open with the question and the riskiest assumption — tell people what this thing exists to find out. Then let them walk the happy path, ideally hands on the keyboard themselves; nothing is more persuasive. But before that, say plainly what is fake: the data, the stubbed screens, the missing backend, the accessibility you have not done. Capture reactions against the assumption, not the polish. And close with what happens next — keep, rebuild, discard — so nobody leaves the room believing they saw a product that is nearly done.

Slide 9 of 12 · 16:9

Keeping prototypes from quietly becoming production

Prototype code becomes dangerous when it loses its boundary. Give it a place to live, a label, and a promotion gate it cannot skip.

  • Prototypes live in a clearly marked folder or route, never beside production files with similar names
  • The brief, learning goals, screenshots, and review notes stay next to the code
  • Promotion is a human review: keep the validated patterns, rebuild them in product architecture, discard the rest
  • The agent can prepare the evidence for promotion; it does not get to promote its own prototype

A prototype-safe corner of the project:
src/app/
├── onboarding/                  # production — agent told not to touch
└── prototypes/
    └── onboarding-v2/
        ├── page.tsx             # the prototype flow
        ├── sample-data.ts       # fake data, labelled as fake
        └── prototype-notes.md   # brief, learning goals, review notes

The cheapest guardrail is physical: a prototypes folder the team recognises, and a standing rule that nothing ships from it.

Slide notes

This is where temporary assumptions become permanent debt if nobody holds the line. The mechanism is mundane: prototype files sitting next to production files with similar names get treated as product architecture by the next person — or the next agent — to open the project. The fix is equally mundane: a physical boundary. A prototypes route or directory, named so its status is unmistakable, with the brief, the learning goals, the screenshots, and the review notes kept beside the code so future readers can see what was learned and what was never meant to ship. The brief should also tell the agent explicitly which production paths are off limits.
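
The physical boundary can also be checked mechanically. The sketch below is one hedged way to do that, not an established tool: the `SourceFileInfo` shape, the `findBoundaryViolations` helper, and the `/prototypes/` path marker are assumptions for illustration, and a real project would collect its import lists with its own tooling. It flags any file outside the prototypes folder that imports something from inside it.

```typescript
// Hypothetical boundary check: flags production files that import from the prototypes folder.
// The SourceFileInfo shape and the "/prototypes/" path convention are assumptions for illustration.
interface SourceFileInfo {
  path: string;
  imports: string[]; // resolved paths of the modules this file imports
}

const PROTOTYPE_MARKER = "/prototypes/";

function isPrototypePath(path: string): boolean {
  return path.includes(PROTOTYPE_MARKER);
}

// Returns one message per production file that reaches into a prototype.
function findBoundaryViolations(files: SourceFileInfo[]): string[] {
  const violations: string[] = [];
  for (const file of files) {
    if (isPrototypePath(file.path)) continue; // prototypes may reuse tokens and components
    for (const imported of file.imports) {
      if (isPrototypePath(imported)) violations.push(`${file.path} imports ${imported}`);
    }
  }
  return violations;
}

const projectFiles: SourceFileInfo[] = [
  {
    path: "src/app/onboarding/page.tsx",
    imports: ["src/app/prototypes/onboarding-v2/sample-data.ts"],
  },
  {
    path: "src/app/prototypes/onboarding-v2/page.tsx",
    imports: ["src/app/prototypes/onboarding-v2/sample-data.ts"],
  },
];

console.log(findBoundaryViolations(projectFiles));
```

In this example only the production onboarding page is reported, because it pulls fake data out of the prototype; the prototype importing its own sample data is fine.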

Promotion is the second guardrail. Before any prototype influences production work, run a promotion review: which decisions did the prototype validate, which parts are worth keeping as design patterns, which parts must be rebuilt properly inside the product's architecture, which states and accessibility work are missing, and which assumptions were fake. The agent is genuinely useful here — it can produce the component map, the missing-states list, and a rebuild plan — but the keep-or-discard decision belongs to the humans, because the agent has no incentive to recommend throwing away its own work.

Be honest that the throwaway rule is organisationally hard. After a good review, someone with authority has to say the prototype still gets archived. The moment prototype code starts shipping, every future prototype inherits production caution, and the half-day sprint stops being half a day.

Narration for this slide

Now the part teams get wrong: stopping the prototype from quietly becoming production. The first guardrail is physical. Prototypes live in their own clearly marked folder — a prototypes route, an obvious name — never next to production files where the next person, or the next agent, mistakes them for product architecture. Keep the brief, the learning goals, and the review notes right there with the code. The second guardrail is the promotion review: keep the validated patterns, rebuild them properly in the product's architecture, discard the rest. The agent can prepare that evidence, but it does not promote its own prototype. And someone senior has to be willing to archive a prototype that tested well. That is the whole discipline.

Slide 10 of 12 · 16:9

Worked example: a three-screen onboarding flow in one session

The onboarding case from the school's prototype material, run as three bounded builds with checkpoints between them.

Run | What was asked | What the checkpoint caught
Brief (~15 min) | Question: does role-first ordering make the next step clearer? Happy path: role → template choice → first dashboard. Out of scope: invites, auth, real data | A contradiction: the brief asked for both role-first and import-first; resolved before any build
Run 1: role selection | Build screen one in /prototypes/onboarding-v2 from existing components and tokens | Role cards used invented colours; pointed back at tokens, fixed in one round
Run 2: template choice | Screen two plus navigation from screen one; sample data with realistic template names and counts | Long template names overflowed the cards, visible only because the data had edge lengths
Run 3: dashboard + states | Screen three, plus empty and loading placeholders; locked decisions recorded in CLAUDE.md after run 1 | One dead-end click in the QA walk; polish issues logged, not fixed

Three briefs, three checkpoints, roughly half a day — and a clickable flow behind an index page that everyone in the review knew was a prototype.

Slide notes

This traces the onboarding scenario used in the school's prototype-first article through the session arc from the diagram slide, so the steps should feel familiar rather than new. The brief stage did real work before any code existed: writing the question and the happy path surfaced a contradiction — the team wanted to test role-first ordering but the draft brief also asked for an import-first entry — and resolving that on paper cost minutes rather than a rebuilt screen.

The three runs show the decomposition pattern in practice. Run one built only the role-selection screen, and its checkpoint caught the most common drift: invented colours instead of the project's tokens, fixed by pointing the agent back at the token file and then locking that decision into CLAUDE.md so runs two and three started from it. Run two added the second screen and the navigation between them, and the realistic sample data earned its keep immediately — long template names overflowed the cards, a problem a tidy mockup would never have shown. Run three added the dashboard and the empty and loading placeholders, and the blocker-only QA walk found one dead-end click that would have derailed a stakeholder review.

Keep the claims modest and bounded: this is one traced run of a small flow, with a harness of existing components already in place; roughly half a day is a realistic figure for that situation, not a promise. The pattern — bounded runs, checkpoints, locked decisions, honest labelling — is the part that transfers.

Narration for this slide

Let's trace a real-shaped example: a three-screen onboarding flow — role selection, template choice, first dashboard. The brief alone earned its keep: writing the happy path down exposed a contradiction in what the team wanted to test, fixed before any code existed. Run one built just the role screen; the checkpoint caught invented colours, we pointed the agent back at the tokens, and locked that decision into the instruction file. Run two added template choice and navigation — and the realistic data immediately showed long names overflowing the cards. Run three added the dashboard, empty and loading states, and the QA walk caught one dead-end click. Half a day. Three briefs, three checkpoints, one honest, clickable prototype.

Slide 11 of 12 · 16:9

Exercise: scope and run the first screen of your own prototype

Take a flow from your current work and run the first screen through the session arc. One screen only — the rest of the flow is next session's work.

  • Write the brief: the question, the riskiest assumption, the happy path, and the out-of-scope list
  • Tell the agent where the prototype lives and which production paths are off limits
  • Ask for realistic sample data with at least one edge case, in its own labelled file
  • Build the first screen, then run the checkpoint: screenshot, walk it, fix or re-brief
  • Record the decisions worth keeping in CLAUDE.md, and note what you would brief differently for screen two

Stop after one screen even if it goes well. The discipline of stopping at the boundary is the skill being practised.

Slide notes

The exercise deliberately covers the first run of the arc rather than a whole prototype, because the boundary is the skill: most participants can get an agent to produce more screens, and far fewer can stop at one, review it properly, and bank the decisions before continuing. Steer people towards a flow from their real work — the same project they set up in Module 2 and converted a section of in Module 3 — but if that project is sensitive or unavailable, a fictional product is fine; the brief and the checkpoint behave the same way.

Watch for the common stumbles. Briefs that name a feature but not a question produce style exercises — push for the decision the prototype should inform. Out-of-scope lists that are empty mean the participant has not really scoped. Sample data without an edge case will sail through the checkpoint and teach nothing. And the most frequent miss: skipping the CLAUDE.md step at the end, which is precisely the step that makes screen two cheaper than screen one.
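
For facilitators who like a checklist they can run, the stumbles above can be sketched as a small check. This is an illustrative sketch only: the `PrototypeBrief` and `SampleRecord` shapes and the `briefProblems` function are hypothetical, the course does not prescribe a brief format, and the question-mark test is a crude stand-in for a human reading the brief.

```typescript
// Hypothetical shapes for illustration; the course does not prescribe a format.
interface PrototypeBrief {
  question: string;
  riskiestAssumption: string;
  happyPath: string[];
  outOfScope: string[];
}

interface SampleRecord {
  label: string;
  isEdgeCase: boolean;
}

// Returns a list of human-readable problems. An empty list means the brief
// passes these basic checks; it does not mean the brief is good.
function briefProblems(brief: PrototypeBrief, samples: SampleRecord[]): string[] {
  const problems: string[] = [];

  // Crude heuristic: a brief that names a feature rarely ends in a question mark.
  if (brief.question.trim().slice(-1) !== "?") {
    problems.push("The brief names a feature, not a question to answer.");
  }
  if (brief.riskiestAssumption.trim() === "") {
    problems.push("No riskiest assumption is written down.");
  }
  if (brief.happyPath.length === 0) {
    problems.push("No happy path is written down.");
  }
  if (brief.outOfScope.length === 0) {
    problems.push("The out-of-scope list is empty, so nothing has been scoped out.");
  }
  if (!samples.some((sample) => sample.isEdgeCase)) {
    problems.push("The sample data has no edge case.");
  }
  return problems;
}
```

A check like this only catches the absence of the parts; whether the question is the right one still takes a conversation.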

If running this live, have participants swap screenshots and briefs in pairs and ask one question of each other: could you run screen two from this brief plus the locked decisions, without asking the author anything? That question tests whether the decomposition actually happened.

Narration for this slide

Your turn. Pick a flow from your real work and run only its first screen through the arc. Write the brief — the question, the riskiest assumption, the happy path, the out-of-scope list. Tell the agent where the prototype lives and what it must not touch. Ask for realistic sample data with at least one edge case, in its own labelled file. Build the screen, then do the checkpoint properly: screenshot it, walk it, fix it or re-brief. Finish by locking the decisions worth keeping into CLAUDE.md, and write one note about what you would brief differently for screen two. Then stop. Stopping at the boundary is the skill.

Slide 12 of 12 · 16:9

Summary, and the bridge to design systems

  • Scope to the question and the riskiest assumption; the happy path and the out-of-scope list are the brief
  • Decompose into bounded runs — one screen or flow per session, with decisions locked in CLAUDE.md between them
  • Realistic data, edge lengths, and the states that matter are cheap to ask for and expensive to skip
  • The readiness matrix sets what is honest to claim; the caveats are part of the stakeholder review, not an apology
  • Prototypes live in a marked folder, get promoted only through human review, and are archived without guilt

Module 5 turns to the system underneath all of this: building, documenting, and auditing the component library the prototypes were assembled from.

Slide notes

Recap by connecting the bullets into one arc: the scope defines the question, the decomposition keeps quality consistent across screens, the data and states keep the prototype honest about behaviour, the readiness matrix and review habits keep it honest with stakeholders, and the folder boundary plus promotion review keep it out of production. Each piece on its own is small; together they are why a designer can move at prototype speed without creating the confusing pile of half-production code that gives prototyping a bad name.

It is worth naming what made the speed possible in the worked example: the prototype was assembled from components and tokens that already existed. That dependency is the bridge to Module 5. When the system is healthy (components documented, tokens enforced, drift caught), every prototype gets a running start; when it is not, every prototype re-invents the parts and the drift compounds.

Module 5 points the agent at the system itself: building new components inside the system's rules, generating documentation in the same run rather than after it, and running audits that find token violations and drift with evidence. Participants who did this module's exercise should keep their prototype folder; the audit exercise in Module 5 can be pointed at the very components their prototype used.

Narration for this slide

Let's close. A prototype starts with a question and its riskiest assumption — that scope is the brief. You build it as bounded runs, one screen per session, locking decisions into the instruction file between them. Realistic data and real states keep it honest about behaviour; the readiness matrix and a well-run review keep it honest with stakeholders; and a marked folder plus a human promotion gate keep it out of production. Notice what made the speed possible, though: existing components and tokens to assemble from. That system is the highest-leverage place to point the agent next — building, documenting, and auditing the component library itself. That is Module 5. See you there.

Module transcript
Module 4, narrated slide by slide

Slide 1 · Prototype First, Production Later

Welcome to Module 4. In Module 3 you converted a single section from a mockup into working code. Now we scale that up to the thing you actually wanted all along: a real, clickable prototype — multiple screens, realistic data, interaction you can feel — built in one working session. The title of this module is also its rule: prototype first, production later. Agents make it cheap to build something that looks finished, and that is both the opportunity and the trap. So this module is half about building fast, and half about staying honest about what you built. Let's start with scoping.

Slide 2 · Scoping: the question, the flow, the states that matter

Scoping is where prototypes are won or lost, and agents make over-scoping easier, not harder. So before any prompt, write three things down. First, the question this prototype exists to answer — one sentence. Second, the riskiest assumption: the belief that, if wrong, changes the design. Third, the happy path: the entry point, three to six steps, and the end state a reviewer walks through to encounter that assumption. Then write the out-of-scope list — auth, settings, error handling, the real backend — because the agent will build whatever you do not exclude. The prototype is done when someone can hit the assumption and react. Not when it looks finished.

Slide 3 · Decomposing into runs: one screen or flow per session

Here is the working pattern that makes multi-screen prototypes hold together: one screen or one flow per session. The kitchen-sink prompt — build me the whole app — fails in a predictable way: the context fills up, and by screen three the agent has forgotten the design system it used on screen one. So decompose. Run one screen, review it, and when it is right, lock the decisions that need to persist — tokens, spacing, component choices — into the project's instruction file. Then clear the session and start the next screen from that file. And if a correction fails twice, stop arguing. Clear, and write a better brief with what you just learned.

Slide 4 · The one-session prototype arc

Here is the whole session on one diagram. You write the brief — question, assumption, happy path, out-of-scope. The agent builds the happy path from your existing tokens and components, inside a marked prototype folder. Then the checkpoint: screenshot it at the width your review will use, walk it yourself, and either fix and loop or clear and re-brief. Once the flow holds, the agent adds states and the next screen, run by run. A blocker-only QA pass fixes dead ends and logs the polish without touching it. And the session ends with a clickable artifact behind an index page — labelled a prototype, because that is what it is. The checkpoints are the design work. Everything else is assembly.

Slide 5 · Realistic data: shapes, edge lengths, and empty states

Now, data. Lorem ipsum and three tidy cards hide design problems — you know this. What changes with an agent is that realistic data stops being expensive. Ask for sample data shaped like production: real field names, plausible values, realistic counts. Then ask for the edges — the longest customer name, the list with forty-seven items, the list with none. Ask for empty, loading, and error placeholders where they affect the flow; a stub is fine, absence is not. Keep it all in a clearly labelled sample data file, because fake data is allowed but hidden fake data is how prototypes start lying. You do not need a backend. You need data honest enough to test against.

Slide 6 · Interaction and navigation without a backend

What makes a code prototype worth building is behaviour. The price that updates as you type. The chart that redraws as you drag. Navigation that works with real content depth. So tell the agent exactly which behaviours the question depends on, and have it build those properly — and have it wire navigation between screens with hard-coded data, no backend, no API. Everything else gets a plain placeholder that is visibly unfinished, and that is fine; visible seams keep the prototype honest. Ask for hover and focus states on anything a reviewer will touch, because the agent will not add them unprompted. Then open it in a browser and click through every step yourself.

Slide 7 · The prototype-to-production line

Here is the line that keeps everyone honest. The same interface can be four different things. A sketch prototype is good for layout and story. An interactive prototype is good for flow and for testing your riskiest assumption — but its states and data may be fake. A code spike proves one mechanism is feasible and says nothing about architecture. Only a production candidate claims to be shippable, and it has to earn that with real components, real states, tests, and accessibility. The point of the table is what is honest to claim. A prototype can be excellent evidence and still be a bad production base. The failure is not the fakeness — it is forgetting which parts were fake.

Slide 8 · Reviewing with stakeholders: what to show, what to caveat

Sharing a prototype is its own skill, because agent-built prototypes look better than prototypes used to look. So structure the review. Open with the question and the riskiest assumption — tell people what this thing exists to find out. Then let them walk the happy path, ideally hands on the keyboard themselves; nothing is more persuasive. But before that, say plainly what is fake: the data, the stubbed screens, the missing backend, the accessibility you have not done. Capture reactions against the assumption, not the polish. And close with what happens next — keep, rebuild, discard — so nobody leaves the room believing they saw a product that is nearly done.

Slide 9 · Keeping prototypes from quietly becoming production

Now the part teams get wrong: stopping the prototype from quietly becoming production. The first guardrail is physical. Prototypes live in their own clearly marked folder — a prototypes route, an obvious name — never next to production files where the next person, or the next agent, mistakes them for product architecture. Keep the brief, the learning goals, and the review notes right there with the code. The second guardrail is the promotion review: keep the validated patterns, rebuild them properly in the product's architecture, discard the rest. The agent can prepare that evidence, but it does not promote its own prototype. And someone senior has to be willing to archive a prototype that tested well. That is the whole discipline.

Slide 10 · Worked example: a three-screen onboarding flow in one session

Let's trace a real-shaped example: a three-screen onboarding flow — role selection, template choice, first dashboard. The brief alone earned its keep: writing the happy path down exposed a contradiction in what the team wanted to test, fixed before any code existed. Run one built just the role screen; the checkpoint caught invented colours, we pointed the agent back at the tokens, and locked that decision into the instruction file. Run two added template choice and navigation — and the realistic data immediately showed long names overflowing the cards. Run three added the dashboard, empty and loading states, and the QA walk caught one dead-end click. Half a day. Three briefs, three checkpoints, one honest, clickable prototype.

Slide 11 · Exercise: scope and run the first screen of your own prototype

Your turn. Pick a flow from your real work and run only its first screen through the arc. Write the brief — the question, the riskiest assumption, the happy path, the out-of-scope list. Tell the agent where the prototype lives and what it must not touch. Ask for realistic sample data with at least one edge case, in its own labelled file. Build the screen, then do the checkpoint properly: screenshot it, walk it, fix it or re-brief. Finish by locking the decisions worth keeping into CLAUDE.md, and write one note about what you would brief differently for screen two. Then stop. Stopping at the boundary is the skill.

Slide 12 · Summary, and the bridge to design systems

Let's close. A prototype starts with a question and its riskiest assumption — that scope is the brief. You build it as bounded runs, one screen per session, locking decisions into the instruction file between them. Realistic data and real states keep it honest about behaviour; the readiness matrix and a well-run review keep it honest with stakeholders; and a marked folder plus a human promotion gate keep it out of production. Notice what made the speed possible, though: existing components and tokens to assemble from. That system is the highest-leverage place to point the agent next — building, documenting, and auditing the component library itself. That is Module 5. See you there.