Agentic Design School
Module 6 of 6
45–55 minutes

Agentic Prototyping

Interactive Prototype Sprint and Handoff

Putting the course together in a timed sprint: from brief to a tested interactive prototype, then a handoff that records what was built, what was faked, what was learned, and what production would actually require.

Duration: 45–55 minutes

Slides: 13 slides with notes and narration

Learning objectives

  • Plan and run a prototype sprint as a sequence of bounded agent runs with gates.
  • Test the prototype with users or stakeholders and capture findings against the brief.
  • Write a handoff that distinguishes built, faked, and unknown.
Slide deck

Work through the module

Each slide is shown in its 16:9 frame, exactly as it appears in the video version. Open the notes under any slide for the longer explanation, and the narration if you prefer to read along.

Slide 1 of 13

Interactive Prototype Sprint and Handoff

Agentic Prototyping · Module 6 of 6

  • The sprint is the loop run at full speed
  • A day-by-day shape for one week, with gates
  • Testing the prototype against the riskiest assumption
  • The honest handoff: built, faked, unknown, production cost

Everything from the previous five modules — scoping, briefs, directions, parity, QA — runs once here, end to end, on a clock.

Slide notes

This is the capstone module, so the framing is integration rather than new technique. Modules 1 to 5 each taught one part of the loop in isolation: scoping fidelity, turning references into briefs, exploring directions, holding parity, and running visual QA. The sprint is all of those run in sequence under a deadline, with the deadline doing useful work — it forces the scope conversation that an open-ended prototype never has.

The second half of the module is the part teams skip most often: the handoff. A prototype that tests well and then disappears into a folder teaches nobody anything, and a prototype that tests well and gets quietly promoted to production is worse. The handoff is the artifact that makes the sprint worth more than the demo moment, and writing it honestly — built, faked, unknown — is a design skill in its own right.

Set the expectation that the worked example uses real timings from a traced sprint, not idealised ones, and that the exercise at the end asks participants to plan a sprint for one feature on one page. If they did the exercises in earlier modules, the brief and QA matrix they already wrote feed directly into that plan.

Narration for this slide

Welcome to the final module of Agentic Prototyping. Everything you have built so far — the fidelity decisions, the briefs, the direction exploration, the parity loops, the QA matrix — comes together here in a timed sprint. We will walk the shape of a one-week sprint day by day, look at where the gates sit, test the prototype against the assumption it exists to check, and then do the part most teams skip: the handoff. Not a victory lap, but an honest record of what was built, what was faked, what we learned, and what production would actually cost. Let's start with what a sprint really is.

Slide 2 of 13

The sprint is the loop run at full speed

Nothing in the sprint is new. It is the brief–build–critique–revise loop with a calendar attached and a session at the end.

  • A fixed end date: a usability session, a stakeholder review, or a demo
  • One riskiest assumption the prototype must let people encounter
  • Bounded agent runs between human gates, not one long unsupervised run
  • The output is evidence plus a handoff packet, not a product

The deadline is not a constraint on quality. It is the thing that forces every scope decision to be made out loud.

Slide notes

The temptation is to treat a sprint as a special event with its own machinery. Resist that. The sprint is the same designer-agent loop the course has used throughout, run faster because the brief, harness, and QA setup already exist. What the calendar adds is a forcing function: the session date is fixed, so every disagreement about scope has to be resolved by cutting scope rather than slipping the date.

The riskiest-assumption framing comes from the interactive prototype sprint workflow. The single most common failure is prototyping the whole feature; the fix is to write down the one belief that, if wrong, changes the design, and to build only the path that lets a participant encounter it. That sentence also gives the sprint its stop condition — done is when a participant can hit the assumption and react, not when the screens look finished.

The last bullet sets up the second half of the module. The sprint's deliverable is not the prototype; it is the answer to the question plus a handoff packet that records how the answer was obtained and what it does not cover. Teams that treat the demo as the deliverable end up with prototypes nobody can safely reuse and findings nobody wrote down.

Narration for this slide

First, what a sprint actually is. It is not a new process — it is the loop you already know, run at full speed because the harness, the brief format, and the QA setup are already in place. What changes is the calendar. There is a fixed date: a usability session, a stakeholder review, a demo. There is one riskiest assumption the prototype must let people encounter. The agent runs are bounded, with human gates between them, exactly as before. And the output is not a product. It is evidence — plus a handoff packet that records how you got it. The deadline is doing real work here: it forces every scope decision to be made out loud.

Slide 3 of 13

The sprint plan: scope, runs, gates, and the demo moment

Plan the sprint backwards from the session, in four decisions made before any agent runs.

  • Scope: the riskiest assumption, the happy path through it, and an explicit out-of-scope list
  • Runs: which bounded agent runs happen, in what order — build loop, variant fan-out, QA sweep
  • Gates: where a human reviews before the next run starts — plan review, mid-sprint critique, pre-session check
  • The demo moment: who sees it, on what device, against which question

If the demo moment is not defined on day one, the sprint optimises for looking finished instead of answering the question.

Slide notes

Planning backwards from the session is the discipline that makes the rest of the week calm. The session decides the device width the QA sweep targets, the data the prototype needs to feel real, and the variants worth building. A prototype tested on a 1280-pixel laptop in a moderated session needs different QA from one sent unmoderated to phones.

The runs-and-gates structure mirrors the loop from earlier modules but at sprint scale. The build loop is one bounded run: build the happy path, capture, compare against the brief, fix, repeat until it holds. The variant fan-out is a second run that only starts once the base is stable, and each variant changes only what its question requires. The QA sweep is a third run, blocker-only, at the session's device width. Between each run sits a human gate, and the mid-sprint critique gets its own slide because it is the gate teams most often skip.

The out-of-scope list deserves emphasis: authentication, settings, error handling beyond what the flow needs, responsive breakpoints the session will never use, real backends. Writing them down is what gives the agent permission to stub them with plain placeholders instead of inventing production machinery, and what gives the team permission not to feel guilty about the gaps — they are recorded, and they reappear in the handoff as known gaps.
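
The four decisions above can be written down as a small data structure, so that a missing decision is visible before any agent runs. This is a minimal sketch, not part of the source workflow: the class names, field names, and the `missing_decisions` helper are all hypothetical, and the example values are paraphrased from the pricing-configurator case used later in the module.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DemoMoment:
    audience: str   # who sees the prototype
    device: str     # the device or width the session uses
    question: str   # the question the session must answer


@dataclass
class SprintPlan:
    # Hypothetical structure: one field per planning decision.
    riskiest_assumption: str
    happy_path: List[str]
    out_of_scope: List[str]
    runs: List[str]
    gates: List[str]
    demo_moment: Optional[DemoMoment] = None

    def missing_decisions(self) -> List[str]:
        """Name each of the four decisions that is still undefined."""
        missing = []
        if not (self.riskiest_assumption and self.happy_path and self.out_of_scope):
            missing.append("scope")
        if not self.runs:
            missing.append("runs")
        if not self.gates:
            missing.append("gates")
        if self.demo_moment is None:
            missing.append("demo moment")
        return missing


plan = SprintPlan(
    riskiest_assumption="Customers understand usage-based pricing when the price responds to their inputs",
    happy_path=["choose plan", "move sliders", "read live estimate"],
    out_of_scope=["authentication", "settings", "real backend"],
    runs=["build loop", "variant fan-out", "QA sweep"],
    gates=["plan review", "mid-sprint critique", "pre-session check"],
)
print(plan.missing_decisions())  # ['demo moment']
```

A plan that still reports "demo moment" on day one is the case the slide warns about: the sprint has nothing to optimise for except looking finished.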

Narration for this slide

The sprint plan is four decisions, made before any agent runs, and made backwards from the session. Scope: the riskiest assumption, the happy path through it, and a written out-of-scope list — auth, settings, error handling, anything the session will not touch. Runs: the bounded agent runs in order — the build loop, the variant fan-out, the QA sweep. Gates: where you review before the next run starts. And the demo moment: who sees the prototype, on what device, against which question. That last one matters more than it sounds. If you do not define the demo moment on day one, the sprint quietly optimises for looking finished instead of answering the question.

Slide 4 of 13

One week, ending in a packet

The sprint ends in a handoff packet, not a demo. Each day's output feeds the next, and Friday's output is what survives.

Timeline diagram of a one-week prototype sprint. Five day cards run left to right: Monday brief and harness check, Tuesday agent-run happy-path build loop, Wednesday mid-sprint critique gate followed by variant fan-out, Thursday blocker-only visual QA and testing sessions, Friday demo and packet assembly. An arrow drops from Friday into a handoff packet row of four cards — the prototype, the spec and findings, the tokens used, and the known gaps — each labelled with what engineering does with it: walks the prototype as reference behaviour without copying code, builds from the spec with the criteria as a QA checklist, maps tokens to production components and decides unmapped values, and sizes the production work from the gaps.
Monday is human-led, Tuesday is agent-run, Wednesday opens with the critique gate, Thursday pairs agent-run QA with human-led sessions, and Friday produces the packet: prototype, spec and findings, tokens used, and known gaps, each with a defined use on the engineering side.

The week is shaped so the expensive human attention lands at the gates, and the agent absorbs the production hours in between.

Slide notes

Walk the top row first. Monday is human-led: write or finalise the brief, name the riskiest assumption, confirm the harness runs end to end with one command, and fix the demo and testing dates. Tuesday is the agent's day: the bounded build loop — build the happy path from the harness, capture screenshots, compare against the brief, fix, repeat. Wednesday opens with the mid-sprint critique gate, then fans out two or three variants from the working base. Thursday is the blocker-only QA sweep at the session's device width, followed by the testing sessions themselves. Friday is the demo and the packet.

Then walk the drop into the bottom row, because this is the module's argument in one picture: the packet is what survives the sprint. Four things go in it. The prototype itself, behind an index page, labelled by the question each variant answers — engineering walks it as reference behaviour and never copies the code. The spec and findings — states, edge cases, acceptance criteria, and what testing showed — engineering builds from the spec and turns the criteria into the QA checklist. The tokens and components used, with unmapped values flagged — engineering maps them to production components and makes a named decision about every unmapped value. And the known gaps — faked data, skipped states, open questions with owners — which is what lets engineering size the production work without inheriting silent guesses.

Be clear that the five-day shape is illustrative, not sacred. The build loop in the source workflow takes a half day once a harness exists; a week is what it becomes when testing sessions, stakeholder calendars, and the handoff are included. Compressing it to two or three days changes the dates, not the sequence.

Narration for this slide

Here is the whole module in one picture. Monday is yours: brief, riskiest assumption, harness check, dates locked. Tuesday belongs to the agent: the build loop — build, capture, compare, fix — until the happy path holds. Wednesday opens with the critique gate, then fans out two or three variants that each change one thing. Thursday is a blocker-only QA pass and then the testing sessions. Friday is the demo and the packet. And the packet is the point. Four things go in it: the prototype, the spec and findings, the tokens used, and the known gaps — each with a defined job on the engineering side. The prototype is reference behaviour, never copied code. The spec is what they build from. The tokens map to production components. The gaps size the real work.

Slide 5 of 13

Day by day: who does what

The same week as a table: each day has an owner, an output, and a gate that decides whether the next day starts.

| Day | Owner | Output | Gate before the next day |
|---|---|---|---|
| Monday | Designer | Brief, riskiest assumption, harness verified, dates fixed | Plan review: does the plan answer the question? |
| Tuesday | Agent (build loop) | Happy path working in the harness, screenshots vs brief | Does the flow hold end to end? |
| Wednesday | Designer, then agent | Mid-sprint critique notes; 2–3 variants from the base | Does each variant change exactly one thing? |
| Thursday | Agent, then designer | Blocker-only QA per variant; session findings | Did participants encounter the assumption? |
| Friday | Designer | Demo, handoff packet, archive decision | Built, faked, and unknown all written down? |

Every gate is a question with a yes-or-no answer. If the answer is no, the fix is cutting scope, not slipping the session.

Slide notes

The table makes the ownership pattern visible: human, agent, human-then-agent, agent-then-human, human. The expensive human attention lands on Monday, Wednesday morning, Thursday afternoon, and Friday — roughly a day and a half of designer time across the week. The agent absorbs the production hours in between, which is the entire economic argument for the workflow.

The gate column is worth reading aloud, because each gate is a question with a binary answer and a defined remedy. If Tuesday's flow does not hold, Wednesday morning's critique becomes a scope-cutting session rather than a variant-planning one. If a variant changes more than one thing, it gets pulled back to the base rather than polished — variants that differ in everything teach nothing. If participants did not encounter the assumption on Thursday, that is itself a finding worth recording: the flow buried the thing the team needed to learn about.

Flag the honest caveat about timing. The half-day estimate in the underlying workflow assumes a harness already exists — tokens wired in, a usable subset of components, realistic sample data, a one-command dev server. If there is no harness, building a minimal one is the real output of the first sprint, and this table describes the second sprint onwards.
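
The gate column can be read as a simple lookup: each gate is a yes-or-no question, and a "no" maps to a defined remedy. The sketch below is an illustration under assumptions. The questions and the Tuesday, Wednesday, and Thursday remedies are taken from the table and notes above; the `GATES` list, the `first_failed_gate` function, and the choice to treat an unanswered gate as "no" are hypothetical.

```python
from typing import Dict, Optional, Tuple

# The slide's general rule when a gate answers "no".
DEFAULT_REMEDY = "cut scope; the session date does not move"

GATES = [
    # (day, yes-or-no question, remedy when the answer is no)
    ("Monday", "Does the plan answer the question?", DEFAULT_REMEDY),
    ("Tuesday", "Does the flow hold end to end?",
     "turn Wednesday's critique into a scope-cutting session"),
    ("Wednesday", "Does each variant change exactly one thing?",
     "pull the variant back to the base"),
    ("Thursday", "Did participants encounter the assumption?",
     "record that the flow buried the assumption as a finding"),
    ("Friday", "Built, faked, and unknown all written down?", DEFAULT_REMEDY),
]


def first_failed_gate(answers: Dict[str, bool]) -> Optional[Tuple[str, str, str]]:
    """Return the first gate whose answer is no (or missing), else None."""
    for day, question, remedy in GATES:
        # Assumption: a gate nobody has answered yet counts as not passed.
        if not answers.get(day, False):
            return day, question, remedy
    return None


result = first_failed_gate({"Monday": True, "Tuesday": False})
print(result[0], "->", result[2])
# Tuesday -> turn Wednesday's critique into a scope-cutting session
```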

Narration for this slide

Here is the same week as a table, and the pattern to notice is the owner column: human, agent, human then agent, agent then human, human. Your attention lands at the gates — about a day and a half across the week — and the agent absorbs the build hours in between. Each gate is a yes-or-no question. Does the plan answer the question? Does the flow hold? Does each variant change exactly one thing? Did participants actually hit the assumption? Is built, faked, and unknown all written down? When the answer is no, the remedy is always the same: cut scope. The session date does not move. And one caveat — this rhythm assumes the harness already exists. If it does not, building it is your first sprint's real output.

Slide 6 of 13

Mid-sprint critique: catching drift while it is cheap

Wednesday morning, before variants exist, is the last point where a correction costs hours instead of the session.

  • Walk the happy path against the brief, not against taste
  • Check the assumption is actually encounterable, not implied by a static screen
  • Look for invented scope: features, states, or polish nobody asked for
  • Decide cuts now: anything broken or missing gets cut or stubbed, the date holds
  • Recurring corrections go into the harness or the builder agent, not the conversation

The mid-sprint critique exists to protect Thursday's session, not to make the prototype better.

Slide notes

The mid-sprint critique is the gate teams skip most often, usually with the reasoning that the build is going well and stopping would slow it down. The cost of skipping it shows up Thursday morning, when the QA pass discovers the prototype answers a slightly different question from the brief and there is no time left to fix that.

The critique has a narrow job: protect the session. That means walking the happy path as a participant would, on the device the session will use, and checking three things. First, fidelity to the brief — the criteria from Module 1's fidelity decisions, not general polish. Second, that the riskiest assumption is genuinely encounterable: a participant can do the thing and react, rather than look at a static representation of it. Third, invented scope — agents drift towards completeness, and a half-built settings page or an uninvited error-handling system is scope that costs QA time without buying evidence.

The last bullet ties back to the harness discipline from earlier in the course. If the critique keeps producing the same correction — wrong empty-state pattern, off-system spacing, a component the agent keeps rebuilding instead of importing — that correction belongs in the harness or the builder agent definition, so the next sprint does not pay for it again. The critique notes themselves go into the packet later; they are early evidence of what was deliberately left rough.
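
One way to spot a recurring correction is to count how often the same note appears across critique sessions. This is a minimal sketch with hypothetical names: the `corrections_for_harness` function, the threshold of two (borrowed from the narration's "correcting twice"), and the sample notes are illustrative, and it assumes corrections are recorded as short, consistently worded strings.

```python
from collections import Counter
from typing import Iterable, List


def corrections_for_harness(critique_notes: Iterable[str], threshold: int = 2) -> List[str]:
    """Return corrections seen at least `threshold` times, normalised and sorted."""
    # Normalise case and whitespace so the same correction matches itself.
    counts = Counter(note.strip().lower() for note in critique_notes)
    return sorted(note for note, seen in counts.items() if seen >= threshold)


notes = [
    "Wrong empty-state pattern",
    "Off-system spacing",
    "wrong empty-state pattern ",
]
print(corrections_for_harness(notes))  # ['wrong empty-state pattern']
```

Anything the function returns is a candidate for the harness or the builder agent definition; a correction seen once stays in the critique notes.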

Narration for this slide

Wednesday morning is the gate that earns its keep. The build went well on Tuesday, the temptation is to keep going — and the cost of skipping the critique arrives on Thursday, when there is no time left to act on it. The critique's job is narrow: protect the session. Walk the happy path the way a participant will, on the device they will use. Check it against the brief, not against taste. Make sure the riskiest assumption is actually encounterable — something people do, not something they look at. Hunt for invented scope, because agents drift towards completeness you did not ask for. Then make the cuts. The date holds; the scope moves. And anything you find yourself correcting twice goes into the harness, not into the chat.

Slide 7 of 13

Testing the prototype: tasks, observations, findings

The session tests the assumption, not the prototype. Findings are written against the brief, with evidence attached.

  • Tasks come from the happy path: the participant tries to do the thing, unprompted where possible
  • Observations are concrete: what they did, said, hesitated on — not whether they liked it
  • Findings answer the riskiest assumption first; everything else is secondary
  • Capture evidence as you go: recordings, completion notes, the moment of reaction
  • Polish complaints about known-faked parts are logged, not treated as findings

A finding is a sentence about the assumption, with evidence. A reaction to a rough edge you already knew about is not a finding.

Slide notes

The session design is mostly standard usability practice and this slide does not pretend to teach moderation. What is specific to agentic prototyping is the relationship between the brief and the findings. Because the brief named the riskiest assumption, the session has a primary question, and the findings document leads with the answer to it — supported by what participants did, not by a vote on whether they liked the prototype.

The pricing-configurator case from the source workflow is a useful concrete shape: six customer interviews, the assumption being that customers would understand usage-based pricing if they could see the price respond to their own inputs. Five of six completed the flow unprompted, and the breakdown-by-default variant produced noticeably fewer trust questions. That is a finding: a sentence about the assumption, with evidence, that settled an argument the team had been having in the abstract.

The last bullet protects the team from a predictable trap. The prototype has known-faked parts — that is the fidelity contract from Module 1 — and participants will sometimes react to them. A complaint about placeholder copy in a stubbed settings page is not a finding; it is noise from a part of the prototype that exists only to keep the flow walkable. Log it, because it occasionally turns out to matter, but do not let it dilute the readout. Stakeholder reviews follow the same rules as user sessions here: the demo walks the question, not the feature.
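
The split between findings and logged noise can be sketched as a triage step over session observations. Everything here is illustrative: the `Observation` record, the `triage` and `completion_summary` functions, and the two sample observations are hypothetical. Only the five-of-six completion count comes from the pricing-configurator case.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass
class Observation:
    participant: str
    part: str   # which part of the prototype the reaction was about
    note: str   # what the participant did, said, or hesitated on


def triage(observations: List[Observation],
           known_faked: Set[str]) -> Tuple[List[Observation], List[Observation]]:
    """Separate candidate findings from reactions to known-faked parts."""
    candidates = [o for o in observations if o.part not in known_faked]
    logged = [o for o in observations if o.part in known_faked]
    return candidates, logged


def completion_summary(completed_unprompted: List[bool]) -> str:
    """Summarise unprompted completion as a plain sentence."""
    done = sum(completed_unprompted)
    return f"{done} of {len(completed_unprompted)} completed the flow unprompted"


observations = [
    Observation("P1", "live estimate", "asked how the total was calculated"),
    Observation("P2", "settings stub", "complained about placeholder copy"),
]
candidates, logged = triage(observations, known_faked={"settings stub"})
print(len(candidates), len(logged))                  # 1 1
print(completion_summary([True] * 5 + [False]))      # 5 of 6 completed the flow unprompted
```

The logged list is kept rather than discarded, matching the advice above: a reaction to a faked part occasionally turns out to matter, but it does not lead the readout.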

Narration for this slide

Thursday afternoon, the prototype meets people. The thing being tested is the assumption, not the prototype. Tasks come straight from the happy path — the participant tries to do the thing, unprompted where you can manage it. Observations are concrete: what they did, where they hesitated, what they said at the moment the assumption showed up. Findings lead with the answer to the riskiest assumption, with evidence attached — like the pricing sprint where five of six customers completed the flow and the breakdown-by-default variant drew fewer trust questions. That sentence settled the argument. And one discipline: when someone reacts to a part you knowingly faked, log it, but do not call it a finding. You already knew that edge was rough.

Slide 8 of 13

The honest handoff: built, faked, unknown, production cost

The handoff is one document with five sections. Its job is to make sure nobody downstream inherits a guess without knowing it.

  • Built: what genuinely works — flows, states, components composed from the system, at which widths
  • Faked: hard-coded data, stubbed screens, skipped permissions, simulated latency, placeholder copy
  • Unknown: what the sprint could not test — segments not in the room, scale, feasibility, the data model
  • Learned: the findings against the assumption, with the evidence linked
  • Production cost: an honest sketch of what building it properly requires, including what gets rebuilt rather than reused

The handoff is judged by one test: could someone scope the production work from it without talking to you, and without inheriting a single silent guess?

Slide notes

This is the slide the module is named for. The four-way split — built, faked, unknown, plus what was learned — exists because each category fails differently when it is left implicit. Unrecorded faked parts become production bugs: the hard-coded date range, the permissions that were never checked, the three-item list that never met forty items. Unrecorded unknowns become unfounded confidence: the sprint proved participants could complete the flow with sample data in a moderated session, and nothing more — not feasibility, not performance, not the segments who were not in the room. And unrecorded learnings evaporate; the prototype gets archived and three months later someone re-litigates the decision the sprint already settled.

Production cost is the section designers are most tempted to soften, and the prototype-first article is blunt about why it must not be: a polished prototype creates false confidence, and the promotion decision belongs to a review, not to momentum. The honest sketch usually says that the patterns are validated, the code is not the base, the real work involves real data, real states, accessibility, and integration — and names which existing components the production build should reuse instead of the prototype's local stand-ins.

Date-stamp the document and name the disposal decision in it: this prototype is archived after the readout, and no code is promoted without a promotion review. That sentence, written down with an owner, is what keeps the next sprint cheap — the moment prototype code starts shipping, every future prototype inherits production caution.
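
A handoff can be checked mechanically for sections that were left empty, since an empty section is where a silent guess hides. The sketch below is one possible shape, not the course's template: the `Handoff` class, its field names, the `silent_gaps` helper, and the date are hypothetical, and the sample entries are paraphrased from the pricing-configurator example.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class Handoff:
    date: str                                         # date stamp for the document
    built: List[str] = field(default_factory=list)
    faked: List[str] = field(default_factory=list)
    unknown: List[str] = field(default_factory=list)
    learned: List[str] = field(default_factory=list)
    production_cost: str = ""
    disposal_decision: str = ""
    disposal_owner: str = ""

    def silent_gaps(self) -> List[str]:
        """Return the names of sections that are still empty."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


handoff = Handoff(
    date="2025-01-10",  # hypothetical date
    built=["plan picker, sliders, live estimate with breakdown at 1280px"],
    faked=["static pricing tiers", "no authentication", "no tax handling"],
    unknown=["feasibility", "performance", "segments not in the room"],
    learned=["five of six customers completed the flow unprompted"],
    disposal_decision="archived after the readout; no code promoted without review",
)
print(handoff.silent_gaps())  # ['production_cost', 'disposal_owner']
```

A non-empty result does not mean the handoff is wrong, only that it is not finished: here the production-cost sketch and the owner of the disposal decision are still missing.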

Narration for this slide

Now the handoff itself: one document, five honest sections. Built: what genuinely works, and at which widths. Faked: every hard-coded value, stubbed screen, and skipped permission, written down so it cannot become a production bug by accident. Unknown: what the sprint could not test, such as scale, feasibility, and the people who were not in the room. Learned: the findings, with evidence linked. And production cost: an honest sketch of what building it properly requires, including the parts that get rebuilt rather than reused. The test for the whole document is simple. Could someone scope the production work from it without talking to you, and without inheriting a single silent guess? If yes, the handoff is done. And write the disposal decision into it: archived after the readout, nothing promoted without review.

Slide 9 of 13

What engineering actually needs from a prototype handoff

Each part of the packet has a specific job on the engineering side. None of those jobs is 'merge the prototype'.

| Packet item | What engineering does with it |
|---|---|
| The prototype, behind an index page | Walks it as reference behaviour: timing, interaction feel, content depth; never copies the code |
| Spec: states, edge cases, acceptance criteria | Builds from it; the criteria become the QA checklist for the production PRs |
| Tokens and components used | Maps them to production components; every unmapped value gets a named decision |
| Known gaps and faked parts | Sizes the real work; nothing faked in the prototype is assumed to exist |
| Open questions with owners | Gets answers before the build, in one session, instead of guessing screen by screen |

The prototype is the question answered. The spec is what gets built. Confusing the two is how prototype code ends up in production.

Slide notes

This table is where the course's prototype thread meets the handoff-and-spec workflow. The instinct to hand engineering the prototype repository and call it a handoff is strong, because the prototype looks like most of the work. It is not. The prototype encodes behaviour and intent; the spec encodes what to build. Engineering needs both, for different reasons, and the packet keeps them clearly separated.

Walk the rows. The prototype is reference behaviour — the thing a developer opens to feel how the interaction should respond, what the content depth looks like, how the variant that won actually behaves. The spec carries what a developer would otherwise guess: states for every interactive element, edge cases, token mapping, and acceptance criteria written as testable statements. In the source workflow, generating that spec is itself an agent fan-out — one agent per screen, a consolidation pass, then a developer-perspective review that reads the packet as if it had to build it with no access to the designer. That adversarial pass is what turns a packet that looks complete into one that is.

The open-questions list deserves a defence, because it looks like a list of failures. It is the most valuable page in the packet: every entry is a guess that someone would otherwise have made silently, now turned into a decision with an owner and a date. Budget the answer session — typically an hour or two — as part of the sprint, not as follow-up. The handoff also does not replace the relationship: keep one walkthrough call in the plan. The packet makes that call short; it does not make it unnecessary.
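
The tokens row in the table above amounts to a set difference: whatever the prototype used that has no production counterpart needs a named decision. The following is a minimal sketch under assumptions. The `unmapped_values` function and every token and component name in it are invented for illustration and do not come from any real design system.

```python
from typing import Dict, List


def unmapped_values(tokens_used: Dict[str, str],
                    production_map: Dict[str, str]) -> List[str]:
    """List tokens the prototype used that have no production mapping yet."""
    return sorted(name for name in tokens_used if name not in production_map)


# Hypothetical values used by the prototype.
tokens_used = {
    "color.accent": "#3355ff",
    "space.md": "16px",
    "slider.track-height": "6px",
}
# Hypothetical mapping to production components or tokens.
production_map = {
    "color.accent": "Button/primary",
    "space.md": "spacing-4",
}
print(unmapped_values(tokens_used, production_map))  # ['slider.track-height']
```

Each name in the output becomes a line in the packet with a decision and an owner, rather than a value a developer quietly copies.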

Narration for this slide

So what does engineering actually need? Not the prototype repository. Each item in the packet has a specific job. The prototype is reference behaviour — they walk it to feel the interaction, and they never copy the code. The spec is what they build from, and its acceptance criteria become the QA checklist. The token and component list maps to production components, and every value that bypassed the system gets a named decision instead of a quiet guess. The known gaps size the real work — nothing faked in the prototype is assumed to exist. And the open questions go to the designer in one answer session before the build, instead of being guessed screen by screen. The prototype is the question answered. The spec is what gets built. Keep those two things separate and prototype code never leaks into production.

Slide 10 of 13

Worked example: a sprint retrospective with real timings

A pricing-configurator sprint, run the week of six customer interviews. Timings are from the traced run, not a target.

| Stage | Time | What happened |
|---|---|---|
| Brief and harness check | ~1.5 hours | Question, riskiest assumption, happy path, out-of-scope list; harness ran with one command |
| Happy-path build loop | ~90 minutes, 2 passes | Plan, sliders, live estimate with breakdown, built from harness components and sample data |
| Critique and variant fan-out | ~2 hours | One scope cut (annual-billing toggle stubbed); two variants: breakdown by default vs on demand |
| QA and sessions | 1 fix + 6 interviews | Slider label overflow caught at 1280px; five of six customers completed unprompted |
| Handoff packet | ~2 hours | Findings, faked list (static tiers, no auth, no tax), spec for the winning variant, archive decision |

Roughly a day of designer attention and an afternoon of agent build time bought an answer the team had argued about for a month.

Slide notes

The example is the pricing-configurator case from the interactive prototype sprint workflow, stretched across the week so the testing sessions and the handoff are visible rather than compressed into the build afternoon. Say clearly that the timings are from one traced run with a harness already in place — they are evidence the shape works, not a benchmark to hold a team to.

The details worth dwelling on: the build loop took two passes and about ninety minutes because the harness supplied tokens, components, and sample data, so the agent assembled rather than invented. The critique cut one piece of invented scope — an annual-billing toggle the brief never asked for — and that cut is exactly what the Wednesday gate exists to make. The QA pass caught a slider label overflowing at the 1280-pixel width the interviews would use; it would have been the first thing every participant saw. The sessions answered the assumption: five of six customers completed the flow unprompted, and the breakdown-by-default variant drew noticeably fewer trust questions, which settled a month of abstract argument.

The handoff is the part the original case study barely mentions and this module insists on. The faked list named the static pricing tiers, the absent authentication, and the missing tax handling. The spec covered the winning variant's states and acceptance criteria. The archive decision was written down, with an owner — and the prototype was archived the following week, unpromoted, exactly as the brief said it would be. The production build started from the spec and the design system, not from the prototype code.
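
For a sense of scale, the stages that carry a duration in the table can be totalled. This is only arithmetic on the approximate figures above: the QA and sessions row gives no duration and is left out, and the table does not say how each stage splits between designer and agent, so the sketch does not attempt that split. The variable names are hypothetical.

```python
# Approximate durations from the timings table, in minutes.
# "QA and sessions" is omitted because the table gives no duration for it.
timed_stages_minutes = {
    "Brief and harness check": 90,
    "Happy-path build loop": 90,
    "Critique and variant fan-out": 120,
    "Handoff packet": 120,
}

total_minutes = sum(timed_stages_minutes.values())
print(f"{total_minutes / 60:.1f} hours across the timed stages")
# 7.0 hours across the timed stages
```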

Narration for this slide

Let's trace a real one. A SaaS team, six customer interviews booked, and a month-old argument about whether customers would understand usage-based pricing if they could see the price respond to their own inputs. The brief and harness check took about an hour and a half. The happy-path build loop: two passes, roughly ninety minutes, because the harness supplied the parts. The critique cut one piece of invented scope and fanned out two variants — breakdown shown by default, or revealed on demand. QA caught a label overflowing at exactly the laptop width the interviews used. Five of six customers completed the flow unprompted, and the default-breakdown variant drew fewer trust questions. The handoff took two hours: findings, the faked list, a spec for the winning variant, and a written archive decision. About a day of designer attention, an afternoon of agent time — and a month-long argument settled.

Slide 11 of 13

What the sprint cannot prove

A good sprint earns confidence about one assumption, in one session, with sample data. Claiming more than that undoes the honesty of the handoff.

  • It cannot validate feasibility, performance, or the data model behind the interaction
  • It cannot generalise beyond the participants and tasks that were in the room
  • It cannot make the keep-or-throw-away decision; a person owns the disposal rule
  • It cannot replace production design: the prototype answers a question, the system ships the answer

The moment prototype code starts shipping, the next sprint inherits production caution — and stops being cheap.

Slide notes

This slide is the limits section, and it matters because the sprint's biggest risk is its own success. A prototype that demos beautifully and tests well generates momentum, and momentum is exactly the force that turns throwaway evidence into unreviewed production code. The prototype-first article calls this out directly: a polished prototype creates false confidence, and the promotion decision is a design and engineering decision made in a review, not something the prototype earns by looking finished.

The four limits are worth stating without hedging. Feasibility, performance, and the data model were never tested — the data was sample-shaped, the backend did not exist, and the latency was whatever the dev server happened to do. The findings generalise only to the participants and tasks in the session; the segments who were not in the room remain unknowns, and they are listed as such in the handoff. The disposal rule is a human decision with authority behind it, made after a good session, when keeping the code is most tempting. And production design still happens: the patterns that survived review get rebuilt in the product's architecture, with real states, accessibility, and integration — the promotion review from the prototype-first article is the gate that protects that.

The economic argument for the disposal rule is the one that lands with sceptics: the sprint is cheap precisely because the prototype is disposable. The first time prototype code ships, every future sprint slows down to production caution, and the half-day build becomes a week.

Narration for this slide

Before the exercise, the limits — because the sprint's biggest risk is its own success. A good sprint proves one thing: that participants could complete the flow and how they reacted to the assumption, in a session, with sample data. It does not prove feasibility, performance, or that the data model holds. It does not generalise to the people who were not in the room. It does not make the keep-or-throw-away decision — a person with authority owns that, and they have to make it right after a good session, when keeping the code feels most reasonable. And it does not replace production design. Here is the economic version of the argument: the sprint is cheap because the prototype is disposable. The first time prototype code ships, every sprint after it inherits production caution — and stops being cheap.

Slide 12 of 13 · 16:9

Exercise: plan a sprint for one feature, on one page

Take one feature from your current work and plan its sprint on a single page. Do not run it yet — the plan is the deliverable.

  • Name the design question and the riskiest assumption behind it, each in one sentence
  • Write the happy path in 3–6 steps and the explicit out-of-scope list
  • Lay out the week: which days are agent runs, which are gates, and when the session happens
  • Define the demo moment: who, what device, which question they are answering
  • Draft the handoff skeleton now: the headings for built, faked, unknown, learned, and production cost

If you completed the earlier exercises, the brief from Module 2 and the QA matrix from Module 5 drop straight into this plan.

Slide notes

The exercise asks for a plan, not a prototype, and the one-page constraint is doing the teaching: a sprint that cannot be planned on one page is scoped too wide, and the participant finds that out now rather than on the Wednesday of the sprint. Steer people towards one feature on one surface — a configurator, a navigation pattern, a single flow — not a product area.
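For cohorts who keep their plans in a repository, the one-page constraint can be sketched as a small check. This is a minimal sketch, not part of the course materials: the `SprintPlan` fields and the `plan_problems` helper are hypothetical names, and the rules encode only what the exercise states (a happy path of 3 to 6 steps, an explicit out-of-scope list, and a demo moment with who, device, and question).

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class SprintPlan:
    """One-page sprint plan. Field names are illustrative assumptions."""
    design_question: str
    riskiest_assumption: str
    happy_path: list[str]
    out_of_scope: list[str]
    demo_moment: dict[str, str]  # expected keys: who, device, question


def plan_problems(plan: SprintPlan) -> list[str]:
    """Return the reasons a plan is not ready; an empty list means it passes."""
    problems = []
    if not 3 <= len(plan.happy_path) <= 6:
        problems.append("happy path should have 3-6 steps")
    if not plan.out_of_scope:
        problems.append("out-of-scope list is empty")
    for key in ("who", "device", "question"):
        if not plan.demo_moment.get(key):
            problems.append(f"demo moment is missing: {key}")
    return problems


# A deliberately under-specified plan: two steps, no demo question.
plan = SprintPlan(
    design_question="Do customers understand usage-based pricing?",
    riskiest_assumption="Seeing the price respond to their inputs makes it clear.",
    happy_path=["open the calculator", "enter usage"],
    out_of_scope=["auth", "settings", "error handling"],
    demo_moment={"who": "six customers", "device": "laptop"},
)
print(plan_problems(plan))
# ['happy path should have 3-6 steps', 'demo moment is missing: question']
```

The check is intentionally shallow. It cannot tell whether the assumption is the riskiest one; that judgment stays with the person reviewing the plan.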

The step that surprises people is drafting the handoff skeleton before anything is built. Writing the headings — built, faked, unknown, learned, production cost — before the sprint starts changes how the sprint is run: the faked list becomes a live document the team adds to as it fakes things, instead of an archaeology exercise on Friday afternoon, and the unknown list keeps the session honest about what it can and cannot answer.
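One way to keep the skeleton live is to treat it as a structure the team appends to during the week. The sketch below is an illustration under assumptions: the five headings come from the exercise, while the function names (`new_handoff`, `log`, `render`) and the sample entry are hypothetical.

```python
from __future__ import annotations

# The five headings from the exercise, in the order the handoff lists them.
HEADINGS = ("built", "faked", "unknown", "learned", "production cost")


def new_handoff() -> dict[str, list[str]]:
    """Create the empty skeleton before anything is built."""
    return {heading: [] for heading in HEADINGS}


def log(handoff: dict[str, list[str]], heading: str, note: str) -> None:
    """Append a note under a heading at the moment it becomes true."""
    if heading not in handoff:
        raise KeyError(f"unknown heading: {heading!r}")
    handoff[heading].append(note)


def render(handoff: dict[str, list[str]]) -> str:
    """Render the skeleton as plain text, showing empty sections explicitly."""
    lines = []
    for heading in HEADINGS:
        lines.append(heading.upper())
        entries = handoff[heading] or ["(nothing recorded yet)"]
        lines.extend(f"  - {entry}" for entry in entries)
    return "\n".join(lines)


handoff = new_handoff()
# Hypothetical entry, logged on the day the shortcut is taken.
log(handoff, "faked", "pricing table uses hard-coded sample data")
print(render(handoff))
```

Showing empty sections in the rendered text is a deliberate choice in this sketch: a blank "unknown" section on Thursday is a prompt to fill it, not a sign that nothing is unknown.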

If this is being run as a cohort, have participants swap plans and apply the gates from the day-by-day table as review questions: is the assumption encounterable, does each variant change exactly one thing, could the session date hold if the build slips a day. Plans that survive that review are usually plans worth running, and the natural follow-up is to actually run the sprint within the next fortnight while the harness, brief, and QA setup from the earlier modules are still fresh.
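The swap review can be recorded in the same spirit. The following is a rough sketch, assuming the three review questions named above and the cut-scope remedy from the day-by-day table; the names `GATE_QUESTIONS` and `failed_gates` are hypothetical.

```python
from __future__ import annotations

# Review questions taken from the notes above; wording is paraphrased.
GATE_QUESTIONS = (
    "Is the riskiest assumption encounterable?",
    "Does each variant change exactly one thing?",
    "Could the session date hold if the build slips a day?",
)


def failed_gates(answers: dict[str, bool]) -> list[str]:
    """Return the questions answered no. An unanswered question counts as no."""
    return [q for q in GATE_QUESTIONS if not answers.get(q, False)]


# A reviewer's answers for one swapped plan (illustrative values).
answers = {
    GATE_QUESTIONS[0]: True,
    GATE_QUESTIONS[1]: False,
}
for question in failed_gates(answers):
    # The remedy is the same for every failed gate: the date holds, the scope moves.
    print(f"NO: {question} -> cut scope")
```

Treating an unanswered question as a no keeps the review strict: a plan passes only when every gate has an explicit yes.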

Narration for this slide

Your turn. Take one feature from your actual work — one feature, one surface — and plan its sprint on a single page. Name the design question and the riskiest assumption, one sentence each. Write the happy path in three to six steps, and the out-of-scope list right next to it. Lay out the week: agent runs, gates, and the session date. Define the demo moment — who sees it, on what device, answering which question. And draft the handoff skeleton now, before anything exists: built, faked, unknown, learned, production cost. Writing those headings first changes how you run the sprint. If you did the earlier exercises, your Module 2 brief and your Module 5 QA matrix drop straight in. One page. If it does not fit, the scope is wrong.

Slide 13 of 13 · 16:9

Summary, and where to go next

  • The sprint is the course's loop on a clock: scope to one assumption, bounded runs, human gates, a fixed session
  • The week's shape: brief and harness, build loop, critique and variants, QA and testing, demo and packet
  • Findings are written against the riskiest assumption, with evidence — not as a vote on the prototype
  • The handoff separates built, faked, unknown, learned, and production cost; the spec, not the prototype, is what engineering builds from
  • The prototype is archived after the readout; promotion is a reviewed decision, never momentum

This closes Agentic Prototyping. The natural next steps are the design systems course, which builds the harness this course assumed, and the review and critique course, which deepens the gates.

Slide notes

Recap by walking the loop one last time at sprint scale, and point back to where each piece was taught: the fidelity and disposability discipline from Module 1, the brief from Module 2, the variant thinking from Module 3, parity habits from Module 4, and the QA matrix from Module 5. The sprint added the calendar, the testing session, and the handoff — the parts that turn a fast prototype into evidence a team can act on.

Close the course by being honest about what it assumed. The half-day build economics rest on a harness — tokens, components, sample data — that somebody has to build and maintain, and the gates rest on critique skills that improve with deliberate practice. Both have their own courses in the curriculum: the design systems course covers making a system agents can actually compose from, and the review and critique course goes deeper on the gates this course leaned on. Designers working mostly in Claude Code may also want the platform-specific course for the orchestration mechanics the workflows here referenced.

If participants did the exercises throughout, they now have a scoped prototype, an annotated reference, three direction briefs, a parity round, a QA matrix, and a one-page sprint plan. The most useful thing to do next is run the sprint they just planned, within a fortnight, and judge it against this module's gates rather than against how impressive the demo felt.

Narration for this slide

Let's close the course. The sprint is everything you learned, run on a clock: scope to one riskiest assumption, bounded agent runs with human gates, a session that produces findings against the brief, and a handoff that separates built, faked, unknown, learned, and production cost. The spec is what engineering builds from; the prototype is reference behaviour and gets archived after the readout. Promotion is a reviewed decision — never momentum. From here, two directions are worth your time: the design systems course, which builds the harness this course kept assuming, and the review and critique course, which sharpens the gates. But before either of those — run the sprint you just planned. The plan is on one page. The session is one calendar invite away. Thanks for taking the course.

Module transcript
Module 6, narrated slide by slide

Slide 1 · Interactive Prototype Sprint and Handoff

Welcome to the final module of Agentic Prototyping. Everything you have built so far — the fidelity decisions, the briefs, the direction exploration, the parity loops, the QA matrix — comes together here in a timed sprint. We will walk the shape of a one-week sprint day by day, look at where the gates sit, test the prototype against the assumption it exists to check, and then do the part most teams skip: the handoff. Not a victory lap, but an honest record of what was built, what was faked, what we learned, and what production would actually cost. Let's start with what a sprint really is.

Slide 2 · The sprint is the loop run at full speed

First, what a sprint actually is. It is not a new process — it is the loop you already know, run at full speed because the harness, the brief format, and the QA setup are already in place. What changes is the calendar. There is a fixed date: a usability session, a stakeholder review, a demo. There is one riskiest assumption the prototype must let people encounter. The agent runs are bounded, with human gates between them, exactly as before. And the output is not a product. It is evidence — plus a handoff packet that records how you got it. The deadline is doing real work here: it forces every scope decision to be made out loud.

Slide 3 · The sprint plan: scope, runs, gates, and the demo moment

The sprint plan is four decisions, made before any agent runs, and made backwards from the session. Scope: the riskiest assumption, the happy path through it, and a written out-of-scope list — auth, settings, error handling, anything the session will not touch. Runs: the bounded agent runs in order — the build loop, the variant fan-out, the QA sweep. Gates: where you review before the next run starts. And the demo moment: who sees the prototype, on what device, against which question. That last one matters more than it sounds. If you do not define the demo moment on day one, the sprint quietly optimises for looking finished instead of answering the question.

Slide 4 · One week, ending in a packet

Here is the whole module in one picture. Monday is yours: brief, riskiest assumption, harness check, dates locked. Tuesday belongs to the agent: the build loop — build, capture, compare, fix — until the happy path holds. Wednesday opens with the critique gate, then fans out two or three variants that each change one thing. Thursday is a blocker-only QA pass and then the testing sessions. Friday is the demo and the packet. And the packet is the point. Four things go in it: the prototype, the spec and findings, the tokens used, and the known gaps — each with a defined job on the engineering side. The prototype is reference behaviour, never copied code. The spec is what they build from. The tokens map to production components. The gaps size the real work.

Slide 5 · Day by day: who does what

Here is the same week as a table, and the pattern to notice is the owner column: human, agent, human then agent, agent then human, human. Your attention lands at the gates, about a day and a half across the week, and the agent absorbs the build hours in between. Each gate is a yes-or-no question. Does the plan answer the question? Does the flow hold? Does each variant change exactly one thing? Did participants actually hit the assumption? Are built, faked, and unknown all written down? When the answer is no, the remedy is always the same: cut scope. The session date does not move. And one caveat: this rhythm assumes the harness already exists. If it does not, building it is your first sprint's real output.

Slide 6 · Mid-sprint critique: catching drift while it is cheap

Wednesday morning is the gate that earns its keep. The build went well on Tuesday, the temptation is to keep going — and the cost of skipping the critique arrives on Thursday, when there is no time left to act on it. The critique's job is narrow: protect the session. Walk the happy path the way a participant will, on the device they will use. Check it against the brief, not against taste. Make sure the riskiest assumption is actually encounterable — something people do, not something they look at. Hunt for invented scope, because agents drift towards completeness you did not ask for. Then make the cuts. The date holds; the scope moves. And anything you find yourself correcting twice goes into the harness, not into the chat.

Slide 7 · Testing the prototype: tasks, observations, findings

Thursday afternoon, the prototype meets people. The thing being tested is the assumption, not the prototype. Tasks come straight from the happy path — the participant tries to do the thing, unprompted where you can manage it. Observations are concrete: what they did, where they hesitated, what they said at the moment the assumption showed up. Findings lead with the answer to the riskiest assumption, with evidence attached — like the pricing sprint where five of six customers completed the flow and the breakdown-by-default variant drew fewer trust questions. That sentence settled the argument. And one discipline: when someone reacts to a part you knowingly faked, log it, but do not call it a finding. You already knew that edge was rough.

Slide 8 · The honest handoff: built, faked, unknown, production cost

Now the handoff itself: one document, five honest sections. Built: what genuinely works, and at which widths. Faked: every hard-coded value, stubbed screen, and skipped permission, written down so it cannot become a production bug by accident. Unknown: what the sprint could not test, including scale, feasibility, and the people who were not in the room. Learned: the findings, with evidence linked. And production cost: an honest sketch of what building it properly requires, including the parts that get rebuilt rather than reused. The test for the whole document is simple. Could someone scope the production work from it without talking to you, and without inheriting a single silent guess? If yes, the handoff is done. And write the disposal decision into it: archived after the readout, nothing promoted without review.

Slide 9 · What engineering actually needs from a prototype handoff

So what does engineering actually need? Not the prototype repository. Each item in the packet has a specific job. The prototype is reference behaviour — they walk it to feel the interaction, and they never copy the code. The spec is what they build from, and its acceptance criteria become the QA checklist. The token and component list maps to production components, and every value that bypassed the system gets a named decision instead of a quiet guess. The known gaps size the real work — nothing faked in the prototype is assumed to exist. And the open questions go to the designer in one answer session before the build, instead of being guessed screen by screen. The prototype is the question answered. The spec is what gets built. Keep those two things separate and prototype code never leaks into production.

Slide 10 · Worked example: a sprint retrospective with real timings

Let's trace a real one. A SaaS team, six customer interviews booked, and a month-old argument about whether customers would understand usage-based pricing if they could see the price respond to their own inputs. The brief and harness check took about an hour and a half. The happy-path build loop: two passes, roughly ninety minutes, because the harness supplied the parts. The critique cut one piece of invented scope and fanned out two variants — breakdown shown by default, or revealed on demand. QA caught a label overflowing at exactly the laptop width the interviews used. Five of six customers completed the flow unprompted, and the default-breakdown variant drew fewer trust questions. The handoff took two hours: findings, the faked list, a spec for the winning variant, and a written archive decision. About a day of designer attention, an afternoon of agent time — and a month-long argument settled.

Slide 11 · What the sprint cannot prove

Before the exercise, the limits — because the sprint's biggest risk is its own success. A good sprint proves one thing: that participants could complete the flow and how they reacted to the assumption, in a session, with sample data. It does not prove feasibility, performance, or that the data model holds. It does not generalise to the people who were not in the room. It does not make the keep-or-throw-away decision — a person with authority owns that, and they have to make it right after a good session, when keeping the code feels most reasonable. And it does not replace production design. Here is the economic version of the argument: the sprint is cheap because the prototype is disposable. The first time prototype code ships, every sprint after it inherits production caution — and stops being cheap.

Slide 12 · Exercise: plan a sprint for one feature, on one page

Your turn. Take one feature from your actual work — one feature, one surface — and plan its sprint on a single page. Name the design question and the riskiest assumption, one sentence each. Write the happy path in three to six steps, and the out-of-scope list right next to it. Lay out the week: agent runs, gates, and the session date. Define the demo moment — who sees it, on what device, answering which question. And draft the handoff skeleton now, before anything exists: built, faked, unknown, learned, production cost. Writing those headings first changes how you run the sprint. If you did the earlier exercises, your Module 2 brief and your Module 5 QA matrix drop straight in. One page. If it does not fit, the scope is wrong.

Slide 13 · Summary, and where to go next

Let's close the course. The sprint is everything you learned, run on a clock: scope to one riskiest assumption, bounded agent runs with human gates, a session that produces findings against the brief, and a handoff that separates built, faked, unknown, learned, and production cost. The spec is what engineering builds from; the prototype is reference behaviour and gets archived after the readout. Promotion is a reviewed decision — never momentum. From here, two directions are worth your time: the design systems course, which builds the harness this course kept assuming, and the review and critique course, which sharpens the gates. But before either of those — run the sprint you just planned. The plan is on one page. The session is one calendar invite away. Thanks for taking the course.