Agentic Design School
Module 3 of 6
40–50 minutes

Agentic Prototyping

Multi-Direction Concept Exploration

Using agents to explore genuinely different directions in parallel — not five fonts on the same layout — and converging through structured comparison rather than whichever one looked best at the standup.

Duration: 40–50 minutes

Slides: 13 slides with notes and narration

Learning objectives

  • Define direction axes that produce meaningfully different concepts.
  • Run parallel generation without the directions collapsing into each other.
  • Converge using a comparison board with named criteria and evidence.
Slide deck

Work through the module

Each slide is shown in its 16:9 frame, exactly as it appears in the video version. Open the notes under any slide for the longer explanation, and the narration if you prefer to read along.

Slide 1 of 13 · 16:9

Multi-Direction Concept Exploration

Agentic Prototyping · Module 3 of 6

  • Why agent-generated options converge, and what stops it
  • Direction axes and stances that force real difference
  • Running directions in parallel without cross-contamination
  • Converging on a comparison board, not a hallway opinion

Breadth is cheap now. Difference still takes design — and so does the decision at the end.

Slide notes

Module 2 ended with a brief: situation, job, direction, constraints, criteria. This module asks what happens when the right direction is genuinely not known yet — when a single brief would force a premature choice. The answer is not to ask the agent for five options in one prompt. It is to design the exploration: name the directions, run them in isolation, and converge through a comparison the team can stand behind later.

Set the framing early: agents have made breadth nearly free. Drafting three or four working concepts used to cost a week of a designer's production time; it now costs an hour or two of agent runs. What has not become free is difference — getting concepts that disagree with each other in structure rather than styling — and judgment, the decision about which trade-off the product should accept. Those two are the design work this module teaches.

Be clear about scope. The output of an exploration is a decision artifact for a design lead, not production code and not a finished prototype. Modules 4 and 5 cover holding a chosen direction to quality; this module is about choosing it honestly.

Narration for this slide

Welcome to Module 3. So far you have scoped a prototype and turned references and research into a brief. But sometimes the honest answer is that you do not know the right direction yet — and picking one early just because the agent drafted it quickly is how teams end up polishing a concept nobody actually chose. This module is about exploring several genuinely different directions in parallel, and then converging through a structured comparison instead of a hallway opinion. The headline is simple: breadth is cheap now. Difference still takes design. Let's start with why agent-generated options tend to look the same.

Slide 2 of 13 · 16:9

Why agent variants converge

Ask one agent for three options in one prompt and you usually get one idea wearing three outfits.

  • The options share a context window, so they pull towards each other
  • The model's centre of gravity is the most common pattern on the public web
  • The first draft anchors everything after it — including the agent's own critique
  • Self-critique of its own favourite produces praise with softening caveats

Convergence is not a prompting failure you can word your way out of. It is structural, and the fix is structural too.

Slide notes

Name the failure precisely, because most people have seen it without diagnosing it. You ask for three concepts; the agent returns three layouts that share the same screen count, the same first decision, and the same hierarchy, differing in accent colour and copy tone. The reasons are structural. First, options generated in one context window can see each other — the second concept is written with the first one already in context, so it drifts towards being a variation rather than an alternative. Second, every concept is pulled towards the most common pattern in the training distribution: polished, generic, and safe. Third, the first thing on screen anchors the conversation, including any critique you ask the same agent to perform on it.

The self-critique point deserves emphasis. Asking the agent that drafted a concept whether it is any good produces agreement theatre: a positive assessment with two minor suggestions. That is not dishonesty; it is the same anchoring problem, applied to evaluation.

The consequence for the workflow: if you want directions that genuinely differ, the differences have to be designed in before generation, and the critique has to come from somewhere other than the author. That is the rest of the module.

Narration for this slide

Here is the failure this module exists to prevent. You ask an agent for three concepts and you get the same layout three times, with different accent colours. That is not bad prompting — it is structural. Options drafted in one context window can see each other, so they converge. Every option gets pulled towards the most common pattern on the web, which is polished and generic. And if you ask the same agent to critique its own work, you get praise with a couple of polite caveats. You cannot word your way out of this. The fix is structural: design the differences before generation, and separate the critics from the authors.

Slide 3 of 13 · 16:9

When to explore — and when not to

Exploration has a real cost: more agent runs, more reading, and a mandatory comparison step. Spend it where the decision deserves it.

  • Explore when the decision is expensive to reverse and the solution space is genuinely open
  • Explore when the team disagrees, or the decision will be relitigated unless alternatives are on record
  • Skip it when the direction is already constrained to one viable answer
  • Skip it when exploring costs more than just building the thing and looking at it
  • A single well-briefed run is the default; exploration is the escalation

If you cannot name three stances a reasonable designer might defend, the problem may not need an exploration.

Slide notes

This is the same discipline as the single-agent versus multi-agent decision covered in the school's article on using multiple design agents: more agents is not the default, it is an escalation you pay for. An exploration multiplies generation runs, adds a critique stage that cannot be skipped, and produces several documents a human has to actually read. Anthropic's engineering write-up on multi-agent research systems reports such systems using roughly fifteen times the tokens of a single chat interaction — a research figure, not a design benchmark, but the direction of the cost is not in dispute. As of June 2026 the vendor guidance says the same thing: parallel work pays off when the slices are genuinely independent, and not otherwise.

The positive cases are specific: a redesign of a core flow where the team disagrees about the approach, a structural choice such as data density or navigation model, a visual refresh with hard legal or accessibility constraints, or any decision likely to be reopened later unless the alternatives were examined on the record. The exploration's by-product — rejected directions with scores and reasons — is often worth as much as the chosen one.

The negative test is the one on the slide: if you cannot name three stances a reasonable designer might defend, the solution space is not actually open, and a single briefed run plus critique is faster and just as honest.

Narration for this slide

Before we build the workflow, decide whether you need it. Exploration is not free: more runs, more reading, and a comparison step you cannot skip. It earns its cost when the decision is expensive to reverse and the solution space is genuinely open — a core flow redesign, a density or navigation choice, a refresh with hard constraints, or anything the team will relitigate unless the alternatives are on record. Skip it when there is really only one viable answer, or when the question is small enough to just build and look at. Here is the test: if you cannot name three stances a reasonable designer might defend, you probably do not need an exploration.

Slide 4 of 13 · 16:9

Direction axes: where real difference comes from

Genuine difference comes from structural choices, not styling. Pick one or two axes and place each direction at a different point on them.

  • Information density — expert-dense tables vs progressive drill-down vs summary dashboard
  • Navigation model — guided linear flow vs single page vs hub-and-spoke
  • Progressive disclosure — everything visible vs details on demand
  • Starting point — blank slate vs template-first vs data-first
  • Tone and visual register — only after a structural axis is chosen, never instead of one

If two directions would produce the same wireframe, they are one direction with two paint jobs.

Slide notes

An axis is a dimension along which reasonable designers genuinely disagree for this product. The useful ones are structural: how much information is on screen at once, how the user moves through the work, how much is revealed up front, and what the user starts from. A direction is a committed position on one or two of these axes — not a mood board. The wireframe test on the slide is the practical check: sketch each direction's primary screen as boxes; if the boxes are the same, the directions are not different.
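The wireframe test can also be run mechanically once each primary screen is written down as boxes. The following is a minimal sketch, assuming each screen is reduced to an ordered tuple of region names; the region names and the third "dark" direction are invented for illustration and are not part of the module.

```python
from itertools import combinations

# Hypothetical box sketches: each direction's primary screen as an ordered
# tuple of region names. The names are illustrative only.
primary_screens = {
    "guided-linear": ("progress", "single-question", "next-button"),
    "single-page": ("section-list", "inline-details", "submit-button"),
    "guided-linear-dark": ("progress", "single-question", "next-button"),
}


def same_wireframe_pairs(screens):
    """Return pairs of directions whose primary screens have identical boxes."""
    return [
        (a, b)
        for (a, boxes_a), (b, boxes_b) in combinations(screens.items(), 2)
        if boxes_a == boxes_b
    ]


duplicates = same_wireframe_pairs(primary_screens)
for a, b in duplicates:
    print(f"{a} and {b} share a wireframe: one direction, two paint jobs")
```

Exact equality is a deliberately blunt check. Two directions can still be too similar with slightly different boxes, so the sketch only catches the obvious case.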

Tone and visual register are deliberately last. They are real axes for brand and marketing work, but for product surfaces they are the easiest way to fake difference: three skins over one structure. Use them as a secondary axis once the structural one is fixed, or in explorations whose question really is visual, such as a brand refresh.

For purely visual explorations, named style vocabularies help in the same way structural axes do — the next slide covers that. Either way, the principle is identical: difference is declared before generation, on a named axis, not discovered afterwards by squinting at the outputs.

Narration for this slide

So where does real difference come from? From structural axes — the dimensions where reasonable designers genuinely disagree. Information density: do experts get dense tables, or does everyone get a summary dashboard with drill-down? Navigation: a guided linear flow, a single page, a hub with spokes? Disclosure: everything visible, or details on demand? Starting point: blank slate, template, or the user's own data? Each direction commits to a different position on one or two of these. Tone and visual style come last, never first, because they are the easiest way to fake difference. Quick test: sketch each direction's main screen as boxes. Same boxes, same direction.

Slide 5 of 13 · 16:9

Named directions beat 'make it different'

Open-source design skills have converged on the same trick: give directions names and committed characteristics, then deal them out deterministically.

  • Huashu Design's direction advisor groups 20 design philosophies into 5 named schools
  • When the brief is vague, it recommends three directions from three different schools
  • Each direction arrives with committed characteristics — type, density, motion — not adjectives
  • Demos for the three are generated in parallel, then the human picks or mixes
  • The lesson transfers: name the direction, commit its characteristics, never ask for 'something different'

A named stance with committed characteristics is enforceable. 'Make this one more bold' is not.

Slide notes

This slide grounds the stance idea in something shipping in the wild. Huashu Design — an MIT-licensed, HTML-native design skill for coding agents with around fourteen thousand GitHub stars as of mid-2026 — includes a direction advisor for exactly the situation this module covers: the brief is vague and the style is undecided. It maintains twenty design philosophies grouped into five schools (information architecture, kinetic poetry, minimalism, experimental avant-garde, and an eastern-philosophy school), and when triggered it recommends three philosophies drawn from three different schools, generates demos for them in parallel where the agent supports subagents, and lets the human pick one or mix.

The transferable mechanism is not the specific taxonomy — your product probably does not need 'kinetic poetry'. It is the determinism. Drawing the three candidates from three different schools guarantees spread by construction rather than hoping the model varies. And each named direction carries committed characteristics — palettes, type choices, density rules — which makes it checkable later: a reviewer can say this draft broke its own direction's rules, which is a far stronger critique than this feels samey.
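To make "spread by construction" concrete, here is a minimal sketch of a deterministic draw. It is not Huashu Design's actual implementation. The five school names come from the notes above, while the philosophy names, the even split of four per school, and the seeded draw are assumptions made for illustration.

```python
import random

# School names are taken from the notes; the philosophy entries are
# placeholders, and four per school is an assumed split of the twenty.
SCHOOL_NAMES = [
    "information architecture",
    "kinetic poetry",
    "minimalism",
    "experimental avant-garde",
    "eastern philosophy",
]
SCHOOLS = {
    name: [f"{name} / philosophy {i}" for i in range(1, 5)]
    for name in SCHOOL_NAMES
}


def recommend_directions(schools, count=3, seed=0):
    """Pick `count` philosophies, each drawn from a different school."""
    rng = random.Random(seed)  # seeded, so the same brief gets the same draw
    chosen_schools = rng.sample(sorted(schools), count)  # no school repeats
    return [(school, rng.choice(schools[school])) for school in chosen_schools]


picks = recommend_directions(SCHOOLS, seed=42)
for school, philosophy in picks:
    print(f"{school}: {philosophy}")
```

Sampling schools first and philosophies second is what guarantees the spread: the draw cannot return two candidates from the same school, whatever the model would have preferred.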

For your own explorations, the equivalent move is the stance paragraph from the workflow this module is built on: a one-paragraph design position each direction must commit to, with its characteristics stated. Naming is what makes the difference enforceable.

Narration for this slide

Here is a pattern worth stealing from the open-source world. Huashu Design, a widely used design skill for coding agents, includes a direction advisor for vague briefs. It keeps twenty design philosophies grouped into five named schools, and when you ask for style options it deliberately recommends three philosophies from three different schools — spread is guaranteed by construction, not by hoping the model varies. Each direction comes with committed characteristics, and demos are generated in parallel for the human to pick or mix. The transferable lesson: name each direction, commit its characteristics up front, and never ask an agent for 'something different'. A named stance is enforceable. An adjective is not.

Slide 6 of 13 · 16:9

One brief, three stances

Every direction agent gets the same brief. What differs is the stance — a one-paragraph position the direction must commit to.

brief.md (excerpt) — shared constraints, distinct stances
## Constraints (all directions)
- Mobile web; existing tokens and component library.
- Legal: data-residency choice before any content is created.
- WCAG 2.2 AA; nothing conveyed by colour alone.

## Anti-patterns
- At most one optional step. No fake progress indicators.

## Stances (one per direction agent)
- Direction A: guided linear flow, one decision per screen.
- Direction B: single-page setup with progressive disclosure.
- Direction C: template-first; pick a working example, then customise.

Shared constraints make the directions comparable. Distinct stances make them different. You need both.

Slide notes

The structure does two jobs at once. The shared part — user, job, constraints, brand rules, anti-patterns, and the scoring rubric — is identical for every direction, which is what makes the outputs comparable later: every direction answered the same question under the same rules. The stance is the only thing that differs, and it is a commitment, not a suggestion. The drafting instructions tell each agent to commit fully to its stance even where it creates trade-offs, and to name those trade-offs rather than hedge towards the middle.
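The shared-plus-stance structure can be sketched as prompt assembly. This is an illustration under assumptions: the brief text and stances are copied from the slide, but the function name `build_prompt` and the exact wording of the commitment instruction are hypothetical.

```python
# Shared part: identical for every direction (copied from the slide excerpt).
SHARED_BRIEF = """\
## Constraints (all directions)
- Mobile web; existing tokens and component library.
- Legal: data-residency choice before any content is created.
- WCAG 2.2 AA; nothing conveyed by colour alone.

## Anti-patterns
- At most one optional step. No fake progress indicators.
"""

# Distinct part: one stance per direction agent.
STANCES = {
    "A": "guided linear flow, one decision per screen.",
    "B": "single-page setup with progressive disclosure.",
    "C": "template-first; pick a working example, then customise.",
}

# Hypothetical wording for the commitment instruction described in the notes.
COMMIT = (
    "Commit fully to your stance, even where it creates trade-offs, "
    "and name those trade-offs rather than hedging towards the middle."
)


def build_prompt(direction):
    """Combine the shared brief with exactly one stance."""
    return (
        f"{SHARED_BRIEF}\n"
        f"## Your stance (Direction {direction})\n"
        f"{STANCES[direction]}\n\n"
        f"{COMMIT}\n"
    )


prompts = {direction: build_prompt(direction) for direction in STANCES}
```

Each prompt contains the whole shared brief and only its own stance, so no drafting agent can read a sibling direction from its instructions.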

Walk the example briefly: an onboarding redesign with a hard legal constraint (data residency before content creation) and an accessibility floor. The three stances differ on structural axes from two slides ago — navigation model and starting point — and each is something a reasonable designer might argue for in a real meeting. That is the bar for a stance.

Also worth stating: each direction agent is asked to produce a direction package, not a finished design. The package contains a concept statement, the screen-by-screen structure, key interaction decisions, what the direction deliberately sacrifices, a rough effort note, and open questions. A consistent package format is what lets the comparison stage line them up. In a prototyping context you can add a thin working prototype per direction, but the package is what gets compared.

Narration for this slide

Here is the briefing structure that makes an exploration work. Every direction agent receives exactly the same brief: the user, the job, the constraints, the brand rules, the anti-patterns, and the scoring rubric. The only thing that differs is the stance — a one-paragraph design position the direction must fully commit to, even where it creates trade-offs. In this onboarding example, direction A is a guided linear flow, one decision per screen. B is a single page with progressive disclosure. C is template-first. Shared constraints make the outputs comparable. Distinct stances make them different. You need both, and they are both written before any generation starts.

Slide 7 of 13 · 16:9

Running directions in parallel

The brief fans out, the directions are drafted in isolation, and the critiques cross over before a human decides.

Flow diagram showing one shared brief with three named stances fanning out to three direction agents — guided linear, progressive disclosure, and template-first — each drafted in parallel and in isolation, never seeing the others. The three direction packages converge into an adversarial critique stage that produces a comparison board with rubric scores backed by evidence, the strongest objection per direction, disqualifying constraint violations, blend candidates, and open questions, but no winner. The board feeds a human decision to pick, blend, or send a direction back, with rejected directions recorded on file.
One brief, three isolated drafting runs, an adversarial critique stage, and a human decision. Isolation is what keeps the directions different; the scripted critique stage is what keeps the comparison honest.

Each direction is drafted by an agent that never sees the others. Isolation is the mechanism, not a nicety.

Slide notes

Walk the diagram left to right. The brief and stances on the left are human work — that is where the design thinking happens. The middle column is the fan-out: one drafting agent per stance, run in parallel, each given only the shared brief and its own stance, each writing its package to its own file in a scratch folder. The isolation is the load-bearing part: because no direction sees another, they cannot converge on each other, which removes the structural cause of sameness from slide two.

The convergence column is the adversarial critique stage. Every direction is reviewed by agents that did not write it, each arguing against the direction it reviews — its job is to find where the direction fails the brief, not to balance praise and concerns. The output is the comparison board: rubric scores with evidence, the strongest objection per direction, constraint violations (which disqualify rather than score), blend candidates, and open questions. Deliberately, no winner.

Mechanically, in Claude Code this runs well as a dynamic workflow: an orchestration script fans out drafting agents, waits, fans out critics, and returns only the comparison report to the main conversation, keeping intermediate drafts out of the shared context. Saved to .claude/workflows/ it becomes a reusable command. But the pattern survives on any platform — separate sessions or subagents per direction, separate output files, and a scripted or checklisted critique stage all achieve the same isolation. Scratch naming matters too: one folder per exploration, one file per direction, named by stance, so the merge step knows exactly what it is reading.
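The fan-out, wait, fan-out shape can be sketched in a platform-neutral way. This is a minimal sketch, not a Claude Code workflow: `run_agent` is a hypothetical stand-in for whatever agent call your platform provides, and labelling each critic by the stance it comes from is an assumption about how the cross-over is arranged.

```python
import asyncio
from itertools import permutations

STANCES = {
    "guided-linear": "Guided linear flow, one decision per screen.",
    "progressive-disclosure": "Single-page setup with progressive disclosure.",
    "template-first": "Template-first; pick a working example, then customise.",
}


async def run_agent(role, prompt):
    """Hypothetical stand-in for a real agent call; replace with your platform's."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return f"[{role}] response to {len(prompt)} characters of prompt"


async def draft(brief, name, stance):
    # Isolation: the prompt holds the shared brief and ONE stance only.
    package = await run_agent(f"draft:{name}", f"{brief}\n\nStance: {stance}")
    return name, package


async def critique(brief, reviewer, target, package):
    # The critic sees the brief and the package it reviews, and argues against it.
    prompt = f"{brief}\n\nArgue against this direction:\n{package}"
    finding = await run_agent(f"critic:{reviewer}->{target}", prompt)
    return target, finding


async def explore(brief, stances):
    # Stage 1: fan out one drafting run per stance, in parallel.
    drafted = dict(
        await asyncio.gather(*(draft(brief, n, s) for n, s in stances.items()))
    )
    # Stage 2: every direction is reviewed by each non-author.
    pairs = permutations(stances, 2)  # (reviewer, target), never the same name
    findings = await asyncio.gather(
        *(critique(brief, r, t, drafted[t]) for r, t in pairs)
    )
    # Only the collected objections are returned; drafts stay out of the caller.
    board = {name: [] for name in stances}
    for target, finding in findings:
        board[target].append(finding)
    return board


board = asyncio.run(explore("Shared brief text", STANCES))
for name, objections in board.items():
    print(name, len(objections))
```

With three stances, the cross-over produces six critiques, two per direction. The function returns objections grouped by direction and deliberately has no step that ranks them.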

Narration for this slide

Here is the whole shape of the workflow. On the left, the human work: one brief and three named stances. The brief fans out — one drafting agent per stance, running in parallel, and crucially in isolation. Each agent sees only the shared brief and its own stance, never the other directions, and writes its package to its own file. That isolation is the mechanism that keeps the directions different. Then the critiques cross over: every direction is reviewed by agents that did not write it, scored against a rubric, with the strongest objection recorded. The comparison board deliberately does not pick a winner. That last step — the decision — stays human.

Slide 8 of 13 · 16:9

The comparison board

Critique without a rubric collapses into taste, and agents produce confident taste. Name the criteria, weight them, and require evidence.

| Criterion | Weight | What a 5 looks like |
| --- | --- | --- |
| Job completion | 3 | The user reaches the outcome with the fewest unforced decisions |
| Constraint fit | 3 | Legal, accessibility, platform; violations disqualify, not score low |
| Clarity of hierarchy | 2 | One obvious primary action per screen; the structure explains itself |
| Effort to ship | 2 | Reuses existing components; estimate S or M with believable reasoning |
| Differentiation | 1 | Structurally distinct from the current product and the other stances |
| Risk honesty | 1 | Names what the direction sacrifices and where it could fail |

Every score must cite evidence from the direction package or the brief. Scores without evidence are just confidence.

Slide notes

The rubric is written by the human, before generation, and it ships inside the brief so every direction knows how it will be judged. Five to seven criteria is usually right — short enough that a design lead can hold it in their head while reading the comparison report. The weights encode what the product actually values for this decision; if the weights are wrong, the workflow will efficiently rank directions against the wrong standard, which is a human failure the agents cannot catch.

Two rules carry most of the value. First, evidence is required for every score: the critique must cite the direction package or the brief, which is what separates a comparison from a vibe. Second, constraint violations disqualify rather than score low — a direction that puts content creation before the legal data-residency step is not a 2 on constraint fit, it is out, or it goes back for another pass.

The comparison report built from these scores has a fixed shape: the score table with one-line evidence, the strongest objection per direction taken from the critiques, the constraint check, named blend candidates, and the open questions a human must settle. It is built for a decision meeting, not for reading cover to cover. And scores are comparable within one run only — do not compare numbers across different briefs or different runs.
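The two rules, evidence for every score and disqualification for constraint violations, can be sketched as a small scoring function. The weights are the ones on the slide. The 1 to 5 scale floor, the `Score` type, and the example scores are assumptions made for illustration, not results from a real run.

```python
from dataclasses import dataclass

# Weights copied from the rubric on the slide.
WEIGHTS = {
    "Job completion": 3,
    "Constraint fit": 3,
    "Clarity of hierarchy": 2,
    "Effort to ship": 2,
    "Differentiation": 1,
    "Risk honesty": 1,
}


@dataclass
class Score:
    value: int     # assumed scale: 1 (worst) to 5 (best)
    evidence: str  # citation from the direction package or the brief


def weighted_total(scores, violations=()):
    """Return the weighted total, or None when the direction is disqualified."""
    if violations:
        return None  # constraint violations disqualify; they never score low
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"unscored criteria: {sorted(missing)}")
    total = 0
    for criterion, weight in WEIGHTS.items():
        score = scores[criterion]
        if not score.evidence.strip():
            raise ValueError(f"{criterion}: score without evidence")
        if not 1 <= score.value <= 5:
            raise ValueError(f"{criterion}: score out of range")
        total += weight * score.value
    return total


# Illustrative scores only; the evidence strings are placeholders.
example = {
    "Job completion": Score(5, "package, section 2"),
    "Constraint fit": Score(4, "brief, legal constraint"),
    "Clarity of hierarchy": Score(4, "package, screen 1"),
    "Effort to ship": Score(3, "package, effort note"),
    "Differentiation": Score(3, "package, concept statement"),
    "Risk honesty": Score(4, "package, sacrifices"),
}
print(weighted_total(example))  # 48 out of a possible 60
print(weighted_total(example, violations=["content before data residency"]))  # None
```

Returning `None` for a disqualified direction, rather than a low number, keeps it from being ranked alongside the others by accident.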

Narration for this slide

Convergence needs a board, and the board needs a rubric — because critique without one collapses into taste, and agents are very good at confident taste. Name five to seven criteria, weight them, and require evidence for every score: a citation from the direction package or the brief. In this example, job completion and constraint fit carry the most weight, and constraint violations do not score low — they disqualify. The report that comes out has the scores, the strongest objection against each direction, blend candidates, and the open questions. One more rule: scores are comparable within a single run, not across briefs. The board informs the decision. It is not the decision.

Slide 9 of 13 · 16:9

Converging without a committee design

The lead picks, blends, or sends a direction back. Blending is allowed — averaging is not.

  • Pick one direction as the spine; do not average three structures into mush
  • Blend named elements only — specific pieces the comparison report identified as worth carrying over
  • Send a direction back for another pass when the objection is fixable, not fatal
  • Record the decision, the reasons, and the rejected directions next to the packages
  • The rejected directions are part of the value: the next 'why aren't we template-first?' has an answer on file

Combining strengths means one spine plus named transplants — not the median of three structures.

Slide notes

The convergence step is where explorations most often go soft. The comfortable move is to declare that all three directions have merit and ask for a blend of the best of each — which produces a committee design: a structure that nobody proposed, that inherits the compromises of all three, and that no critique ever examined. The discipline is to choose one direction as the structural spine and then transplant only named elements from the others, specifically the blend candidates the comparison report called out. In the brand-refresh case from the source workflow, the chosen direction borrowed its colour system from an eliminated one — a named transplant, not an average.

It is also worth saying that the safe-compromise direction often loses on the evidence. In the data-density case study, the hybrid with a density toggle — the option everyone assumed would win — took the lowest weighted score, because three separate critiques showed it doubled the design and QA surface for every future feature, and its own package admitted the toggle existed because the direction would not commit.

Finally, the record. Write down the decision, the reasons, and the rejected directions, and keep them next to the packages. That single page is what stops the decision being relitigated in a quarter, and it is the artifact that makes the whole exploration cheaper the second time, because the stances and rubric get reused.

Narration for this slide

Now the convergence — and the trap inside it. The comfortable move is to say all three have merit, let's blend the best of each. That produces a committee design: a structure nobody proposed and no critique ever examined. The discipline is different. Pick one direction as the spine. Then transplant only named elements from the others — the specific blend candidates the comparison report identified. If a direction's strongest objection is fixable rather than fatal, send it back for another pass instead of forcing the choice. And whatever you decide, record it: the decision, the reasons, and the rejected directions with their scores. That page is what stops the argument restarting next quarter.

Slide 10 of 13 · 16:9

Presenting directions without anchoring on polish

Stakeholders choose with their eyes. If one direction looks more finished, it has already won for the wrong reason.

  • Hold every direction to the same fidelity — same component library, same data, same level of finish
  • Lead with the stance and the trade-off, not the screens
  • Show the comparison board before the visuals, or alongside them — never after
  • Name what each direction sacrifices out loud; do not let the room assume none of them cost anything
  • Keep the recommendation separate from the evidence, and label which is which
  • State what the prototypes fake — Module 1's fidelity declarations apply to every direction equally

Equal polish, stance first, evidence visible. Otherwise the prettiest render wins by default.

Slide notes

Presentation is where an honest exploration can quietly become a rigged one. The most common distortion is uneven fidelity: one direction got an extra hour of agent polish, or happens to use real data while the others use placeholders, and the room anchors on it before a word is spoken. The fix is procedural — every direction is built with the same component library, the same data, and the same time budget, and anything that cannot be equalised is named out loud.

The second distortion is leading with screens. Show four screens before stating the stance and the trade-off, and the conversation becomes about which one people like, which is the hallway opinion this module exists to replace. Lead with what each direction commits to and what it sacrifices, then show the screens as evidence for that position, with the comparison board visible.

Third, keep the recommendation and the evidence separable. It is fine — usually right — for the design lead to walk in with a recommendation, but stakeholders should be able to see the scores, the objections, and the open questions independently of it. And the fidelity honesty from Module 1 applies here with extra force: every direction prototype fakes things, and the things they fake should be the same things, stated on the slide, so nobody mistakes a more complete fake for a better direction.

Narration for this slide

A word on presenting this work, because stakeholders choose with their eyes. If one direction looks more finished than the others, it has already won — for the wrong reason. So hold every direction to the same fidelity: same components, same data, same hours of polish. Lead with the stance and the trade-off, not the screens. Put the comparison board in the room, not in an appendix. Say out loud what each direction sacrifices, and what every prototype fakes — the fidelity declarations from Module 1 apply to all of them equally. And keep your recommendation separate from the evidence, clearly labelled. Equal polish, stance first, evidence visible.

Slide 11 of 13 · 16:9

Worked example: three directions for one onboarding flow

A B2B team with 41% onboarding completion ran the three stances from earlier. Drafting took 22 minutes; the six cross-critiques took another 15 minutes.

| Direction | Strongest critique finding | Outcome |
| --- | --- | --- |
| A: guided linear | Highest on job completion, but needed 8 screens to satisfy the legal step, against 4 in the brief's spirit | Chosen as the spine |
| B: progressive disclosure | Scored well on effort, but the disclosure pattern hid the teammate invite, the step most correlated with retention | Rejected, reason recorded |
| C: template-first | Templates implied content creation before the data-residency choice: a constraint violation, not a preference | Disqualified |
| Decision | Guided linear, with the template gallery pulled in as the final step and two low-value optional screens cut | One-page record kept |

The critiques did the work: one direction disqualified on a constraint, one rejected on evidence, and the winner improved by what it borrowed.

Slide notes

This trace comes from the school's published multi-direction exploration workflow; the timings are from that run, not a benchmark, so quote them as indicative. The setup is the brief from slide six: a B2B onboarding redesign, 41% completion, a hard legal constraint on data residency, and three stances on the navigation and starting-point axes. Drafting the three direction packages in parallel took about 22 minutes; the full critique matrix — each direction reviewed by two non-authors — took another 15.

Walk what the critiques actually found, because this is where the structure earns its cost. The guided-linear direction won on job completion but its critics showed the legal step inflated it to eight screens. The single-page direction looked cheapest to ship, but two separate critiques flagged that its disclosure pattern hid the teammate invitation — the step the team's own data tied most strongly to retention. The template-first direction drew the decisive objection: templates implied creating content before the data-residency choice, which is a constraint violation and therefore disqualifying rather than a low score.

The convergence followed the rules from slide nine: guided linear as the spine, the template gallery transplanted in as the final step, two optional screens the critique had marked low-value cut, and a one-page decision record kept next to the packages — including why template-first lost, which is the answer to the question someone will ask in six months.
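The one-page record from this run could be captured as a small structure and rendered to text. This is a sketch under assumptions: the `DecisionRecord` type and its field names are hypothetical, while the content is taken from the worked example above.

```python
from dataclasses import dataclass


@dataclass
class DecisionRecord:
    spine: str                # the one direction chosen as the structure
    transplants: list[str]    # named elements carried over from other directions
    cuts: list[str]           # what the critique marked for removal
    rejected: dict[str, str]  # direction -> recorded reason

    def to_text(self) -> str:
        """Render the record as a short plain-text page."""
        lines = [f"Decision: {self.spine} as the spine", "", "Carried over:"]
        lines += [f"- {item}" for item in self.transplants]
        lines += ["", "Cut:"]
        lines += [f"- {item}" for item in self.cuts]
        lines += ["", "Rejected directions:"]
        lines += [f"- {name}: {reason}" for name, reason in self.rejected.items()]
        return "\n".join(lines) + "\n"


record = DecisionRecord(
    spine="A (guided linear)",
    transplants=["template gallery from C, as the final step"],
    cuts=["two low-value optional screens"],
    rejected={
        "B (progressive disclosure)": (
            "disclosure pattern hid the teammate invite, "
            "the step most tied to retention"
        ),
        "C (template-first)": (
            "disqualified: templates implied content creation "
            "before the data-residency choice"
        ),
    },
)
print(record.to_text())
```

Keeping the rejected directions as a required field, rather than an optional note, is the design choice that matters: a record cannot be written without saying why the alternatives lost.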

Narration for this slide

Let's trace a real run. A B2B team with forty-one percent onboarding completion ran the three stances we briefed earlier. Drafting took about twenty-two minutes; the six cross-critiques, fifteen more. The critiques did real work. Guided linear scored highest on job completion, but its critics showed the legal step blew it out to eight screens. Progressive disclosure looked cheap to ship, but two critiques caught that it hid the teammate invite — the step most tied to retention. Template-first was disqualified outright: it put content creation before the legal data-residency choice. The lead chose guided linear, transplanted the template gallery as the final step, cut two low-value screens, and recorded the whole decision on one page.

Slide 12 of 13 · 16:9

Exercise: define three genuinely different directions

Take the brief you wrote in Module 2 — or a current task — and design the exploration on paper before running anything.

  • Decide whether the task deserves an exploration at all; write one sentence either way
  • Pick one or two structural axes and write three stances, each one paragraph, each defensible
  • Apply the wireframe test: sketch each direction's primary screen as boxes and check they differ
  • Write the rubric: five to seven weighted criteria, with what a 5 looks like for each
  • Name the scratch structure: one folder, one package file per direction, named by stance

If the run is worth it, execute it before Module 4 — the chosen direction becomes your parity reference there.

Slide notes

Like the earlier exercises in this course, the first pass is deliberately on paper, because the failure modes are all upstream of the agent: an exploration that was not needed, stances that are really one stance, a rubric that rewards the wrong thing. The first bullet matters most — the honest answer for many tasks is that a single briefed run is enough, and writing that sentence is a legitimate completion of the exercise.

If the task does deserve it, the stances are where to spend the time. Push participants towards structural axes and the wireframe test; the most common weak submission is three stances that differ only in tone. The rubric should be written before looking at any output, and it should include at least one disqualifying constraint, because that is what gives the critique stage teeth.
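Because the rubric is written before any output exists, its shape can be checked mechanically. The following is a minimal sketch of such a check, assuming a rubric is stored as a list of dictionaries; the function name `check_rubric` and the keys `name`, `weight`, and `anchor_5` are hypothetical, and the rules it enforces are the ones stated in this exercise.

```python
def check_rubric(criteria, disqualifiers):
    """Return a list of problems with the rubric; an empty list means it passes.

    criteria: list of dicts with "name", "weight", and "anchor_5"
              (a short description of what a score of 5 looks like).
    disqualifiers: list of hard constraints whose violation removes a direction.
    """
    problems = []
    if not 5 <= len(criteria) <= 7:
        problems.append(f"expected 5 to 7 criteria, got {len(criteria)}")
    for c in criteria:
        if c.get("weight", 0) <= 0:
            problems.append(f"{c['name']}: weight must be positive")
        if not c.get("anchor_5", "").strip():
            problems.append(f"{c['name']}: missing description of what a 5 looks like")
    if not disqualifiers:
        problems.append("no disqualifying constraint listed")
    return problems


# A deliberately incomplete draft rubric, invented for illustration.
draft = [
    {"name": "job completion", "weight": 3, "anchor_5": "user finishes setup unaided"},
    {"name": "constraint fit", "weight": 3, "anchor_5": ""},
]
for problem in check_rubric(draft, disqualifiers=[]):
    print(problem)
```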

For those who run it: keep the whole exploration in a scratch folder per Module 1's disposability rules, run the drafting agents in isolation, and do not skip the critique stage even if one direction looks like an obvious winner — the obvious winner losing on evidence is a recurring outcome in real runs. Bring the comparison board and the decision record to Module 4: the chosen direction is the design that the parity work there will hold the implementation to.
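The scratch structure from the exercise (one folder, one package file per direction, named by stance) can be set up in a few lines. This is a sketch under assumptions: the `scaffold` helper, the `brief.md` file name, the `directions` subfolder, and the `.md` extension are all hypothetical choices, not part of the course material.

```python
import tempfile
from pathlib import Path


def scaffold(root, stances):
    """Create a scratch folder with one shared brief and one package file per stance."""
    root = Path(root)
    root.mkdir(parents=True, exist_ok=True)
    (root / "brief.md").touch()          # the single brief every direction shares
    paths = {}
    for stance in stances:
        slug = stance.lower().replace(" ", "-")   # file is named by its stance
        package = root / "directions" / f"{slug}.md"
        package.parent.mkdir(exist_ok=True)
        package.touch()
        paths[stance] = package
    return paths


if __name__ == "__main__":
    # A temporary directory stands in for the disposable scratch folder.
    with tempfile.TemporaryDirectory() as tmp:
        for stance, path in scaffold(tmp, ["Guided linear", "Single page", "Template first"]).items():
            print(stance, "->", path.name)
```

Giving each drafting agent write access to its own package file only, plus read access to the shared brief, is one simple way to approximate the isolation the workflow depends on.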

Narration for this slide

Your turn. Take the brief you wrote in Module 2, or a live task, and design the exploration on paper first. Step one: decide whether it deserves an exploration at all, and write one sentence saying why or why not — a no is a perfectly good answer. If yes: pick one or two structural axes and write three one-paragraph stances a reasonable designer could defend. Sketch each direction's main screen as boxes and check the boxes actually differ. Write the rubric — five to seven weighted criteria, at least one disqualifying constraint. Name your scratch folders. If it is worth running, run it before Module 4, because the direction you choose becomes the reference there.

Slide 13 of 13 · 16:9

Summary, and the bridge to parity

  • Agent options converge structurally; isolation and named stances are the structural fix
  • Difference comes from structural axes — density, navigation, disclosure, starting point — not styling
  • One shared brief plus one committed stance per direction keeps outputs different and comparable
  • Critique is adversarial, scripted, and rubric-scored with evidence; constraint violations disqualify
  • A human picks or blends named elements, records the decision, and keeps the rejected directions on file

Module 4 takes the chosen direction and holds the implementation to it: parity measured with screenshot evidence, not asserted.

Slide notes

Recap by connecting the mechanism to the failure it prevents. Convergence is structural, so the fixes are structural: directions drafted in isolation so they cannot pull towards each other, stances named in advance so difference is committed rather than hoped for, and critique assigned to non-authors so that no evaluator is anchored on its own draft. The comparison board exists so the decision is made on criteria and evidence; the human decision and the written record exist because the scores are inputs, not the answer.

Restate the cost honesty: this is an escalation, not a default. Most prototyping tasks in this course are single-direction and are better served by the brief-and-critique loop from Module 2. The exploration earns its multiplied runs when the decision is expensive, contested, or likely to be reopened.

Then set up Module 4 concretely. The exploration ends with a chosen direction: a package, possibly a thin prototype, and a decision record. Module 4 is about the next discipline: once the design exists, holding the implementation to it. That means a parity setup with the reference and the implementation side by side, screenshot evidence rather than assertions, the spacing and state details agents reliably miss, and an iteration loop that converges one named gap at a time. If participants ran the exercise, the direction they chose is the reference they will use.

Narration for this slide

Let's close. Agent-generated options converge for structural reasons, so the fix is structural: named stances, drafted in parallel and in isolation, critiqued adversarially by non-authors, and scored against a rubric where every score needs evidence and constraint violations disqualify. A human converges — picking a spine, blending only named elements, and recording the decision and the rejected directions. And remember this is an escalation: most tasks still want one well-briefed run. In Module 4 the question flips. You have a chosen direction; now the implementation has to match it. Parity, measured with screenshots rather than asserted — that is next.

Module transcript
Module 3, narrated slide by slide

Slide 1 · Multi-Direction Concept Exploration

Welcome to Module 3. So far you have scoped a prototype and turned references and research into a brief. But sometimes the honest answer is that you do not know the right direction yet — and picking one early just because the agent drafted it quickly is how teams end up polishing a concept nobody actually chose. This module is about exploring several genuinely different directions in parallel, and then converging through a structured comparison instead of a hallway opinion. The headline is simple: breadth is cheap now. Difference still takes design. Let's start with why agent-generated options tend to look the same.

Slide 2 · Why agent variants converge

Here is the failure this module exists to prevent. You ask an agent for three concepts and you get the same layout three times, with different accent colours. That is not bad prompting — it is structural. Options drafted in one context window can see each other, so they converge. Every option gets pulled towards the most common pattern on the web, which is polished and generic. And if you ask the same agent to critique its own work, you get praise with a couple of polite caveats. You cannot word your way out of this. The fix is structural: design the differences before generation, and separate the critics from the authors.

Slide 3 · When to explore, and when not to

Before we build the workflow, decide whether you need it. Exploration is not free: more runs, more reading, and a comparison step you cannot skip. It earns its cost when the decision is expensive to reverse and the solution space is genuinely open — a core flow redesign, a density or navigation choice, a refresh with hard constraints, or anything the team will relitigate unless the alternatives are on record. Skip it when there is really only one viable answer, or when the question is small enough to just build and look at. Here is the test: if you cannot name three stances a reasonable designer might defend, you probably do not need an exploration.

Slide 4 · Direction axes: where real difference comes from

So where does real difference come from? From structural axes — the dimensions where reasonable designers genuinely disagree. Information density: do experts get dense tables, or does everyone get a summary dashboard with drill-down? Navigation: a guided linear flow, a single page, a hub with spokes? Disclosure: everything visible, or details on demand? Starting point: blank slate, template, or the user's own data? Each direction commits to a different position on one or two of these. Tone and visual style come last, never first, because they are the easiest way to fake difference. Quick test: sketch each direction's main screen as boxes. Same boxes, same direction.

Slide 5 · Named directions beat 'make it different'

Here is a pattern worth stealing from the open-source world. Huashu Design, a widely used design skill for coding agents, includes a direction advisor for vague briefs. It keeps twenty design philosophies grouped into five named schools, and when you ask for style options it deliberately recommends three philosophies from three different schools — spread is guaranteed by construction, not by hoping the model varies. Each direction comes with committed characteristics, and demos are generated in parallel for the human to pick or mix. The transferable lesson: name each direction, commit its characteristics up front, and never ask an agent for 'something different'. A named stance is enforceable. An adjective is not.

Slide 6 · One brief, three stances

Here is the briefing structure that makes an exploration work. Every direction agent receives exactly the same brief: the user, the job, the constraints, the brand rules, the anti-patterns, and the scoring rubric. The only thing that differs is the stance — a one-paragraph design position the direction must fully commit to, even where it creates trade-offs. In this onboarding example, direction A is a guided linear flow, one decision per screen. B is a single page with progressive disclosure. C is template-first. Shared constraints make the outputs comparable. Distinct stances make them different. You need both, and they are both written before any generation starts.

Slide 7 · Running directions in parallel

Here is the whole shape of the workflow. On the left, the human work: one brief and three named stances. The brief fans out — one drafting agent per stance, running in parallel, and crucially in isolation. Each agent sees only the shared brief and its own stance, never the other directions, and writes its package to its own file. That isolation is the mechanism that keeps the directions different. Then the critiques cross over: every direction is reviewed by agents that did not write it, scored against a rubric, with the strongest objection recorded. The comparison board deliberately does not pick a winner. That last step — the decision — stays human.

Slide 8 · The comparison board

Convergence needs a board, and the board needs a rubric — because critique without one collapses into taste, and agents are very good at confident taste. Name five to seven criteria, weight them, and require evidence for every score: a citation from the direction package or the brief. In this example, job completion and constraint fit carry the most weight, and constraint violations do not score low — they disqualify. The report that comes out has the scores, the strongest objection against each direction, blend candidates, and the open questions. One more rule: scores are comparable within a single run, not across briefs. The board informs the decision. It is not the decision.

Slide 9 · Converging without a committee design

Now the convergence — and the trap inside it. The comfortable move is to say all three have merit, let's blend the best of each. That produces a committee design: a structure nobody proposed and no critique ever examined. The discipline is different. Pick one direction as the spine. Then transplant only named elements from the others — the specific blend candidates the comparison report identified. If a direction's strongest objection is fixable rather than fatal, send it back for another pass instead of forcing the choice. And whatever you decide, record it: the decision, the reasons, and the rejected directions with their scores. That page is what stops the argument restarting next quarter.

Slide 10 · Presenting directions without anchoring on polish

A word on presenting this work, because stakeholders choose with their eyes. If one direction looks more finished than the others, it has already won — for the wrong reason. So hold every direction to the same fidelity: same components, same data, same hours of polish. Lead with the stance and the trade-off, not the screens. Put the comparison board in the room, not in an appendix. Say out loud what each direction sacrifices, and what every prototype fakes — the fidelity declarations from Module 1 apply to all of them equally. And keep your recommendation separate from the evidence, clearly labelled. Equal polish, stance first, evidence visible.

Slide 11 · Worked example: three directions for one onboarding flow

Let's trace a real run. A B2B team with forty-one percent onboarding completion ran the three stances we briefed earlier. Drafting took about twenty-two minutes; the six cross-critiques, fifteen more. The critiques did real work. Guided linear scored highest on job completion, but its critics showed the legal step blew it out to eight screens. Progressive disclosure looked cheap to ship, but two critiques caught that it hid the teammate invite — the step most tied to retention. Template-first was disqualified outright: it put content creation before the legal data-residency choice. The lead chose guided linear, transplanted the template gallery as the final step, cut two low-value screens, and recorded the whole decision on one page.

Slide 12 · Exercise: define three genuinely different directions

Your turn. Take the brief you wrote in Module 2, or a live task, and design the exploration on paper first. Step one: decide whether it deserves an exploration at all, and write one sentence saying why or why not — a no is a perfectly good answer. If yes: pick one or two structural axes and write three one-paragraph stances a reasonable designer could defend. Sketch each direction's main screen as boxes and check the boxes actually differ. Write the rubric — five to seven weighted criteria, at least one disqualifying constraint. Name your scratch folders. If it is worth running, run it before Module 4, because the direction you choose becomes the reference there.

Slide 13 · Summary, and the bridge to parity

Let's close. Agent-generated options converge for structural reasons, so the fix is structural: named stances, drafted in parallel and in isolation, critiqued adversarially by non-authors, and scored against a rubric where every score needs evidence and constraint violations disqualify. A human converges — picking a spine, blending only named elements, and recording the decision and the rejected directions. And remember this is an escalation: most tasks still want one well-briefed run. In Module 4 the question flips. You have a chosen direction; now the implementation has to match it. Parity, measured with screenshots rather than asserted — that is next.