Agentic Design School

Section 01

Why one draft quietly becomes the design

When an agent produces a single concept quickly, that concept tends to win by default. It arrived first, it looks finished, and the cost of asking for genuinely different alternatives feels higher than nudging the one on screen. Six weeks later the team is polishing a direction nobody actually chose.

Asking one agent for three options in one prompt does not fix this. The options share a context window, so they converge: the same layout with three coats of paint. And asking the same agent to critique its own favorite produces praise with a few softening caveats.

This workflow fixes both problems structurally. Each direction is drafted by a separate agent that never sees the others, and the critique stage is written into the orchestration script, so it happens whether or not anyone remembers to ask for it.

Section 02

When to reach for it

Use this when the decision is expensive to reverse and the solution space is genuinely open: a new flow, a redesign, a density or information-architecture choice, a visual refresh. The output is a decision artifact for a design lead, not production code.

Skip it when the direction is already constrained to one viable answer, or when the question is small enough that exploring it costs more than just building it.

- Redesigning a core flow where the team disagrees about the right approach.
- Choosing a structural direction, such as data density or navigation model.
- A visual or brand refresh with hard legal and accessibility constraints.
- Any decision likely to be relitigated later unless the alternatives were examined on the record.

Section 03

The orchestration pattern: parallel drafts, adversarial critique

This runs as a dynamic workflow. Including the word "workflow" in the prompt makes Claude Code write a JavaScript orchestration script that runs in the background: it fans out one drafting agent per direction, waits for all of them, then fans out critique agents. Only the final comparison report comes back into the main conversation. Intermediate drafts and critiques live in script variables and on disk, not in Claude's context, which is what keeps the directions independent.

The script can run up to 16 agents concurrently and up to 1,000 in a run, so five directions plus a full cross-critique matrix is comfortably inside the budget. Workflows are resumable, so a long exploration interrupted after drafting can pick up at the critique stage. Save the working script to .claude/workflows/ and it becomes a reusable slash command; /effort ultracode is the explicit way to ask for this orchestration depth.
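The budget arithmetic can be checked with a few lines. This is a minimal sketch, not part of the generated script: the function name `agentBudget` is hypothetical, the two limits are the figures quoted above, and it assumes one drafting agent per direction, one critique per ordered pair of directions, and a single comparison agent.

```javascript
// Limits as quoted in the text; everything else here is illustrative.
const MAX_CONCURRENT = 16
const MAX_PER_RUN = 1000

function agentBudget(directionCount) {
  const drafts = directionCount
  // Full cross-critique matrix: each direction is reviewed once per rival stance.
  const critiques = directionCount * (directionCount - 1)
  const comparison = 1
  const total = drafts + critiques + comparison
  // The widest fan-out stage is the one that presses on the concurrency limit.
  const widestStage = Math.max(drafts, critiques, comparison)
  return {
    drafts,
    critiques,
    total,
    fitsInRun: total <= MAX_PER_RUN,
    exceedsConcurrency: widestStage > MAX_CONCURRENT,
  }
}

console.log(agentBudget(3)) // 3 drafts, 6 critiques, 10 agents in total
console.log(agentBudget(5)) // 5 drafts, 20 critiques, 26 agents in total
```

With five directions the run total of 26 is far below 1,000, but the critique stage is 20 agents wide, which is more than 16. Whether the excess is queued automatically is an assumption to verify before relying on it.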

The structural guarantee is the point: critique is not a follow-up question a human might forget to ask. It is a stage in the script, with named reviewers and a fixed rubric, and the run does not finish without it.

Diagram: Exploration stages

1. Write the brief
2. Assign design stances
3. Draft directions in parallel
4. Adversarial cross-critique
5. Score against rubric
6. Human decision
7. Record the decision

The brief fans out, the critiques cross over, and a human makes the call.

Section 04

Step 1: write one brief, then assign stances

Every drafting agent receives the same brief: the user, the job, the constraints, the brand rules, and the anti-patterns. What differs is the stance: a one-paragraph design position the direction must commit to, such as progressive disclosure, expert density, or guided linear flow.

Stances are how you guarantee the directions differ in structure rather than in styling. If you cannot name three stances that a reasonable designer might defend, the problem may not need an exploration.

brief.md (excerpt)

# Brief: onboarding flow redesign

## User and job
New workspace admins setting up their first project. They need to reach a working
project with at least one teammate invited, in one sitting.

## Constraints
- Must work on mobile web; native apps are out of scope.
- Uses existing design tokens and the current component library.
- Legal: data-residency choice must appear before any content is created.
- Accessibility: WCAG 2.2 AA; no information conveyed by color alone.

## Anti-patterns
- No more than one optional step; the current flow has four and completion is 41%.
- No fake progress indicators.

## Stances (one per direction agent)
- Direction A: guided linear flow, one decision per screen.
- Direction B: single-page setup with progressive disclosure.
- Direction C: template-first; pick a working example, then customize.
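Because the stances live in the brief, a script could read them from there rather than hardcoding them. The following is a minimal sketch under that assumption: `parseStances` is a hypothetical helper, and it expects the heading and bullet shape used in the excerpt above.

```javascript
// Hypothetical helper: extract stance lines from a brief shaped like the excerpt above.
function parseStances(briefMarkdown) {
  const lines = briefMarkdown.split("\n")
  const start = lines.findIndex((line) => /^##\s+Stances\b/.test(line))
  if (start === -1) return []
  const stances = []
  for (const line of lines.slice(start + 1)) {
    if (/^##\s/.test(line)) break // the next section heading ends the stance list
    const match = line.match(/^-\s*Direction\s+(\w+):\s*(.+)$/)
    if (match) stances.push({ id: match[1], stance: match[2].trim() })
  }
  return stances
}

const sampleBrief = [
  "## Anti-patterns",
  "- No fake progress indicators.",
  "",
  "## Stances (one per direction agent)",
  "- Direction A: guided linear flow, one decision per screen.",
  "- Direction B: single-page setup with progressive disclosure.",
  "- Direction C: template-first; pick a working example, then customize.",
].join("\n")

const parsedStances = parseStances(sampleBrief)
if (parsedStances.length < 3) {
  // Mirrors the guidance above: too few defensible stances suggests no exploration is needed.
  console.warn("Fewer than three stances found in the brief.")
}
console.log(parsedStances.map((s) => s.id + ": " + s.stance))
```

Keeping the stance list in one place means the brief a human reviews and the prompts the agents receive cannot drift apart.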

Section 05

Step 2: define the drafting agent

The drafting agent produces a direction package, not a finished design: a concept statement, the screen-by-screen structure, the key interaction decisions, what the direction deliberately sacrifices, and a rough effort note. A consistent package format is what makes the directions comparable later.

.claude/agents/direction-author.md

---
name: direction-author
description: Drafts one complete design direction from a brief and an assigned stance. Used in concept exploration workflows; never sees other directions.
tools: Read, Write, Glob
---

You draft exactly one design direction.

Rules:
- Commit fully to the assigned stance, even where it creates tradeoffs. Name the tradeoffs.
- Stay inside the brief's constraints and anti-patterns; flag any conflict explicitly.
- Output a direction package: concept statement (under 120 words), screen-by-screen
  structure, key interaction decisions, what this direction sacrifices, effort estimate
  (S/M/L), and open questions.
- Do not reference, imagine, or hedge against other directions.
- Write the package to the path you are given, in markdown.
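A consistent package format is easier to enforce if something checks it. Here is a minimal sketch of such a check. It assumes each package uses level-2 markdown headings with exactly the names listed in `REQUIRED_SECTIONS`; those names, and the functions `splitSections` and `validatePackage`, are illustrative choices and not part of the agent definition above.

```javascript
// Assumed heading names; the agent definition only lists the parts, not their headings.
const REQUIRED_SECTIONS = [
  "Concept statement",
  "Screen-by-screen structure",
  "Key interaction decisions",
  "Sacrifices",
  "Effort estimate",
  "Open questions",
]

// Split markdown into { heading: body } using level-2 headings only.
function splitSections(markdown) {
  const sections = {}
  let current = null
  for (const line of markdown.split("\n")) {
    const heading = line.match(/^##\s+(.+?)\s*$/)
    if (heading) {
      current = heading[1]
      sections[current] = ""
    } else if (current !== null) {
      sections[current] += line + "\n"
    }
  }
  return sections
}

// Returns a list of problems; an empty list means the package is well formed.
function validatePackage(markdown) {
  const sections = splitSections(markdown)
  const problems = []
  for (const name of REQUIRED_SECTIONS) {
    if (!(name in sections) || sections[name].trim() === "") {
      problems.push("missing section: " + name)
    }
  }
  const concept = (sections["Concept statement"] || "").trim()
  const wordCount = concept === "" ? 0 : concept.split(/\s+/).length
  if (wordCount >= 120) problems.push("concept statement must be under 120 words")
  const effort = (sections["Effort estimate"] || "").trim()
  if (effort !== "" && !/^(S|M|L)\b/.test(effort)) {
    problems.push("effort estimate must start with S, M, or L")
  }
  return problems
}

const samplePackage = [
  "## Concept statement",
  "One decision per screen, in a fixed order, so a new admin never chooses what to do next.",
  "## Screen-by-screen structure",
  "1. Data residency 2. Project name 3. Invite teammate",
  "## Key interaction decisions",
  "Back navigation keeps entered values.",
  "## Sacrifices",
  "Experts cannot skip ahead.",
  "## Effort estimate",
  "M: reuses existing form components.",
  "## Open questions",
  "Should the invite step allow skipping?",
].join("\n")

console.log(validatePackage(samplePackage)) // prints []
```

A check like this could run between the drafting and critique stages, so critics never receive a package that is missing its sacrifices or effort note.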

Section 06

Step 3: the orchestration script

The script fans out the drafting agents in parallel, then builds the critique assignments so that every direction is reviewed by agents that did not write it. Each critique agent argues against the direction it reviews: its job is to find where the direction fails the brief, not to balance praise and concerns.

Dynamic workflow sketch (the orchestration script Claude writes)

// Sketch of the orchestration script Claude Code generates for this workflow.
// agent(prompt, options) runs a subagent and resolves with its final answer.
import { readFile, writeFile } from "node:fs/promises"

const brief = await readFile("exploration/brief.md", "utf8")
const stances = ["guided-linear", "progressive-disclosure", "template-first"]

await Promise.all(
  stances.map((stance) =>
    agent(
      "Use the direction-author agent rules. Brief follows. Your stance is " + stance +
        ". Write your direction package to exploration/directions/" + stance + ".md.\n\n" + brief,
      { model: "opus" }
    )
  )
)

// Adversarial critique: each direction gets one review per rival stance,
// written by a fresh agent that did not author it (3 stances -> 6 reviews).
const critiques = []
for (const target of stances) {
  for (const reviewerStance of stances.filter((s) => s !== target)) {
    critiques.push(
      agent(
        "Use the direction-critic agent rules. Critique exploration/directions/" + target +
          ".md against exploration/brief.md and scoring-rubric.md, from the perspective of someone who believes the " +
          reviewerStance + " direction is stronger. Score every rubric criterion 1-5 with evidence.",
        { model: "sonnet" }
      )
    )
  }
}
const critiqueResults = await Promise.all(critiques)
await writeFile("exploration/critiques.md", critiqueResults.join("\n\n---\n\n"))

const comparison = await agent(
  "Read exploration/directions and exploration/critiques.md. Produce a comparison report: rubric scores per direction, the strongest argument against each, constraint violations, and the open questions a human must decide. Do not pick a winner.",
  { model: "opus" }
)
await writeFile("exploration/comparison.md", comparison)
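The nested loop that pairs targets with reviewers is the part of the script most worth testing on its own, since a mistake there silently lets a direction review itself or leaves one unreviewed. The sketch below isolates it as a pure function. `buildCritiqueAssignments` and `reviewsPerTarget` are hypothetical names; the pairing rule is the one the script above uses.

```javascript
// Pair every target direction with every other stance as its adversarial reviewer.
function buildCritiqueAssignments(stances) {
  const assignments = []
  for (const target of stances) {
    for (const reviewerStance of stances) {
      if (reviewerStance === target) continue // a stance never reviews its own direction
      assignments.push({ target, reviewerStance })
    }
  }
  return assignments
}

// Count reviews per target so coverage can be checked before any agent is launched.
function reviewsPerTarget(assignments) {
  const counts = {}
  for (const { target } of assignments) {
    counts[target] = (counts[target] || 0) + 1
  }
  return counts
}

const exampleStances = ["guided-linear", "progressive-disclosure", "template-first"]
const exampleAssignments = buildCritiqueAssignments(exampleStances)
console.log(exampleAssignments.length) // 6
console.log(reviewsPerTarget(exampleAssignments)) // each direction is reviewed twice
```

Building the assignment list first also gives the script something to log and to resume from, independent of how the agents themselves are launched.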

Section 07

Step 4: score against an explicit rubric

Critique without a rubric collapses into taste, and agents are very good at producing confident taste. The rubric names the criteria, weights them, and requires evidence from the brief or the direction package for every score.

Keep the rubric short enough that a human can hold it in their head while reading the comparison report. Five to seven criteria is usually right.

scoring-rubric.md

# Scoring rubric

Score each criterion 1-5. Every score must cite evidence from the direction package
or the brief. A 3 means adequate; reserve 5 for directions that make the criterion a strength.

| Criterion | Weight | What a 5 looks like |
|---|---|---|
| Job completion | 3 | A first-time admin reaches a working project with a teammate invited, with the fewest unforced decisions |
| Constraint fit | 3 | No conflicts with legal, accessibility, or platform constraints; conflicts are disqualifying, not a low score |
| Clarity of hierarchy | 2 | Each screen has one obvious primary action and the structure explains itself |
| Effort to ship | 2 | Reuses existing components and patterns; estimate S or M with believable reasoning |
| Differentiation | 1 | The direction is structurally distinct from the current product and from the other stances |
| Risk honesty | 1 | The package names what the direction sacrifices and where it could fail |
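The weighting works out as follows. This is a minimal sketch: the criteria and weights are copied from the table above, while the function `weightedScore`, the shape of its inputs, and the example scores are illustrative assumptions. Constraint violations disqualify a direction instead of lowering its total, as the Constraint fit row requires.

```javascript
// Criteria and weights from the rubric table above.
const RUBRIC = [
  { criterion: "Job completion", weight: 3 },
  { criterion: "Constraint fit", weight: 3 },
  { criterion: "Clarity of hierarchy", weight: 2 },
  { criterion: "Effort to ship", weight: 2 },
  { criterion: "Differentiation", weight: 1 },
  { criterion: "Risk honesty", weight: 1 },
]

// scores: { [criterion]: number from 1 to 5 }
// constraintViolations: list of strings; any entry disqualifies the direction.
function weightedScore(scores, constraintViolations = []) {
  if (constraintViolations.length > 0) {
    return { disqualified: true, total: null, max: null, violations: constraintViolations }
  }
  let total = 0
  let max = 0
  for (const { criterion, weight } of RUBRIC) {
    const score = scores[criterion]
    if (typeof score !== "number" || !(score >= 1 && score <= 5)) {
      throw new Error("score for " + criterion + " must be a number from 1 to 5")
    }
    total += score * weight
    max += 5 * weight
  }
  return { disqualified: false, total, max, violations: [] }
}

// Made-up scores for illustration only.
const exampleScores = {
  "Job completion": 5,
  "Constraint fit": 4,
  "Clarity of hierarchy": 3,
  "Effort to ship": 3,
  "Differentiation": 2,
  "Risk honesty": 4,
}

console.log(weightedScore(exampleScores)) // total 45 out of a maximum of 60
console.log(weightedScore(exampleScores, ["content created before data-residency choice"]))
```

The weights sum to 12, so the maximum weighted total is 60. Returning `null` totals for a disqualified direction keeps it from being ranked alongside the others by accident.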

Diagram: Rubric scoring pass

1. Read direction package
2. Apply each rubric criterion
3. Cite evidence per score
4. Apply criterion weights
5. Flag constraint violations
6. Feed the comparison report

Every critique walks the same scoring path, so the comparison report ranks evidence rather than confidence.
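The evidence requirement can be enforced mechanically before scores are combined. The sketch below assumes each critique has been parsed into an object of the form `{ target, reviewerStance, scores: { criterion: { score, evidence } } }`. That shape, the function names, and the choice to average across reviewers are all assumptions made for illustration; the example score of 2 echoes the job-completion critique quoted later in this lesson.

```javascript
// List the criteria in one review whose score has no supporting evidence.
function missingEvidence(review) {
  return Object.entries(review.scores)
    .filter(([, entry]) => typeof entry.evidence !== "string" || entry.evidence.trim() === "")
    .map(([criterion]) => criterion)
}

// Combine several reviews of the same direction into one mean score per criterion.
function averageByCriterion(reviews) {
  const sums = {}
  const counts = {}
  for (const review of reviews) {
    for (const [criterion, entry] of Object.entries(review.scores)) {
      sums[criterion] = (sums[criterion] || 0) + entry.score
      counts[criterion] = (counts[criterion] || 0) + 1
    }
  }
  const averages = {}
  for (const criterion of Object.keys(sums)) {
    averages[criterion] = sums[criterion] / counts[criterion]
  }
  return averages
}

const reviewsOfB = [
  {
    target: "progressive-disclosure",
    reviewerStance: "guided-linear",
    scores: {
      "Job completion": { score: 2, evidence: "Teammate invite is hidden behind disclosure." },
    },
  },
  {
    target: "progressive-disclosure",
    reviewerStance: "template-first",
    scores: {
      "Job completion": { score: 3, evidence: "" }, // no evidence: should be sent back
    },
  },
]

console.log(missingEvidence(reviewsOfB[1])) // [ 'Job completion' ]
console.log(averageByCriterion(reviewsOfB)) // { 'Job completion': 2.5 }
```

A review that fails the evidence check is better sent back to its critic than averaged in, since an unsupported score is exactly the confident taste the rubric exists to prevent.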

Section 08

Step 5: the human decision

The comparison report deliberately does not pick a winner. It puts the scored directions, the strongest argument against each, and the unresolved questions in front of a design lead, who chooses, blends, or sends a direction back for another pass.

Record the decision and the reasons next to the direction packages. The rejected directions are part of the value: the next time someone asks why the product is not template-first, the answer is on file with scores and evidence.

Table: What the comparison report contains

| # | Section | Contents |
|---|---|---|
| 1 | Score table | Weighted rubric scores for every direction with one-line evidence |
| 2 | Strongest objection | The single best argument against each direction, taken from the critiques |
| 3 | Constraint check | Any legal, accessibility, or platform violations, which disqualify rather than score |
| 4 | Blend candidates | Specific elements worth carrying across directions, named per element |
| 5 | Open questions | Decisions the brief left unresolved that the human must settle |

The report is built for a decision meeting, not for reading cover to cover.
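The score table at the top of the report can be rendered from structured results. The following is a minimal sketch: `renderScoreTable`, the row shape, and the numeric totals are illustrative assumptions. Rows stay in input order and are not sorted by score, in keeping with the rule that the report does not pick a winner.

```javascript
// Escape pipes so free-text objections cannot break the markdown table.
function cell(text) {
  return String(text).replace(/\|/g, "\\|")
}

// rows: [{ direction, total, max, disqualified, strongestObjection }]
function renderScoreTable(rows) {
  const header = ["| Direction | Weighted score | Strongest objection |", "|---|---|---|"]
  const body = rows.map((row) => {
    const score = row.disqualified ? "disqualified" : row.total + " / " + row.max
    return "| " + cell(row.direction) + " | " + score + " | " + cell(row.strongestObjection) + " |"
  })
  return header.concat(body).join("\n")
}

// Totals are made up; objections paraphrase the onboarding case study in this lesson.
const exampleRows = [
  { direction: "guided-linear", total: 45, max: 60, disqualified: false,
    strongestObjection: "Needs eight screens to satisfy the data-residency step" },
  { direction: "progressive-disclosure", total: 41, max: 60, disqualified: false,
    strongestObjection: "Disclosure pattern hides the teammate invitation" },
  { direction: "template-first", total: null, max: null, disqualified: true,
    strongestObjection: "Templates imply content creation before the data-residency choice" },
]

console.log(renderScoreTable(exampleRows))
```

Showing a disqualified direction as a row with no number keeps the constraint violation visible in the meeting without inviting a comparison of its score.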

Section 09

Case study: onboarding flow redesign

A B2B team with a 41% onboarding completion rate ran the exploration with the three stances above. Drafting took 22 minutes; the full critique matrix of six reviews took another 15.

The critiques did real work. The guided-linear direction scored highest on job completion, but its critics showed it required eight screens to satisfy the legal data-residency step, double the four the brief's intent implied. The single-page direction scored well on effort, but two critiques flagged that its disclosure pattern hid the teammate invitation, the one step most correlated with retention. The template-first direction drew the strongest objection: templates implied content creation before the data-residency choice, a constraint violation rather than a preference.

The lead chose guided-linear, pulled the template gallery in as the final step, and cut two optional screens the critique had marked as low value. The decision record, including the rejected directions, took one page.

Section 10

Case study: choosing a data-density direction

An analytics product needed to settle a long-running argument about density before a navigation rework. Four stances were drafted: expert-dense tables, progressive drill-down, dashboard-of-summaries, and a hybrid with a density toggle.

The adversarial stage was decisive in an unexpected way. The hybrid direction, which the team had assumed would win as the safe compromise, took the lowest weighted score: three separate critiques showed it doubled the design and QA surface for every future feature, and its own package admitted the toggle existed because the direction would not commit. The expert-dense direction scored highest on job completion for the product's actual daily users, with its accessibility risks itemized and judged fixable.

The team chose expert density, recorded the toggle idea as explicitly rejected, and reported that the decision meeting took 40 minutes instead of the third workshop it had been heading toward.

Section 11

Case study: brand refresh with legal and accessibility constraints

A marketing team explored five visual directions for a brand refresh, with the brief carrying two hard constraints: an industry rule about how comparative claims could be displayed, and WCAG 2.2 AA contrast on all text over imagery.

The critique stage killed two directions early. One relied on light typography over photography and could not meet contrast without abandoning its core idea; another's hero pattern placed comparative claims in a layout the legal reviewer agent flagged as non-compliant in three of its five templates. Both were eliminated before any human spent time falling in love with them.

The remaining three went to the creative director with scores and objections attached. The chosen direction borrowed its color system from one of the eliminated ones, a blend the comparison report had explicitly suggested.

Section 12

Good vs bad exploration output

A weak exploration produces three skins of the same idea and a critique that praises everything. A strong one produces directions that disagree with each other, critiques that draw blood, and a decision a team can stand behind a quarter later.

Table: Exploration quality comparison

| Bad | Good |
|---|---|
| Three directions that share one layout with different accent colors | Three directions with different screen counts, different first decisions, and named sacrifices |
| Critique: all directions are strong, with minor polish opportunities | Critique: Direction B hides the teammate invite behind disclosure; the brief's retention goal makes this a 2 on job completion |
| Recommendation: blend the best of all three | Comparison report: scores, strongest objection per direction, two named blend candidates, and four open questions for the lead |

The test is whether a director could make a confident call from the report alone.

Section 13

Limits: what the exploration cannot decide

Agents can generate distinct directions and argue about them honestly when the structure forces it. They cannot know which tradeoff the organization should accept, how much appetite exists for a riskier direction, or what users will actually do. Scores are inputs to a human decision, not the decision.

The rubric is also a human responsibility. If the criteria or weights are wrong, the workflow will efficiently rank directions against the wrong standard.

- It cannot replace usability testing; a winning direction is still a hypothesis.
- It cannot settle taste or brand questions; it can only make the options and tradeoffs explicit.
- It cannot detect constraints the brief never mentioned.
- Scores between directions are comparable within one run, not across different briefs or runs.

Section 14

Reusable exploration workflow

Save the working script to .claude/workflows/ and the agent definitions to .claude/agents/, and the exploration becomes a command the team runs whenever a decision is big enough to deserve real alternatives.

Multi-direction concept exploration workflow

1. Write one brief: user, job, constraints, brand rules, anti-patterns.
2. Name 3-5 design stances that a reasonable designer could defend.
3. Run as a dynamic workflow: one direction-author agent per stance, drafted in parallel and in isolation.
4. Cross-assign direction-critic agents so every direction is reviewed only by non-authors.
5. Score every direction against the explicit rubric, with evidence required for each score.
6. Generate a comparison report with scores, strongest objections, blend candidates, and open questions.
7. A human picks, blends, or sends a direction back for another pass.
8. Record the decision and the rejected directions next to the packages.

