Agentic Design School

Section 01

Why handoff is where design quality leaks

Most design intent is lost between the approved design and the first build, not in the design work itself. The file shows the happy path at one width; the developer has to invent the loading state, guess the keyboard behavior, and pick a spacing value when the design and the token system disagree. Every one of those guesses is reasonable, and together they produce a build that is technically faithful and quietly wrong.

A spec is the document that removes the guessing. The reason most teams do not write them is cost: a thorough spec for one screen takes an afternoon, and a six-screen feature takes a week nobody has. The information is mostly already there — in the design file, the token system, the existing components, and the designer's head — it just has to be extracted, structured, and checked.

That extraction-and-structuring work is exactly what agents are good at, and the designer's job shifts from typing the spec to deciding what the spec should say. The output is a handoff packet a developer can build from without a meeting, and a record of intent the team can return to when the build drifts.

Section 02

When to reach for this workflow

Use it whenever the people building the screens are not the people who designed them and cannot tap the designer on the shoulder: external development teams, offshore or async collaborators, client engineering teams at the end of an agency engagement, or simply a busy internal team where a meeting costs a week of calendar time.

It is an Intermediate workflow: the orchestration is a straightforward fan-out, but the value depends on having approved designs, a token source of truth, and a designer willing to make the calls the agents flag as ambiguous.

A feature being handed to an external or async development team.
A legacy feature that needs to be rebuilt and has no documentation of how it currently behaves.
An agency engagement ending with a handover to the client's engineers.
A design system adoption push where downstream teams keep asking what exactly to build.

Section 03

What a developer-ready screen spec contains

A screen spec is not a prettier annotation layer. It is a structured answer to the questions a developer will otherwise ask in a meeting or answer with a guess: what is the layout at each breakpoint, which tokens map to which surfaces, what every interactive element does in every state, what the screen does with no data and too much data, what assistive technology should announce, and how the team will know the build is done.

The packet adds two things on top of the per-screen specs: a feature-level overview that explains how the screens connect, and an open-questions list of every decision the agents could not make from the evidence. The open-questions list is not a failure; it is the most valuable page in the packet, because it turns silent guesses into explicit decisions with an owner.

Table: Screen spec contents

1. Layout redlines: structure, spacing, and alignment per breakpoint, expressed in tokens where possible.
2. Token mapping: color, type, spacing, radius, and elevation tokens for every surface and text style.
3. Interaction states: default, hover, focus, active, disabled, loading, error, and empty for each interactive element.
4. Accessibility annotations: roles, names, focus order, announcements, and contrast notes.
5. Edge cases: long content, no content, slow network, permission and error variants.
6. Acceptance criteria: testable statements the build is checked against, written in plain language.

Each section answers a question a developer would otherwise guess at.
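One way to make the six sections checkable is to hold a spec as structured data before it is rendered to markdown. The sketch below is illustrative only: the ScreenSpec class, its field names, and the empty_sections helper are assumptions made for this example, not a schema the workflow prescribes.

```python
from dataclasses import dataclass, field, fields


@dataclass
class ScreenSpec:
    """One screen's spec; field names mirror the six sections (hypothetical schema)."""
    screen: str
    layout_redlines: list[str] = field(default_factory=list)
    token_mapping: dict[str, str] = field(default_factory=dict)
    interaction_states: dict[str, dict[str, str]] = field(default_factory=dict)
    accessibility: list[str] = field(default_factory=list)
    edge_cases: list[str] = field(default_factory=list)
    acceptance_criteria: list[str] = field(default_factory=list)
    open_questions: list[str] = field(default_factory=list)


def empty_sections(spec: ScreenSpec) -> list[str]:
    """Return the names of the six required sections that have no content yet."""
    skip = {"screen", "open_questions"}  # an empty open-questions list is fine
    return [f.name for f in fields(spec)
            if f.name not in skip and not getattr(spec, f.name)]
```

A spec that only has layout redlines and acceptance criteria would report the other four sections as empty, which is a cheap completeness check to run before consolidation.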

Section 04

The orchestration pattern: one agent per screen, then consolidate, then challenge

The workflow runs as a Claude Code dynamic workflow. Including the word "workflow" in the prompt, or running with /effort ultracode, makes Claude write a JavaScript orchestration script that runs in the background. The script holds the screen list, dispatches one spec agent per screen, collects the drafts in its own variables, and brings only the assembled packet back to the conversation. Up to sixteen agents run concurrently and a run can dispatch up to a thousand, so a six-screen feature finishes its fan-out in roughly the time the slowest screen takes.

Two more agents run after the fan-out. A consolidation agent merges the screen specs into the packet, normalizes terminology, builds the shared-component table, and pulls every unanswered question into one list. Then a developer-perspective agent reads the whole packet as if it had to build it on Monday with no access to the designer, and flags every place it would have to guess. That second pass is what separates a packet that looks complete from one that is.

The run is resumable, so a screen agent that stalls does not cost you the others. Once the prompt and the agents are stable, save the run from /workflows with the s key into .claude/workflows/ so the next feature is a /handoff-spec command instead of a pasted prompt. The agent definitions live in .claude/agents/*.md and are shared by every run.
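The shape of the pattern (bounded fan-out, then consolidation, then review) can be sketched outside Claude Code. This is not the script Claude generates, which the text describes as JavaScript; it is a Python illustration of the same control flow. The function names are hypothetical and spec_screen is a stand-in for a real agent call.

```python
import asyncio

MAX_CONCURRENT = 16  # the concurrency ceiling quoted in the text above


async def spec_screen(screen: str) -> str:
    """Stand-in for one screen-spec agent; a real agent would read the screen folder."""
    await asyncio.sleep(0)  # placeholder for the agent's work
    return f"# Spec: {screen}\n"


async def run_handoff(screens: list[str]) -> dict[str, object]:
    gate = asyncio.Semaphore(MAX_CONCURRENT)

    async def bounded(screen: str) -> str:
        async with gate:  # never more than MAX_CONCURRENT agents at once
            return await spec_screen(screen)

    # Fan out; return_exceptions keeps one failed screen from sinking the rest.
    results = await asyncio.gather(*(bounded(s) for s in screens),
                                   return_exceptions=True)
    drafts = {s: r for s, r in zip(screens, results)
              if not isinstance(r, BaseException)}
    failed = [s for s, r in zip(screens, results) if isinstance(r, BaseException)]

    # Consolidate in screen order; a real consolidation agent does far more.
    packet = "".join(drafts[s] for s in screens if s in drafts)
    ambiguities: list[str] = []  # the developer-perspective review would fill this
    return {"packet": packet, "failed": failed, "ambiguities": ambiguities}
```

Calling asyncio.run(run_handoff(["01-welcome", "02-account"])) returns the merged drafts plus a list of screens that failed, which is the information a resumable run would need to retry only the stalled agents.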

Diagram: Spec generation fan-out

1. Gather inputs per screen
2. Fan out one spec agent per screen
3. Consolidate into the packet
4. Developer-perspective review
5. Designer answers open questions
6. Deliver the packet

One agent per screen drafts a spec; consolidation and a developer-perspective review happen before anything is delivered.

Section 05

Step 1: assemble the inputs per screen

Each screen agent needs three things: the approved design for that screen (Figma frames via the Figma MCP server, or exported images plus the design file's layer notes), the token source of truth, and whatever existing component code or documentation is relevant. Put them in a predictable folder per screen so the prompt can name them instead of describing them.

Resist the urge to give every agent everything. The per-screen folder is what keeps each spec grounded in its own evidence and keeps the run fast; shared context like the token file and the component inventory is listed once at the feature level.

Handoff packet folder structure

handoff/onboarding-v2/
  feature-brief.md            # what the feature is for, who builds it, target stack
  tokens.json                 # or a pointer to the token package
  component-inventory.md      # existing components the build should reuse
  screens/
    01-welcome/
      design.png              # or Figma frame link resolved via Figma MCP
      notes.md                # designer notes, prototype links, copy doc
    02-account/
    03-workspace/
    ...
  specs/                      # written by the screen agents
  packet/                     # written by the consolidation agent
    handoff-packet.md
    open-questions.md
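Because the prompt names these files instead of describing them, a missing input tends to surface late, as a thin spec. A small pre-flight check can catch it earlier. The sketch below assumes the folder layout shown above and, as a further assumption, that every screen has a design.png export and a notes.md; a team resolving frames through the Figma MCP server would relax the design.png check. The missing_inputs helper is hypothetical.

```python
from pathlib import Path

SHARED_FILES = ["feature-brief.md", "tokens.json", "component-inventory.md"]


def missing_inputs(feature_dir: Path) -> list[str]:
    """List shared files and per-screen inputs that are absent from the feature folder."""
    problems = [name for name in SHARED_FILES
                if not (feature_dir / name).is_file()]

    screens_dir = feature_dir / "screens"
    screen_dirs = (sorted(p for p in screens_dir.iterdir() if p.is_dir())
                   if screens_dir.is_dir() else [])
    if not screen_dirs:
        problems.append("screens/ (no screen folders)")

    for screen in screen_dirs:
        # Assumption: each screen folder carries an exported design and notes.
        for name in ("design.png", "notes.md"):
            if not (screen / name).is_file():
                problems.append(f"screens/{screen.name}/{name}")
    return problems
```

An empty list means every agent will find what the prompt promises it; anything else is worth fixing before the run starts.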

Section 06

Step 2: the workflow prompt

The prompt names the inputs, fixes the spec template, and makes the two later passes explicit. The constraint that matters most is the last one: where the design is ambiguous, the agent records a question instead of inventing an answer.

Handoff spec workflow prompt (run once, then save)

Run this as a workflow.

Generate a developer handoff packet for the feature in handoff/onboarding-v2/.

Steps:
1. For each folder under handoff/onboarding-v2/screens/, dispatch one screen-spec agent. Each agent gets its screen folder, tokens.json, component-inventory.md, and feature-brief.md, and writes specs/<screen>.md using the template in templates/screen-spec.md: layout redlines per breakpoint, token mapping, interaction states, accessibility annotations, edge cases, and acceptance criteria.
2. Dispatch a consolidation agent that merges the screen specs into packet/handoff-packet.md (feature overview, shared components table, per-screen specs) and writes packet/open-questions.md, collecting every unresolved question.
3. Dispatch a developer-perspective reviewer that reads the packet as if it had to build it without access to the designer, and adds an Ambiguities section listing anything it would have to guess: missing states, unmapped values, untestable acceptance criteria, conflicts between screens.

Rules: map values to tokens wherever a token exists and flag values that have none; never invent behavior the design does not show — record it as an open question instead. Do not modify any source code in this run.
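Step 2 of the prompt asks the consolidation agent to gather every unresolved question into one file. The mechanical part of that job can be sketched as follows, assuming each spec is markdown with an "## Open questions" heading followed by "- " bullets. That layout and the collect_open_questions helper are assumptions for illustration; the actual layout is whatever templates/screen-spec.md defines.

```python
def collect_open_questions(specs: dict[str, str],
                           heading: str = "## Open questions") -> list[str]:
    """Pull bullets under each spec's open-questions heading into one list.

    specs maps a screen name to the markdown text of its spec. Each question
    is prefixed with its screen so the answer session knows where it came from.
    """
    collected = []
    for screen, text in specs.items():
        in_section = False
        for line in text.splitlines():
            stripped = line.strip()
            if stripped.startswith("#"):
                # Any heading starts or ends the open-questions section.
                in_section = stripped.lower() == heading.lower()
            elif in_section and stripped.startswith("- "):
                collected.append(f"[{screen}] {stripped[2:]}")
    return collected
```

Keeping the screen name on each question preserves the link back to the evidence, which matters when the designer answers them in a single session.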

Section 07

Step 3: the screen-spec subagent

Defining the screen-spec agent once keeps every screen to the same template and the same level of detail, in this run and the next feature's run. It is read-only with respect to the codebase; it writes only its own spec file.

.claude/agents/screen-spec-writer.md

---
name: screen-spec-writer
description: Writes a developer-ready spec for one screen from its approved design, the token system, and the existing component inventory. Records ambiguities as open questions instead of inventing answers.
tools: Read, Grep, Glob, Write
---

You spec exactly one screen. Inputs: the screen folder you are given, tokens.json, component-inventory.md, and feature-brief.md.

Follow templates/screen-spec.md exactly. For every visual value, name the token; if no token matches, list the raw value under "Unmapped values". For every interactive element, cover default, hover, focus, active, disabled, loading, error, and empty states. Mark any state the design does not show as an open question; do not invent it.

Write acceptance criteria as testable statements a developer or QA person could verify without asking the designer.
Write only your spec file under specs/. Do not modify anything else.
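The "name the token, otherwise list it under Unmapped values" rule is mechanical enough to sketch. The example assumes tokens.json is a nested object whose leaves are raw values, so that names flatten to dotted paths such as color.action.primary. The real token format may differ (for instance, leaves wrapped in objects), and both helper functions are hypothetical.

```python
def flatten_tokens(tokens: dict, prefix: str = "") -> dict[str, str]:
    """Flatten nested token groups into dotted names, e.g. color.action.primary."""
    flat: dict[str, str] = {}
    for key, value in tokens.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten_tokens(value, name))
        else:
            flat[name] = str(value)
    return flat


def map_values(raw_values: list[str], tokens: dict) -> tuple[dict[str, str], list[str]]:
    """Return (raw value -> token name) for matches, plus values no token covers."""
    by_value: dict[str, str] = {}
    for name, value in flatten_tokens(tokens).items():
        by_value.setdefault(value.lower(), name)  # first token wins on duplicates

    mapped: dict[str, str] = {}
    unmapped: list[str] = []
    for raw in raw_values:
        token = by_value.get(raw.lower())
        if token:
            mapped[raw] = token
        else:
            unmapped.append(raw)
    return mapped, unmapped
```

Matching on value alone is deliberately naive: a 16px spacing token and a 16px radius token would collide, so a real mapping also needs to know which property the value came from. The unmapped list is the part that feeds the spec's "Unmapped values" section.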

Section 08

Step 4: the developer-perspective review

The reviewer's instruction is deliberately adversarial: assume you have to build this next week, the designer is on leave, and every guess you make will be wrong. Anything it cannot build from the packet alone becomes an ambiguity, and ambiguities are the designer's homework before the packet ships.

In practice this pass catches three recurring gaps: states that exist in the prototype but not in any frame, acceptance criteria that restate the design instead of describing a check, and two screens that show the same component behaving differently. Each of those is cheap to fix before handoff and expensive to discover during the build.

Missing or implied states: what happens on slow networks, with zero items, with 40 items.
Untestable acceptance criteria: rewrite anything that cannot be checked without taste.
Cross-screen conflicts: the same component specced two ways on two screens.
Unmapped values: anything that bypasses the token system needs a named decision.
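Two of these checks, missing states and cross-screen conflicts, have a mechanical core that can run before the reviewer agent reads the packet. The sketch below assumes specs have already been parsed into nested dictionaries (screen, then component, then state, then description); that structure and both function names are assumptions for illustration.

```python
REQUIRED_STATES = ["default", "hover", "focus", "active",
                   "disabled", "loading", "error", "empty"]


def missing_states(element_states: dict[str, dict[str, str]]) -> dict[str, list[str]]:
    """For each element, list the required states the spec does not describe."""
    gaps: dict[str, list[str]] = {}
    for element, states in element_states.items():
        absent = [s for s in REQUIRED_STATES if s not in states]
        if absent:
            gaps[element] = absent
    return gaps


def cross_screen_conflicts(specs: dict[str, dict[str, dict[str, str]]]) -> list[str]:
    """Report components whose same state is described differently on two screens."""
    seen: dict[tuple[str, str], tuple[str, str]] = {}  # (component, state) -> (screen, text)
    conflicts: list[str] = []
    for screen, components in specs.items():
        for component, states in components.items():
            for state, text in states.items():
                key = (component, state)
                if key in seen and seen[key][1] != text:
                    conflicts.append(f"{component} [{state}]: {seen[key][0]} vs {screen}")
                seen.setdefault(key, (screen, text))
    return conflicts
```

Exact string comparison will flag harmless rewording as a conflict, so the output is a list of places to look, not a verdict. Judging whether two descriptions really disagree is still the reviewer's job.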

Diagram: Developer-perspective review pass

1. Read packet as builder
2. Walk every interactive element
3. Flag missing states
4. Flag untestable criteria
5. Flag cross-screen conflicts
6. Designer resolves ambiguities

The reviewer reads the packet as a builder with no designer access; everything it would guess becomes the designer's homework.

Section 09

Case study: handing a six-screen onboarding flow to an external team

A SaaS company redesigned onboarding into six screens — welcome, account, workspace setup, invite teammates, plan selection, and a finish state — and contracted an external team in another timezone to build it. Previous external work had averaged two clarification calls per screen, which at a nine-hour offset meant roughly a week of latency per screen.

The fan-out specced all six screens in about 70 minutes; the consolidation and developer review added another 20. The packet ran to roughly 40 pages of markdown, and the developer-perspective pass surfaced 17 ambiguities, including the entire behavior of the invite step when the workspace already had members and what plan the flow should preselect for users arriving from the annual-pricing campaign. The designer answered all 17 in a two-hour working session before anything was sent.

The external team built the flow in three weeks with four clarification questions total, all of them about the company's API rather than the design. The acceptance criteria from the packet were copied directly into the QA checklist, and the post-build design review found two deviations instead of the usual dozens.

Section 10

Case study: retro-speccing a legacy feature with no documentation

A fintech team needed to rebuild a seven-year-old account statements feature on a new stack. There was no design file and no documentation; the spec, in effect, was the production behavior. The same workflow ran in reverse: instead of design frames, each screen agent received screenshots of the live feature in its main states plus the relevant legacy templates, and was asked to spec what the feature currently does, flagging anything that looked like a bug rather than intent.

The agents documented 4 screens and 23 distinct states, including several nobody on the current team knew existed, such as a special header for accounts migrated from an acquired bank. The looks-like-a-bug list had nine entries; the team confirmed six were bugs they chose not to reproduce and three were intentional behavior with regulatory reasons behind them.

The packet became the contract for the rebuild: the new implementation had to match the spec, not the old code. The product owner estimated the discovery work would have taken the team three to four weeks of archaeology; the workflow plus two review meetings took four days.

Section 11

Case study: agency-to-client engineering handover

A design agency was wrapping a five-month engagement: a redesigned booking flow, approved by the client, to be built by the client's in-house engineers after the contract ended. The agency's incentive was sharp: every ambiguity left in the handover would become either unpaid support email or a misbuilt screen with the agency's name on it.

The team ran the workflow over the eight booking screens, then spent half a day in the open-questions session, which doubled as the final design decisions meeting of the engagement: 21 questions, most of them about error and payment-failure states the prototype had glossed over. The finished packet, including the token mapping against the client's existing design tokens, shipped as part of the final delivery alongside the design files.

Three months later the client's engineering lead reported the flow was built to spec with no agency involvement beyond two emails, and the agency now budgets the workflow into every engagement's final week. The packet has also changed how the agency scopes work: edge cases get designed earlier because everyone knows they will be specced eventually.

Section 12

Good vs bad spec content

The packet is judged by the developer who builds from it, so spot-check it the way they will read it: pick one interactive element and try to build it in your head. If you have to guess, the spec is not done. The contrast below is the difference between a spec and a caption.

Table: Spec quality comparison

Bad: The continue button is blue and sits at the bottom.
Good: Continue uses Button/primary, color.action.primary, full-width below 768 px, fixed to the safe-area bottom with space.4 padding; disabled until both fields validate.

Bad: Show an error if the invite fails.
Good: On invite failure keep entered emails, show feedback.error banner with retry, announce the message via role=alert, and leave focus on the failed field.

Bad: Should look right on mobile.
Good: At 390 px the plan cards stack vertically in the order Pro, Starter, Enterprise, and the comparison table collapses into per-card detail disclosures.

Bad: Acceptance: matches the design.
Good: Acceptance: with zero teammates entered, Skip for now advances to plan selection and the workspace shows the single-member empty state.

A buildable spec names tokens, states, and testable criteria; a weak one restates the picture.
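Some of the weak patterns in that comparison are regular enough to catch with a crude lint before a human spot-check. The sketch below is a heuristic only. The phrase list, the color-word list, and the assumption that tokens look like dotted lowercase names (color.action.primary, space.4) are all choices made for this example, and it will miss plenty of vague lines.

```python
import re

VAGUE_PHRASES = ["matches the design", "look right", "looks good", "as expected"]
COLOR_WORDS = re.compile(r"\b(blue|red|green|grey|gray)\b")
# Dotted lowercase names such as color.action.primary or space.4.
TOKEN_PATTERN = re.compile(r"\b[a-z]+(?:\.[a-z0-9]+)+\b")


def review_line(line: str) -> list[str]:
    """Return reasons a spec line reads like a caption instead of a spec."""
    notes = []
    lowered = line.lower()
    for phrase in VAGUE_PHRASES:
        if phrase in lowered:
            notes.append(f"vague phrase: '{phrase}'")
    # A plain color word with no token anywhere on the line is a likely gap.
    if COLOR_WORDS.search(lowered) and not TOKEN_PATTERN.search(line):
        notes.append("names a color without a token")
    return notes
```

Run against the examples above, the "Bad" lines about the blue button, mobile, and "matches the design" each produce a note, while the "Good" continue-button line produces none. The invite-error "Bad" line passes the lint, which is a reminder that the build-it-in-your-head check still has to happen.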

Section 13

What the workflow cannot decide

The agents extract and structure intent; they do not own it. Every open question in the packet is a design decision, and shipping the packet with questions unanswered just hands the guessing back to the developer. Budget the answer session (usually one to two hours per feature) as part of the workflow's time, not as optional follow-up.

The packet also cannot guarantee the build will follow it, and it does not replace a relationship with the people building. Pair it with the "design QA on every PR" workflow so the spec's acceptance criteria are checked against the implementation, and keep one walkthrough call in the plan: the packet makes that call short, it does not make it unnecessary.

It cannot make design decisions; ambiguities go back to the designer with an owner and a date.
It cannot verify the eventual build; pair it with a PR-level design QA gate.
It cannot capture taste and intent that exists only in the designer's head unless the designer writes it into the brief.
It cannot fix an unapproved or unstable design; spec after approval, not instead of it.

Section 14

The reusable handoff workflow

Save the working run to .claude/workflows/ as /handoff-spec, keep the screen-spec template and the agents in version control, and reuse the same packet structure on every feature so developers learn to navigate it once.

Design handoff and spec generation — workflow steps

1. Assemble inputs per screen: approved design, notes, tokens, component inventory, feature brief.
2. Fan out one screen-spec agent per screen using the shared spec template.
3. Consolidate the screen specs into one packet with a shared-components table.
4. Collect every ambiguity into open-questions.md instead of letting agents invent answers.
5. Run the developer-perspective review and add anything it would have to guess.
6. Hold the answer session: the designer resolves open questions, decisions go into the packet.
7. Deliver the packet with the design files; walk the building team through it once.
8. Check the build against the packet's acceptance criteria, ideally via the PR design QA gate.

