Agentic Design School

Section 01

Drift is usually quiet

Design-system drift rarely starts as a dramatic redesign. It starts as a one-off color, a copied component, a custom radius, a missing focus state, or a table density decision that never made it back into the system.

The problem is not only visual inconsistency. Drift changes product meaning. A warning badge that is orange in one place and red in another teaches users two different languages. A disabled button that looks like a secondary button creates uncertainty. A copied modal that omits focus management turns a design-system shortcut into an accessibility regression.

Agents can help because they can scan files, compare screenshots, search for raw values, and turn scattered inconsistencies into structured findings. But they need a strong audit frame. Without it, the output becomes vague commentary: buttons are inconsistent, spacing needs polish, colors vary. That is not an audit. An audit needs evidence, severity, ownership, and a decision path.

Section 02

Define the audit surface

Do not ask an agent to audit the whole product at once unless the product is tiny. A useful audit starts with a narrow surface: status chips, dashboard tables, settings forms, navigation, empty states, dialogs, or the onboarding flow.

The audit surface should be small enough that the agent can inspect every relevant file and state, but important enough that fixing it changes real product quality. A single component family used in many places is often a better target than an entire page category.

The best scope names the component family, the routes where it appears, the source of truth, the screenshots to compare, and the allowed output. The agent should find and classify drift. It should not silently redesign the system while it audits.

Good scope: audit status chips across dashboard, project detail, mobile queue, and settings pages.
Weak scope: audit our app design and make it consistent.
Good output: prioritized findings with evidence, affected files, and recommended owner.
Weak output: broad notes about needing cleaner spacing and more consistent colors.
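
As a minimal sketch, the scope elements named above could be written down as a typed record so that a vague scope is caught before the agent starts. The `AuditScope` shape, its field names, and the `scopeProblems` helper are assumptions made for illustration, not part of any existing tool.

```typescript
// Hypothetical shape for an audit scope; every field name here is illustrative.
interface AuditScope {
  componentFamily: string; // e.g. "status chips"
  routes: string[]; // routes or pages where the family appears
  sourceOfTruth: string[]; // files that define the intended design
  screenshots: string[]; // images the agent should compare
  allowedOutput: "findings-only" | "findings-and-fix-plan";
}

// Returns the reasons a scope is too weak to audit against (empty when it is usable).
function scopeProblems(scope: AuditScope): string[] {
  const problems: string[] = [];
  if (scope.componentFamily.trim() === "") problems.push("name one component family or surface");
  if (scope.routes.length === 0) problems.push("list the routes where it appears");
  if (scope.sourceOfTruth.length === 0) problems.push("name the source of truth");
  if (scope.screenshots.length === 0) problems.push("attach screenshots to compare");
  return problems;
}

const statusChipScope: AuditScope = {
  componentFamily: "status chips",
  routes: ["dashboard", "project detail", "mobile queue", "settings"],
  sourceOfTruth: ["DESIGN.md", "tokens/colors.css", "src/components/status-chip.tsx"],
  screenshots: ["dashboard-desktop.png", "dashboard-mobile.png", "settings.png"],
  allowedOutput: "findings-only",
};

// "Audit our app design and make it consistent" carries none of the required detail.
const weakScope: AuditScope = {
  componentFamily: "",
  routes: [],
  sourceOfTruth: [],
  screenshots: [],
  allowedOutput: "findings-only",
};
```

Setting `allowedOutput` to `"findings-only"` is one way to encode the rule that the agent classifies drift without redesigning the system.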

Diagram: Design-system audit workflow

Scope → Collect files → Collect screenshots → Inspect tokens → Compare states → Prioritize fixes → Assign owner

A useful audit moves from scope to evidence, then turns drift into prioritized findings and a fix plan.

Section 03

The four layers of drift

Design-system audits often fail because they inspect only one layer. A file search can find raw hex colors, but it cannot tell whether the UI still communicates the right priority. A screenshot review can see inconsistent cards, but it may miss that two implementations use different components under the hood.

Ask the agent to inspect drift in four layers: meaning, behavior, implementation, and presentation. Meaning drift changes what users understand. Behavior drift changes how the component works. Implementation drift changes whether the system can maintain it. Presentation drift changes polish, rhythm, and visual consistency.

Table: Drift layer matrix

1. Meaning: The same visual pattern communicates different product states
2. Behavior: Keyboard, focus, loading, disabled, or error states differ
3. Implementation: A local copy bypasses shared components or tokens
4. Presentation: Spacing, radius, icon size, label case, or density varies

The same visible inconsistency can have different causes and different owners.

Section 04

Case study: status chips drift

A product has status chips for active, blocked, escalated, paused, and completed work. Over time, different pages implement them differently. Some use uppercase labels. Some use pill radius. Some use raw colors. Some omit icons. Some skip disabled and loading states. Mobile compresses the label in one route but wraps it in another.

The agent audit should not simply say the chips are inconsistent. It should identify where the inconsistency lives, which layer it belongs to (meaning, behavior, implementation, or presentation), and what should be fixed first.

The team should also decide which differences are intentional. A compact table chip may need less padding than a detail-page chip, but the semantic state, accessible name, color token, and icon logic should still come from the system. An audit is not a demand that every instance look identical. It is a process for separating intentional variation from accidental drift.
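
One way to make that separation explicit is to check each observed difference against a written list of approved exceptions. The sketch below assumes hypothetical `ApprovedException` and `ObservedDifference` records and illustrative property names such as `"padding"` and `"color-token"`; the `/dashboard` and `/settings` routes are placeholders.

```typescript
// Hypothetical record of a variant a designer approved; all names are illustrative.
interface ApprovedException {
  route: string;
  property: string; // the property allowed to vary, e.g. "padding"
  reason: string;
}

interface ObservedDifference {
  route: string;
  property: string; // e.g. "padding", "color-token", "accessible-name"
}

// Properties that should always come from the system, so no local exception excuses them.
const SYSTEM_OWNED = new Set(["semantic-state", "accessible-name", "color-token", "icon-logic"]);

function classifyDifference(
  diff: ObservedDifference,
  exceptions: ApprovedException[],
): "intentional" | "drift" {
  if (SYSTEM_OWNED.has(diff.property)) return "drift";
  const approved = exceptions.some(
    (e) => e.route === diff.route && e.property === diff.property,
  );
  return approved ? "intentional" : "drift";
}

const approvedExceptions: ApprovedException[] = [
  { route: "/dashboard", property: "padding", reason: "compact table density" },
];

// Approved compact padding on the dashboard is intentional variation.
const compactPadding = classifyDifference(
  { route: "/dashboard", property: "padding" },
  approvedExceptions,
);
// A different color token is drift even on a route that has a padding exception.
const colorMismatch = classifyDifference(
  { route: "/dashboard", property: "color-token" },
  approvedExceptions,
);
```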

Screenshot: Component drift evidence board (review board: reference · implementation · state)

Panels: Dashboard status chip, Settings status chip, Mobile status chip, Disabled chip, Escalated chip, Component source

A screenshot board helps the agent compare the same component across product surfaces before writing findings.

Section 05

Build an audit packet

An audit packet gives the agent everything it needs without forcing it to rediscover context. Keep the packet in the repo so future audits can compare against past findings.

The packet should include source-of-truth files, screenshots, route list, component inventory, token references, and any known exceptions. If a designer intentionally approved a variant, write that down. Otherwise the agent may try to normalize a useful difference.

Design-system audit packet

my-product/
├── DESIGN.md
├── AGENTS.md
├── tokens/
│   ├── colors.css
│   ├── spacing.css
│   └── radius.css
├── src/components/
│   ├── status-chip.tsx
│   └── ui/
├── examples/screenshots/
│   └── status-chip-audit/
│       ├── dashboard-desktop.png
│       ├── dashboard-mobile.png
│       ├── settings.png
│       ├── disabled-state.png
│       └── source-notes.md
└── agent-workflows/
    └── design-system-audit/
        ├── scope.md
        ├── component-inventory.md
        ├── findings.md
        ├── fix-plan.md
        └── follow-up.md
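
Before starting an audit, it can help to confirm the packet is complete. The sketch below takes a list of paths that exist in the repo and reports which packet files are missing. Which files count as required is an assumption here, and `missingPacketFiles` is a hypothetical helper; how the path list is gathered is left to the caller.

```typescript
// Packet entries treated as required in this sketch (an assumption, taken from the tree above).
const REQUIRED_PACKET_FILES: string[] = [
  "DESIGN.md",
  "AGENTS.md",
  "agent-workflows/design-system-audit/scope.md",
  "agent-workflows/design-system-audit/component-inventory.md",
  "agent-workflows/design-system-audit/findings.md",
];

// Given the repo-relative paths that exist, list the packet files still missing.
function missingPacketFiles(presentPaths: string[]): string[] {
  const present = new Set(presentPaths);
  return REQUIRED_PACKET_FILES.filter((path) => !present.has(path));
}

// A repo with only the two top-level documents still lacks the three workflow files.
const stillMissing = missingPacketFiles(["DESIGN.md", "AGENTS.md"]);
```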

Section 06

Search code with a design-system lens

Agents are good at repetitive code searches, but the search terms need to match the kinds of drift you expect. Ask for semantic token usage, raw values, copied component names, local utility classes, and state-specific code paths.

For Tailwind projects, raw classes are not automatically wrong. A product may use utilities as the design language. The audit should distinguish semantic token violations from normal composition. A raw `rounded-full` on a badge might be correct if the design system defines pills that way. A raw hex color inside a route is usually more suspicious.

Code inspection checklist

Search for:
- raw colors: "#", "rgb(", "hsl("
- one-off sizing: arbitrary Tailwind values like w-[, h-[, p-[
- duplicated components: StatusChip, Badge, Pill, Tag, StateLabel
- token bypasses: text-red, bg-orange, border-blue outside shared components
- missing states: disabled, loading, error, aria-label, aria-describedby
- local exceptions: comments or props that explain intentional variation
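
Part of the checklist above can be sketched as a first-pass line scanner. The patterns are illustrative assumptions, they will produce false positives, and they do not know which files are shared components, so the caller should skip those files and review every hit by hand.

```typescript
// Hypothetical patterns for a first-pass scan; tune them to the project's conventions.
const DRIFT_PATTERNS: { label: string; pattern: RegExp }[] = [
  { label: "raw hex color", pattern: /#[0-9a-fA-F]{3,8}\b/ },
  { label: "raw color function", pattern: /\b(?:rgb|hsl)a?\(/ },
  { label: "arbitrary Tailwind size", pattern: /\b(?:w|h|p)-\[/ },
  { label: "palette class bypass", pattern: /\b(?:text|bg|border)-(?:red|orange|blue)\b/ },
];

interface CodeHit {
  file: string;
  line: number; // 1-based line number
  label: string;
  text: string;
}

// Scans one file's source text and records every line that matches a pattern.
function scanSource(file: string, source: string): CodeHit[] {
  const hits: CodeHit[] = [];
  source.split("\n").forEach((text, index) => {
    for (const { label, pattern } of DRIFT_PATTERNS) {
      if (pattern.test(text)) {
        hits.push({ file, line: index + 1, label, text: text.trim() });
      }
    }
  });
  return hits;
}

// The first line mixes a raw hex color, a palette class, and an arbitrary width.
// The second line uses only `rounded-full`, which the scan leaves alone.
const sampleSource = [
  '<span className="bg-[#ff5a00] text-red-600 w-[120px]">Escalated</span>',
  '<span className="rounded-full">Active</span>',
].join("\n");

const sampleHits = scanSource("routes/queue.tsx", sampleSource);
```

The file name `routes/queue.tsx` is a placeholder. The scan reports evidence only; deciding whether a hit is a token violation or normal composition remains a judgment for the audit.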

Section 07

Compare screenshots for behavior, not just appearance

A screenshot board should compare the same component across routes and states. Desktop-only screenshots are not enough. Many design-system failures appear at the edges: mobile wrapping, disabled state contrast, empty-state copy, focus rings, density inside data tables, or icon alignment when labels change length.

Ask the agent to describe observable differences before it recommends fixes. This keeps the audit anchored in evidence and prevents the model from turning preferences into findings.

Capture at least one desktop and one mobile viewport for each high-use surface.
Capture normal, hover or focus, disabled, loading, empty, and error states where relevant.
Ask for observable differences first, then severity, then recommended fix.
Separate intentional responsive adaptation from accidental layout damage.
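
The capture rules above amount to a route by viewport by state matrix. A small sketch of generating that matrix follows; the `CaptureTask` shape and the file-naming scheme are assumptions, and actually taking the screenshots is out of scope here.

```typescript
type Viewport = "desktop" | "mobile";
type UiState = "normal" | "focus" | "disabled" | "loading" | "empty" | "error";

// Hypothetical description of one screenshot to capture.
interface CaptureTask {
  route: string;
  viewport: Viewport;
  state: UiState;
  fileName: string;
}

// Builds one task per route, viewport, and relevant state.
function captureMatrix(routes: string[], states: UiState[]): CaptureTask[] {
  const viewports: Viewport[] = ["desktop", "mobile"];
  const tasks: CaptureTask[] = [];
  for (const route of routes) {
    for (const viewport of viewports) {
      for (const state of states) {
        tasks.push({
          route,
          viewport,
          state,
          fileName: `${route}-${viewport}-${state}.png`,
        });
      }
    }
  }
  return tasks;
}

// Two routes, two viewports, and two states yield eight captures.
const chipCaptures = captureMatrix(["dashboard", "settings"], ["normal", "disabled"]);
```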

Section 08

Use severity to protect product meaning

A design-system audit should not treat all drift equally. Some drift breaks meaning. Some breaks accessibility. Some only creates polish debt. Severity keeps the team from spending a day tuning border radius while a warning state is unreadable on mobile.

Table: Audit finding severity

1. P0 meaning break: The same state means different things on different pages
2. P0 accessibility break: Contrast, labels, focus, or keyboard behavior blocks use
3. P1 behavior drift: Loading, disabled, error, or responsive state works differently
4. P1 component drift: A copied local implementation bypasses the shared component
5. P2 token drift: Raw values or one-off variables replace semantic tokens
6. P3 polish drift: Spacing, radius, icon size, or label case differs without harming use

Severity helps the team fix drift in the order that protects users and product meaning.
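
Once findings carry a severity, ordering them is mechanical. A minimal sketch, assuming a hypothetical `RankedFinding` record that stores only a title and one of the four levels from the table:

```typescript
type Severity = "P0" | "P1" | "P2" | "P3";

// Hypothetical minimal finding record for ordering purposes.
interface RankedFinding {
  title: string;
  severity: Severity;
}

const SEVERITY_RANK: Record<Severity, number> = { P0: 0, P1: 1, P2: 2, P3: 3 };

// Returns a new array with the most severe findings first; the input is not mutated.
function bySeverity(findings: RankedFinding[]): RankedFinding[] {
  return [...findings].sort(
    (a, b) => SEVERITY_RANK[a.severity] - SEVERITY_RANK[b.severity],
  );
}

const unordered: RankedFinding[] = [
  { title: "Chip radius differs in settings", severity: "P3" },
  { title: "Warning state unreadable on mobile", severity: "P0" },
  { title: "Raw hex replaces warning token", severity: "P2" },
];

const ordered = bySeverity(unordered);
```

With this ordering, the unreadable warning state is handled before anyone tunes border radius.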

Section 09

Good vs bad findings

A bad finding names a symptom without evidence. A good finding tells the team what changed, where it appears, why it matters, and what kind of fix is appropriate.

The wording matters because audits often create downstream engineering tickets. A ticket that says "make chips consistent" invites subjective rework. A ticket that says "replace the local escalated status chip on `/projects/[id]` with the shared `StatusChip` because it uses a conflicting color token" is actionable.

Table: Finding quality comparison

1. Bad: Status chips are inconsistent
2. Good: Escalated chip uses raw red on project detail while dashboard uses semantic warning token
3. Bad: Buttons need polish
4. Good: Secondary destructive action uses primary button styling in billing settings, increasing risk
5. Bad: Mobile looks off
6. Good: At 390px, status label wraps before icon, changing scan rhythm in queue table

Audit findings should be specific enough to become fix tickets without losing design context.
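
A draft finding can be checked for the four parts a good finding needs before it becomes a ticket. The `DraftFinding` shape below is a hypothetical illustration. The check only confirms that each part is present; it does not judge how well it is written.

```typescript
// Hypothetical draft-finding shape mirroring the four parts of a good finding.
interface DraftFinding {
  whatChanged: string;
  whereItAppears: string[]; // routes, files, or screenshots
  whyItMatters: string;
  fixKind: string;
}

// Lists which of the four parts a draft still lacks.
function missingParts(finding: DraftFinding): string[] {
  const missing: string[] = [];
  if (finding.whatChanged.trim() === "") missing.push("what changed");
  if (finding.whereItAppears.length === 0) missing.push("where it appears");
  if (finding.whyItMatters.trim() === "") missing.push("why it matters");
  if (finding.fixKind.trim() === "") missing.push("kind of fix");
  return missing;
}

const vagueFinding: DraftFinding = {
  whatChanged: "Status chips are inconsistent",
  whereItAppears: [],
  whyItMatters: "",
  fixKind: "",
};

const specificFinding: DraftFinding = {
  whatChanged: "Escalated chip uses raw red while dashboard uses the semantic warning token",
  whereItAppears: ["/projects/[id]"],
  whyItMatters: "The same state reads as two different severities",
  fixKind: "Replace the local chip with the shared StatusChip",
};
```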

Section 10

Audit prompt

The audit prompt should ask for findings, not fixes. Fixing comes after prioritization. The agent should report uncertainty instead of inventing a system rule.

Design-system audit prompt

Audit this product surface for design-system drift.

Scope:
[component family or product surface]

Inputs:
- DESIGN.md
- AGENTS.md
- token files
- shared component source files
- route files where the component appears
- screenshots of each usage and state

Check:
1. semantic meaning
2. token usage
3. shared component usage
4. accessibility states
5. responsive behavior
6. copy and label consistency
7. intentional exceptions

Output:
- finding
- evidence
- severity
- affected files or screenshots
- likely source of drift
- recommended fix owner
- whether the fix needs human design review

Do not rewrite code during the audit.
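
If the same prompt is reused across component families, the template can be filled programmatically. The sketch below is one possible approach; `buildAuditPrompt` is a hypothetical helper that reproduces the scope, inputs, and checks from the prompt above, and the output section is omitted for brevity.

```typescript
// Hypothetical helper that fills the audit prompt template for one surface.
function buildAuditPrompt(scope: string, inputs: string[]): string {
  const checks = [
    "semantic meaning",
    "token usage",
    "shared component usage",
    "accessibility states",
    "responsive behavior",
    "copy and label consistency",
    "intentional exceptions",
  ];
  return [
    "Audit this product surface for design-system drift.",
    "",
    "Scope:",
    scope,
    "",
    "Inputs:",
    ...inputs.map((input) => `- ${input}`),
    "",
    "Check:",
    ...checks.map((check, i) => `${i + 1}. ${check}`),
    "",
    // The closing constraint stays last so it is never dropped when the template is reused.
    "Do not rewrite code during the audit.",
  ].join("\n");
}

const statusChipPrompt = buildAuditPrompt(
  "status chips across dashboard and settings",
  ["DESIGN.md", "tokens/colors.css", "src/components/status-chip.tsx"],
);
```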

Section 11

Turn findings into a fix plan

A design-system audit is useful only if findings become a fix plan. Group findings into quick fixes, component refactors, token changes, documentation updates, and human decisions.

Do not let the agent quietly rewrite the design system during an audit. Audits find drift. Designers decide which changes become system rules. Engineers decide how the shared implementation should change. The agent can prepare the evidence and draft the plan, but the ownership decision belongs to the team.

Quick fixes: replace a raw value with its token, restore a missing label, correct a copied class.
Component refactors: consolidate duplicate implementations or add a missing variant.
Token changes: add semantic tokens only when the current system cannot express a real need.
Documentation updates: record approved variants and responsive behavior.
Human decisions: resolve conflicts where the system itself is unclear.
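
The five groups above can be sketched as buckets in a fix plan. The bucket identifiers and the `PlannedFinding` shape are assumptions for illustration. Which bucket a finding belongs to is still decided by the team, not by this code.

```typescript
type FixBucket =
  | "quick-fix"
  | "component-refactor"
  | "token-change"
  | "documentation-update"
  | "human-decision";

// Hypothetical finding that the team has already assigned to a bucket.
interface PlannedFinding {
  title: string;
  bucket: FixBucket;
}

// Groups finding titles by bucket so the plan can be reviewed one group at a time.
function groupByBucket(findings: PlannedFinding[]): Record<FixBucket, string[]> {
  const plan: Record<FixBucket, string[]> = {
    "quick-fix": [],
    "component-refactor": [],
    "token-change": [],
    "documentation-update": [],
    "human-decision": [],
  };
  for (const finding of findings) {
    plan[finding.bucket].push(finding.title);
  }
  return plan;
}

const fixPlan = groupByBucket([
  { title: "Restore missing label on mobile chip", bucket: "quick-fix" },
  { title: "Consolidate local chip copies into StatusChip", bucket: "component-refactor" },
  { title: "Replace raw red with warning token", bucket: "quick-fix" },
  { title: "Decide whether escalated is orange or red", bucket: "human-decision" },
]);
```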

Section 12

Reusable audit workflow

Use this workflow when a component family has spread across the product and the team needs evidence before refactoring. The goal is not to make everything identical. The goal is to make the system intentional again.

Design-system audit workflow

1. Pick one component family or product surface.
2. Collect source-of-truth files: DESIGN.md, tokens, shared components.
3. List every route and local component where the pattern appears.
4. Capture screenshots across desktop, mobile, and important states.
5. Ask the agent for observable drift only.
6. Classify each finding by meaning, behavior, implementation, or presentation.
7. Assign severity and owner.
8. Approve the fix plan before implementation.
9. Fix in passes: P0/P1 first, then consolidation, then polish.
10. Recapture screenshots and update the design-system notes.
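
After the final recapture, comparing the new findings against the previous audit shows what was resolved and what appeared. This sketch assumes each finding has a stable `id` that is kept between audits, which is a convention the team would need to adopt; `compareAudits` and the sample IDs are hypothetical.

```typescript
// Hypothetical finding record with an ID that stays stable across audits.
interface TrackedFinding {
  id: string;
  title: string;
}

interface AuditComparison {
  resolved: TrackedFinding[]; // in the previous audit only
  persisting: TrackedFinding[]; // in both audits
  introduced: TrackedFinding[]; // in the current audit only
}

function compareAudits(
  previous: TrackedFinding[],
  current: TrackedFinding[],
): AuditComparison {
  const previousIds = new Set(previous.map((f) => f.id));
  const currentIds = new Set(current.map((f) => f.id));
  return {
    resolved: previous.filter((f) => !currentIds.has(f.id)),
    persisting: current.filter((f) => previousIds.has(f.id)),
    introduced: current.filter((f) => !previousIds.has(f.id)),
  };
}

const followUp = compareAudits(
  [
    { id: "chip-001", title: "Escalated chip uses raw red" },
    { id: "chip-002", title: "Disabled chip lacks accessible name" },
  ],
  [
    { id: "chip-002", title: "Disabled chip lacks accessible name" },
    { id: "chip-003", title: "Label wraps before icon at 390px" },
  ],
);
```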

Design System Audits With Agents

