Design Tokens Are Agent Instructions

Section 01

Agents need decisions, not vibes

When an agent produces a generic interface, the prompt is often blamed first. Sometimes the prompt is weak. More often, the project has not exposed its design decisions in a form the agent can use.

A designer might say the product should feel calm, dense, premium, operational, or editorial. Those words help with direction, but they do not tell an agent which foreground color to use on a warning badge, how tight a table row should be, what radius belongs on a card, or when a shadow is forbidden.

Design tokens turn those decisions into inspectable project material. They do not replace taste. They make taste easier to apply, test, and repeat.

Section 02

What tokens give the agent

A token is a named design decision. It can describe a color, spacing step, type scale, radius, shadow, breakpoint, motion duration, or component-specific value. The value is only part of what makes it useful. The name, type, description, and relationship to other tokens tell the agent when that value applies.

The Design Tokens Community Group format exists so design decisions can move between tools with a shared vocabulary. Style Dictionary can transform token input into platform outputs. Tokens Studio helps teams manage token sets and themes. Tailwind CSS v4 exposes theme variables that can make tokens available as CSS variables and utility classes.

For agentic design, the practical point is simple: tokens should be close enough to the codebase that an agent can inspect them before it creates UI, and structured enough that the agent can distinguish primitive values from semantic decisions.

Primitive tokens answer: what raw values exist?
Semantic tokens answer: what should this value mean in the product?
Component tokens answer: what should this component use in a particular state?
Theme tokens answer: how does the same decision change by brand, mode, or context?
Generated tokens answer: what does the application actually consume at runtime?
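One way to picture the theme layer is as a sparse override applied on top of a base token tree. The sketch below assumes tokens are plain nested objects with DTCG-style `$value` leaves. `applyTheme`, `isTokenLeaf`, and the sample values are hypothetical names for illustration, not part of any tool mentioned here.

```javascript
// Hypothetical helper: overlay a sparse theme file on a base token tree.
// Assumes DTCG-style leaves (objects that carry a "$value" key).
function isTokenLeaf(node) {
  return node !== null && typeof node === "object" && "$value" in node
}

function applyTheme(base, theme) {
  const out = { ...base }
  for (const [key, override] of Object.entries(theme)) {
    const current = base[key]
    const bothGroups =
      current && typeof current === "object" && !isTokenLeaf(current) &&
      override && typeof override === "object" && !isTokenLeaf(override)
    // Groups merge recursively; leaves are replaced whole.
    out[key] = bothGroups ? applyTheme(current, override) : override
  }
  return out
}

// Sample data (illustrative): the dark theme overrides only the panel surface.
const base = {
  color: {
    surface: { panel: { $type: "color", $value: "#ffffff" } },
    status: { escalated: { foreground: { $type: "color", $value: "#dc2626" } } },
  },
}
const dark = {
  color: { surface: { panel: { $type: "color", $value: "#111827" } } },
}

const themed = applyTheme(base, dark)
console.log(themed.color.surface.panel.$value)
console.log(themed.color.status.escalated.foreground.$value)
```

The theme file stays small because it names only the decisions that change. Everything it omits falls through from the base tree.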

Projects to inspect

Design Tokens Community Group: the community specification for a shared design-token format and vocabulary.
Style Dictionary: official documentation for design tokens as platform-agnostic design decisions.
Tokens Studio: documentation on token anatomy, token sets, and how token decisions move through design tools.
Tailwind theme variables: Tailwind documentation for theme variables as the token layer that drives generated utilities.

Diagram: Token pipeline for agentic design

Stage 1: Source tokens
Stage 2: Generated CSS variables
Stage 3: Components and screens
Stage 4: Agent visual QA

Review gates: token names match usage; runtime values match source; screenshots prove the result.

The agent should read source tokens and design notes before generating UI, then verify the generated interface against runtime tokens and screenshots.

Section 03

Case study: a dashboard keeps drifting

Imagine a support-operations dashboard. The design system says the interface should be compact and work-focused. The product has a queue table, filters, status chips, risk indicators, and a detail panel. Every agent run produces something plausible, but the details drift.

One version uses a soft blue for every important state. Another adds large rounded cards around each section. Another makes warning states orange but uses the same color for overdue, blocked, and escalated tickets. None of these outputs is obviously broken, but each one weakens the product.

The fix is not to write a longer paragraph asking for better taste. The fix is to expose the actual decisions: status colors, table density, radius scale, type scale, surface hierarchy, and allowed component states.

Screenshot: Token drift review board (reference · implementation · state)

Warning color reused as decoration
Table row height too loose
Status chip ignores semantic token
Card radius larger than system rule
Mobile order hides active queue
Dark mode contrast not verified

A token-backed review compares the generated interface to the intended hierarchy, states, density, and runtime values.

Section 04

Project file structure for agent-readable tokens

Token files should not be scattered across chat history, Figma comments, and untracked screenshots. Put the source of truth in the project, put generated output where the application consumes it, and put agent instructions beside the files the agent needs to inspect.

The exact folders can change by stack, but the ownership should stay clear. Source tokens are edited deliberately. Generated files are built. Examples teach taste. Review scripts prove that the implementation did not drift.

Token-aware project structure

my-product/
├── DESIGN.md
├── AGENTS.md
├── CLAUDE.md
├── tokens/
│   ├── README.md
│   ├── primitives.tokens.json
│   ├── semantic.tokens.json
│   ├── component.tokens.json
│   └── themes/
│       ├── light.tokens.json
│       └── dark.tokens.json
├── src/
│   ├── styles/
│   │   ├── globals.css
│   │   └── tokens.css
│   └── components/
│       ├── ui/
│       └── dashboard/
├── examples/
│   ├── screenshots/
│   │   ├── dashboard-good.png
│   │   ├── dashboard-drift.png
│   │   └── mobile-queue-order.png
│   └── token-reviews/
│       ├── good-status-chip.md
│       └── bad-status-chip.md
├── agent-workflows/
│   └── token-audit/
│       ├── brief.md
│       ├── checklist.md
│       └── report-template.md
└── scripts/
    ├── build-tokens.js
    ├── verify-token-usage.js
    └── capture-dashboard-screenshots.js

Section 05

Primitive, semantic, and component tokens

Primitive tokens are raw ingredients. They are useful, but they are not enough for agents. If an agent only sees blue-500, gray-900, and spacing-4, it still has to guess where each value belongs.

Semantic tokens carry product meaning. Component tokens carry local implementation rules. A status chip should not ask the agent to choose from the palette. It should point to a status token with a name that explains the state.

DTCG-style token example

{
  "color": {
    "primitive": {
      "red": {
        "600": {
          "$type": "color",
          "$value": "#dc2626",
          "$description": "Base red used only through semantic aliases."
        }
      }
    },
    "status": {
      "escalated": {
        "foreground": {
          "$type": "color",
          "$value": "{color.primitive.red.600}",
          "$description": "Text and icon color for escalated work that requires immediate attention."
        }
      }
    }
  },
  "component": {
    "statusChip": {
      "radius": {
        "$type": "dimension",
        "$value": "{radius.sm}",
        "$description": "Status chips stay compact and should not use pill styling in dense tables."
      }
    }
  }
}
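The curly-brace values in the example are aliases: a semantic token points at another token instead of repeating a raw value. A minimal sketch of how a build step or an agent might follow those references is below. `getNode` and `resolveToken` are illustrative names, not a library API, and the sample tree is a trimmed copy of the example. The example references `radius.sm` without defining it in the excerpt, so the sketch reports that reference as unknown.

```javascript
// Sketch: resolve DTCG-style alias references such as "{color.primitive.red.600}".
// getNode and resolveToken are illustrative helpers, not a library API.
function getNode(tree, path) {
  return path.split(".").reduce((node, key) => (node == null ? undefined : node[key]), tree)
}

function resolveToken(tree, path, seen = []) {
  if (seen.includes(path)) {
    throw new Error(`Circular alias: ${[...seen, path].join(" -> ")}`)
  }
  const node = getNode(tree, path)
  if (!node || typeof node !== "object" || !("$value" in node)) {
    throw new Error(`Unknown token: ${path}`)
  }
  const alias = typeof node.$value === "string" && node.$value.match(/^\{([^}]+)\}$/)
  // An alias points at another token; follow it until a raw value is reached.
  return alias ? resolveToken(tree, alias[1], [...seen, path]) : node.$value
}

// Trimmed copy of the example tree; radius.sm is intentionally absent.
const tokens = {
  color: {
    primitive: { red: { 600: { $type: "color", $value: "#dc2626" } } },
    status: {
      escalated: {
        foreground: { $type: "color", $value: "{color.primitive.red.600}" },
      },
    },
  },
  component: {
    statusChip: { radius: { $type: "dimension", $value: "{radius.sm}" } },
  },
}

console.log(resolveToken(tokens, "color.status.escalated.foreground"))

try {
  resolveToken(tokens, "component.statusChip.radius")
} catch (error) {
  // Surfaces the missing definition instead of silently emitting a broken value.
  console.log(error.message)
}
```

Failing loudly on a dangling alias is a deliberate choice here: an agent that reads the error can propose the missing token rather than inventing a value.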

Section 06

Make DESIGN.md explain the token rules

Token files give agents values. DESIGN.md gives agents judgment. It should explain which tokens are preferred, which are dangerous, and what mistakes keep happening.

Do not paste the whole token file into DESIGN.md. Reference the source files and write the rules an agent needs when it is deciding between valid options.

DESIGN.md token guidance

## Token Use

Read tokens/semantic.tokens.json before creating UI.

Use semantic tokens for product meaning:
- color.status.escalated.foreground for escalated work
- color.status.blocked.foreground for blocked work
- color.surface.panel for dashboard panels
- spacing.table.row.compact for queue density

Do not use primitive palette values directly in components unless creating a new semantic token.

Dashboard density:
- Queue rows should use the compact row rhythm.
- Status chips use component.statusChip.radius.
- Do not wrap the dashboard in decorative cards.
- If a new state needs a color, propose the semantic token first.
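Guidance like this only helps if the token paths it names actually exist. The sketch below shows one possible check, assuming DESIGN.md refers to tokens by dotted path and that the token files have been parsed into one nested object. `collectTokenPaths`, `findUnknownReferences`, and the `roots` list are hypothetical names chosen for this example.

```javascript
// Sketch: list token paths mentioned in DESIGN.md that no token file defines.
function collectTokenPaths(tree, prefix = []) {
  if (tree === null || typeof tree !== "object") return []
  if ("$value" in tree) return [prefix.join(".")]
  return Object.entries(tree)
    .filter(([key]) => !key.startsWith("$"))
    .flatMap(([key, child]) => collectTokenPaths(child, [...prefix, key]))
}

function findUnknownReferences(designText, tree, roots) {
  const known = new Set(collectTokenPaths(tree))
  // Match dotted identifiers such as color.status.escalated.foreground.
  const mentioned = designText.match(/\b[a-zA-Z]\w*(?:\.\w+)+/g) || []
  return [...new Set(mentioned)]
    // Keep only paths that start with a token group, which skips file names.
    .filter((path) => roots.includes(path.split(".")[0]))
    .filter((path) => !known.has(path))
}

// Illustrative inputs: the blocked status is referenced but never defined.
const designText = `
Read tokens/semantic.tokens.json before creating UI.
- color.status.escalated.foreground for escalated work
- color.status.blocked.foreground for blocked work
- Status chips use component.statusChip.radius.
`
const tokenTree = {
  color: {
    status: { escalated: { foreground: { $type: "color", $value: "#dc2626" } } },
  },
  component: {
    statusChip: { radius: { $type: "dimension", $value: "0.25rem" } },
  },
}

const unknown = findUnknownReferences(designText, tokenTree, ["color", "spacing", "component"])
console.log(unknown)
```

Passing the group names explicitly, rather than reading them from the tree, means a reference to a group that does not exist at all is still reported.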

Section 07

Good vs bad token instructions

Bad token guidance gives the agent a color palette and asks it to make something consistent. Good token guidance explains meaning, ownership, and review criteria.

This distinction matters because agents are good at following explicit constraints. If the token rule is vague, the agent will still produce a polished result, but it may not belong to the product.

Table: Good vs bad token instructions

1. Bad: use our blue palette
   Good: use color.action.primary only for primary actions
2. Bad: make cards consistent
   Good: dashboard panels use radius.panel.default and no decorative shadow
3. Bad: use warning colors where needed
   Good: warning is for recoverable risk; escalated is for immediate attention
4. Bad: keep spacing tight
   Good: queue rows use spacing.table.row.compact and must show eight rows at 1440px
5. Bad: match the design system
   Good: run verify-token-usage and attach desktop and mobile screenshots

Useful token instructions tell the agent what decision the token represents, where it applies, and how to verify it.

Section 08

Prompt with tokens, then review with tokens

A token-aware prompt should not simply say "read the design system." It should name the files, state the user job, tell the agent when it can propose a new token, and require evidence after implementation.

The review should use the same token contract. If the implementation uses primitive values directly, creates one-off colors, or ignores component tokens, the agent should report that as design drift.

Token-aware implementation prompt

Use the design token contract before changing UI.

Task: tighten the support dashboard queue table.
User job: support leads need to scan risky tickets and assign work quickly.

Read first:
- DESIGN.md
- tokens/README.md
- tokens/semantic.tokens.json
- tokens/component.tokens.json
- examples/token-reviews/good-status-chip.md

Rules:
- Use semantic tokens for status meaning.
- Do not use primitive palette values directly in component code.
- Do not invent new colors, shadows, or radius values.
- If a missing token is required, propose the token before using it.

Output:
1. short implementation plan
2. changed files
3. token usage notes
4. desktop and mobile screenshots
5. remaining token risks

Section 09

Generate runtime tokens the agent can inspect

The agent should be able to inspect both source tokens and runtime output. If the application uses CSS variables, generated token files should be readable and predictable. That makes it easier for the agent to check whether component code is using the system instead of one-off values.

Tailwind v4 theme variables, CSS custom properties, or generated files from a token build step can all work. The key is to keep generated output traceable to source tokens and to tell agents which files are source and which files are generated.

Runtime token output

/* src/styles/tokens.css - generated from tokens/ */
:root {
  --color-surface-panel: #ffffff;
  --color-status-escalated-foreground: #dc2626;
  --space-table-row-compact: 0.625rem;
  --radius-status-chip: 0.25rem;
}

[data-theme="dark"] {
  --color-surface-panel: #111827;
  --color-status-escalated-foreground: #f87171;
}

.status-chip[data-status="escalated"] {
  color: var(--color-status-escalated-foreground);
  border-radius: var(--radius-status-chip);
}
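To make the link between source and runtime concrete, here is a minimal sketch of a token-to-CSS step. It assumes aliases are already resolved and that each variable name is simply the kebab-cased token path. `toKebab`, `flattenTokens`, and `toCssBlock` are hypothetical helpers, not Style Dictionary or Tailwind APIs. Under that naming assumption a path like `component.statusChip.radius` becomes `--component-status-chip-radius`, so names such as `--radius-status-chip` in the output above would need an explicit rename rule in a real build.

```javascript
// Sketch of a token-to-CSS step. Assumes aliases are already resolved and that
// variable names are the kebab-cased token path; a real build may rename them.
function toKebab(segment) {
  return segment.replace(/([a-z0-9])([A-Z])/g, "$1-$2").toLowerCase()
}

function flattenTokens(tree, prefix = []) {
  if (tree === null || typeof tree !== "object") return []
  if ("$value" in tree) return [[`--${prefix.map(toKebab).join("-")}`, tree.$value]]
  return Object.entries(tree)
    .filter(([key]) => !key.startsWith("$"))
    .flatMap(([key, child]) => flattenTokens(child, [...prefix, key]))
}

function toCssBlock(selector, tree) {
  const lines = flattenTokens(tree).map(([name, value]) => `  ${name}: ${value};`)
  return `${selector} {\n${lines.join("\n")}\n}`
}

// Illustrative token trees with resolved values.
const light = {
  color: {
    surface: { panel: { $type: "color", $value: "#ffffff" } },
    status: { escalated: { foreground: { $type: "color", $value: "#dc2626" } } },
  },
  component: {
    statusChip: { radius: { $type: "dimension", $value: "0.25rem" } },
  },
}
const darkOverrides = {
  color: {
    surface: { panel: { $type: "color", $value: "#111827" } },
    status: { escalated: { foreground: { $type: "color", $value: "#f87171" } } },
  },
}

const css = [
  "/* generated from tokens/ - do not edit by hand */",
  toCssBlock(":root", light),
  toCssBlock('[data-theme="dark"]', darkOverrides),
].join("\n\n")

console.log(css)
```

Because the variable name is derived from the token path, an agent can trace any runtime variable back to its source token without a lookup table.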

Section 10

Add a token review gate

A token review gate catches the quiet failures: the interface looks fine, but the agent used raw hex values, duplicated spacing, created a new shadow, or treated all status colors as decoration.

The review should be mechanical where possible and visual where necessary. A script can find raw hex values. A screenshot review can catch hierarchy and density drift. A human designer still decides whether the final interface expresses the product correctly.

Diagram: Token review gate

1. Read token source
2. Inspect component usage
3. Check generated CSS
4. Capture screenshots
5. Report token drift
6. Approve or revise

A token review checks source files, component usage, generated CSS, screenshots, and the written change report before accepting the work.
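One mechanical piece of that gate can be sketched as a comparison between the variables a component uses and the variables the generated file defines. This is an assumption-laden example: it treats CSS as plain text, uses regular expressions rather than a CSS parser, and the helper names and the `--shadow-chip-glow` variable are invented for illustration.

```javascript
// Sketch: flag var(--name) references that the generated token file does not define.
function definedVariables(tokensCss) {
  return new Set([...tokensCss.matchAll(/(--[\w-]+)\s*:/g)].map((match) => match[1]))
}

function undefinedUsages(componentCss, tokensCss) {
  const defined = definedVariables(tokensCss)
  const used = [...componentCss.matchAll(/var\(\s*(--[\w-]+)/g)].map((match) => match[1])
  return [...new Set(used)].filter((name) => !defined.has(name))
}

// Illustrative inputs; --shadow-chip-glow is a made-up one-off variable.
const tokensCss = `
:root {
  --color-status-escalated-foreground: #dc2626;
  --radius-status-chip: 0.25rem;
}
`
const componentCss = `
.status-chip[data-status="escalated"] {
  color: var(--color-status-escalated-foreground);
  border-radius: var(--radius-status-chip);
  box-shadow: var(--shadow-chip-glow);
}
`

const drift = undefinedUsages(componentCss, tokensCss)
console.log(drift)
```

A component that defines its own local custom properties would also be flagged by this check. In a review gate that is usually acceptable, since the point is to make the agent explain the exception.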

Section 11

Case study: token drift in a billing settings page

A billing settings page looked close to the approved design, but the details kept feeling disconnected. The agent had used the right component library, yet the implementation used raw red for destructive actions, a one-off card shadow, arbitrary table padding, and a rounded pill status treatment that did not exist anywhere else in the product.

The design problem was not that the agent lacked taste. It lacked executable token instructions. Once the token contract named destructive actions, compact table density, card elevation, and status-chip radius, the same prompt produced an interface that looked less dramatic and more like the product.

Screenshot: Token drift evidence board (reference · implementation · state)

Before: raw destructive red
Before: one-off card shadow
Before: arbitrary table padding
After: semantic danger token
After: card shadow token
After: compact table row token

The useful review is not a beauty contest. It points to concrete token drift: color, radius, spacing, and shadow.

Section 12

Add a small token verification script

A script cannot decide whether a design has good taste, but it can catch the violations that should never reach review: raw colors, arbitrary spacing, duplicated shadows, and local status styles. This gives the agent immediate feedback before a human spends attention on the page.

Keep the script narrow at first. It should report likely drift, not block every legitimate utility class. The goal is to make the agent explain exceptions and fix obvious violations.

verify-token-usage.js sketch

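// Run from the project root: node scripts/verify-token-usage.js
const fs = require("node:fs")
const path = require("node:path")

// Directories to scan; missing ones are skipped instead of crashing the run.
const targets = ["src/app", "src/components"].filter((dir) => fs.existsSync(dir))

// Patterns that usually signal a value bypassing the token system.
const rawValuePatterns = [
  /#[0-9a-fA-F]{3,8}\b/, // raw hex colors
  /\brgba?\(/, // rgb() and rgba()
  /\bhsla?\(/, // hsl() and hsla()
  /\bshadow-\[/, // arbitrary-value utility classes
  /\brounded-\[/,
  /\bp-\[/, // word boundary avoids matching top-[...]
  /\bm-\[/, // word boundary avoids matching bottom-[...]
]

// Recursively list every file under a directory.
function walk(dir) {
  return fs.readdirSync(dir, { withFileTypes: true }).flatMap((entry) => {
    const full = path.join(dir, entry.name)
    return entry.isDirectory() ? walk(full) : full
  })
}

const findings = []
for (const target of targets) {
  for (const file of walk(target).filter((name) => /\.(tsx|ts|css)$/.test(name))) {
    // Check line by line so each finding points at a fixable location.
    fs.readFileSync(file, "utf8").split("\n").forEach((text, index) => {
      for (const pattern of rawValuePatterns) {
        if (pattern.test(text)) findings.push({ file, line: index + 1, pattern: String(pattern) })
      }
    })
  }
}

console.table(findings)
// A non-zero exit code lets CI or an agent loop treat findings as a failure.
process.exit(findings.length ? 1 : 0)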

Section 13

Good vs bad token reports

A token report should not say "the page uses some custom colors." It should name the semantic mistake, the affected file, the likely token, and whether the fix is mechanical or needs design review.

Table: Token report quality comparison

1. Bad: Colors are inconsistent
   Good: Billing danger action uses #dc2626 instead of --color-action-danger-foreground
2. Bad: Spacing is off
   Good: Invoice table uses p-4 rows instead of --space-table-row-compact, reducing visible invoices
3. Bad: Cards look different
   Good: Payment method card uses shadow-lg while settings cards use --shadow-card-flat

Good token findings are specific enough to fix without losing the design reason.
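A finding in the "good" style can be partly generated. The sketch below assumes a scanner has already produced a file, line, and raw value. It then looks the value up in the generated CSS to suggest a token and to say whether the fix looks mechanical. The helper names, the file path, and the line number are hypothetical.

```javascript
// Sketch: turn a raw hex found in component code into a specific finding.
function valueToVariables(tokensCss) {
  const map = new Map()
  for (const [, name, value] of tokensCss.matchAll(/(--[\w-]+)\s*:\s*([^;]+);/g)) {
    const key = value.trim().toLowerCase()
    map.set(key, [...(map.get(key) || []), name])
  }
  return map
}

function describeRawColor(finding, tokensCss) {
  const { file, line, rawValue } = finding
  const candidates = valueToVariables(tokensCss).get(rawValue.toLowerCase()) || []
  const where = `${file}:${line} uses ${rawValue}`
  // No match: the value is new to the system, so a designer has to decide.
  if (candidates.length === 0) return `${where}; no token has this value (needs design review)`
  // One match: swapping in the variable is a mechanical fix.
  if (candidates.length === 1) return `${where} instead of ${candidates[0]} (mechanical fix)`
  // Several matches: the right token depends on meaning, not on the value.
  return `${where}; matches ${candidates.join(", ")} (needs a semantic choice)`
}

// Illustrative inputs; the file path and line number are made up.
const generatedCss = `
:root {
  --color-action-danger-foreground: #dc2626;
  --color-surface-panel: #ffffff;
}
`
const report = describeRawColor(
  { file: "src/components/billing/CancelPlanButton.tsx", line: 18, rawValue: "#DC2626" },
  generatedCss
)
console.log(report)
```

The three outcomes map onto the distinction drawn above: a single candidate is a mechanical fix, while zero or several candidates call for a design decision.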

Section 14

Accessibility and compliance limits

Tokens can support accessibility, but they do not guarantee it. A color token might be named correctly and still fail contrast in a particular component state. A spacing token might be consistent and still make a mobile task order harder to use.

Be careful with claims. Do not let the article, the prompt, or the agent report say a design is accessible unless the relevant checks were actually performed. Prefer precise language: contrast checked for these states, focus styles inspected on these components, mobile reading order reviewed on this viewport.

For public examples, avoid implying legal or accessibility certification. The workflow can flag risks and create review evidence. It is not a substitute for a qualified accessibility or legal review where one is required.
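As one example of a check that can back a precise claim, the sketch below computes the WCAG contrast ratio for a foreground and background pair. It assumes six-digit hex inputs and uses the hex values from the runtime token output earlier in the article. A passing ratio supports a statement like "contrast checked for the escalated chip on the panel surface" and nothing broader.

```javascript
// Sketch: WCAG contrast ratio between two 6-digit hex colors (e.g. "#dc2626").
function channelToLinear(value8bit) {
  const c = value8bit / 255
  return c <= 0.04045 ? c / 12.92 : ((c + 0.055) / 1.055) ** 2.4
}

function relativeLuminance(hex) {
  const [r, g, b] = [1, 3, 5].map((i) => parseInt(hex.slice(i, i + 2), 16))
  return 0.2126 * channelToLinear(r) + 0.7152 * channelToLinear(g) + 0.0722 * channelToLinear(b)
}

function contrastRatio(foreground, background) {
  const [lighter, darker] = [relativeLuminance(foreground), relativeLuminance(background)]
    .sort((a, b) => b - a)
  return (lighter + 0.05) / (darker + 0.05)
}

// Token pairs to check, one per theme. 4.5 is the WCAG AA minimum for normal text.
const pairs = [
  { state: "escalated chip, light", foreground: "#dc2626", background: "#ffffff" },
  { state: "escalated chip, dark", foreground: "#f87171", background: "#111827" },
]

for (const { state, foreground, background } of pairs) {
  const ratio = contrastRatio(foreground, background)
  console.log(`${state}: ${ratio.toFixed(2)}:1 ${ratio >= 4.5 ? "meets" : "below"} 4.5:1`)
}
```

The check covers only the pairs it is given. Hover, focus, disabled, and any state rendered on a different surface each need their own pair before a report can mention them.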

Section 15

Reusable token audit workflow

Use this workflow when an agent-made interface looks polished but slightly off. The goal is to make the agent prove that it used the design system rather than merely imitating it.

The workflow is deliberately boring: read the token contract, inspect implementation, capture screenshots, list drift, fix in passes, and leave a report. That boring loop is what turns tokens from documentation into working agent instructions.

Token audit prompt

Audit this UI for design-token drift.

Inputs:
- DESIGN.md
- tokens/
- src/styles/tokens.css
- target component or route
- desktop and mobile screenshots

Check:
1. raw colors, spacing, radius, shadows, or typography values in component code
2. semantic token misuse
3. missing component tokens
4. status, focus, hover, disabled, loading, and error states
5. density and responsive behavior against DESIGN.md

Output:
- pass/fail summary
- token drift findings
- screenshots reviewed
- proposed fixes
- tokens that should be added or renamed
- risks requiring human design review

