Agentic Design School

Design Critique Loops With Agents

A practical workflow for turning agent critique from vague opinion into structured findings, evidence, severity, revision passes, and human design decisions.

Last reviewed: 2026-06-01

Section 1

Critique is not a vibe check

Agents are useful critics when the critique is constrained. If you ask for general feedback, you usually get general taste language: clean, modern, inconsistent, confusing, or needs polish.

A useful critique names the user job, checks the artifact against a brief, points to evidence, assigns severity, and proposes the smallest next revision. The agent should not behave like a judge handing down taste. It should behave like a reviewer helping the designer make the next decision.

The designer still owns judgment. The agent owns inspection, comparison, consistency checking, and summarizing evidence. That division is what keeps critique practical instead of performative.

Section 2

The critique loop

A critique loop starts before the artifact exists. The brief defines what success means. The artifact gives the agent something concrete to inspect. The evidence packet gives the agent screenshots, states, copy, and implementation details. The critique produces findings. The revision pass fixes only the approved findings. Human approval closes the cycle and feeds the next one.

This is slower than asking for instant feedback, but faster than arguing with vague comments. The loop produces a trail of design reasoning: what the design was trying to do, what evidence was inspected, what failed, what changed, and what still requires human judgment.

Diagram: Critique loop for agentic design

Step 1: Brief
Step 2: Artifact
Step 3: Evidence packet
Step 4: Critique findings
Step 5: Revision pass
Step 6: Human approval (feeds next cycle)

The loop works because the agent critiques against evidence and constraints, not against a generic idea of good design.
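
The six steps can be sketched as an ordered set of stages. This is a minimal illustration, not a prescribed implementation: the names `Stage` and `next_stage` are hypothetical, and wrapping from human approval back to the brief is an assumption based on the "feeds next cycle" label in the diagram.

```python
from enum import Enum


class Stage(Enum):
    """Stages of the critique loop, in the order shown in the diagram."""
    BRIEF = 1
    ARTIFACT = 2
    EVIDENCE_PACKET = 3
    CRITIQUE_FINDINGS = 4
    REVISION_PASS = 5
    HUMAN_APPROVAL = 6


def next_stage(current: Stage) -> Stage:
    """Return the stage after `current`.

    Assumption: human approval wraps around to a new brief,
    which starts the next cycle.
    """
    stages = list(Stage)
    return stages[(stages.index(current) + 1) % len(stages)]
```

Tracking the current stage explicitly makes it harder to skip the evidence packet or to start revising before findings exist.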

Section 3

Case study: checkout step review

Imagine a checkout flow for a small software product. The team has a three-step purchase path: plan selection, account details, and payment. The current prototype looks good, but trial users hesitate on the payment step.

A weak critique asks the agent whether the page is clear. A strong critique asks the agent to inspect the checkout against a concrete user job: a buyer should understand what they are paying for, what happens after payment, and how to recover from errors.

The agent receives screenshots for desktop and mobile, the route files, the design system, and the error-state copy. It is asked to return findings only, not edit the interface. That constraint matters because critique and implementation use different kinds of judgment.

Screenshot: Checkout critique evidence board
Review board: reference · implementation · state

  • Desktop payment step
  • Mobile payment step
  • Error state
  • Loading state
  • Confirmation state
  • Plan summary

A critique board gives the agent screenshots and states so it can point to specific evidence.
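
Before handing the packet to the agent, it can help to confirm that every piece of evidence actually exists. The sketch below assumes the evidence lives as files under one project folder. The function name `missing_evidence` and the file names are illustrative, not part of any tool.

```python
from pathlib import Path

# Illustrative file names for the checkout example; adjust to your project.
REQUIRED_EVIDENCE = [
    "screenshots/checkout-desktop.png",
    "screenshots/checkout-mobile.png",
    "screenshots/checkout-error.png",
]


def missing_evidence(root: Path, required: list[str]) -> list[str]:
    """Return the required evidence paths that do not exist under `root`."""
    return [rel for rel in required if not (root / rel).exists()]
```

If the returned list is not empty, capture the missing states first. A critique that cannot point to a screenshot tends to drift back toward general taste language.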

Section 4

Write the critique contract

The critique contract tells the agent what kind of feedback is allowed. Without it, the agent may redesign the screen, introduce a new direction, or bury the useful findings under polite commentary.

The contract should include scope, evidence, severity levels, output format, and what the agent must not do. For team workflows, it should also identify who approves fixes. An agent can recommend that a payment summary move higher on mobile; a human decides whether that matches product, legal, and revenue constraints.

Critique contract prompt
You are reviewing a checkout flow as a design QA partner.

Do not redesign the page.
Do not write production code.
Do not invent new brand rules.

User job:
- A buyer should understand the plan, price, payment step, and recovery path.

Inputs:
- DESIGN.md
- AGENTS.md
- screenshots/checkout-desktop.png
- screenshots/checkout-mobile.png
- screenshots/checkout-error.png
- src/app/checkout/

Return findings in this format:
- severity: blocker, important, polish, question
- evidence: screenshot, file, or visible UI detail
- issue: what is wrong
- user impact: why it matters
- recommended fix: smallest useful change
- owner: design, engineering, product, or legal review
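
If the agent returns findings as structured data, the output format above can be checked mechanically before a human reads it. The following is a minimal sketch under that assumption. The `Finding` class and `validate` function are hypothetical names, and the owner field is only checked for presence because a finding may name more than one owner.

```python
from dataclasses import dataclass

# Severity levels allowed by the critique contract.
SEVERITIES = ("blocker", "important", "polish", "question")


@dataclass(frozen=True)
class Finding:
    severity: str
    evidence: str
    issue: str
    user_impact: str
    recommended_fix: str
    owner: str


def validate(finding: Finding) -> list[str]:
    """Return contract violations; an empty list means the finding is well-formed."""
    problems = []
    if finding.severity not in SEVERITIES:
        problems.append(f"unknown severity: {finding.severity!r}")
    # Every text field must contain something other than whitespace.
    for name in ("evidence", "issue", "user_impact", "recommended_fix", "owner"):
        if not getattr(finding, name).strip():
            problems.append(f"missing {name}")
    return problems
```

A finding that fails validation goes back to the agent, not to the designer. That keeps the human review focused on judgment instead of formatting.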

Section 5

Choose critique dimensions deliberately

Different artifacts need different critique dimensions. A checkout flow needs trust, recovery, price clarity, accessibility, and error handling. A dashboard needs density, scan path, filters, table behavior, and responsive task order. A marketing page needs promise clarity, proof, conversion path, and credibility.

Do not reuse the same generic critique checklist for every design. Ask the agent to critique the dimensions that matter for the user job.

  • Task clarity: can the user tell what to do next?
  • Information hierarchy: does the screen reveal the right information in the right order?
  • Interaction states: do loading, empty, error, disabled, focus, and confirmation states work?
  • Trust and risk: are pricing, permissions, destructive actions, or commitments clear?
  • System consistency: does the design follow tokens, components, density, and copy patterns?
  • Accessibility: can keyboard, screen-reader, low-vision, and mobile users complete the job?

Section 6

Severity levels keep critique useful

A critique without severity becomes a pile of suggestions. The designer has to sort it manually. Severity lets the agent separate blocking user problems from small polish opportunities.

Use few levels. More categories usually make the critique harder to act on. The point is not to create a perfect taxonomy. The point is to decide what should be fixed first, what needs a human decision, and what can wait.

Table: Critique severity matrix

1. Blocker: The user cannot complete the job or may make a serious mistake.
2. Important: The user can continue, but comprehension, trust, or speed is harmed.
3. Polish: The issue affects consistency or quality but does not block the task.
4. Question: The agent found ambiguity and needs a human decision before recommending a fix.

Severity gives the designer a practical triage path instead of an undifferentiated feedback list.
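
The triage path can be sketched as a sort over the four levels. This assumes findings arrive as dictionaries with a `severity` key. The names `triage` and `needs_human_decision` are illustrative.

```python
# Matrix order: earlier entries are handled first.
SEVERITY_ORDER = ["blocker", "important", "polish", "question"]


def triage(findings: list[dict]) -> list[dict]:
    """Sort findings by severity; findings of equal severity keep their original order."""
    rank = {name: position for position, name in enumerate(SEVERITY_ORDER)}
    return sorted(findings, key=lambda finding: rank[finding["severity"]])


def needs_human_decision(findings: list[dict]) -> list[dict]:
    """Return the findings the agent flagged as open questions."""
    return [finding for finding in findings if finding["severity"] == "question"]
```

Sorting only orders the list. Deciding which findings to approve still belongs to the designer.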

Section 7

What a good finding looks like

A good finding is specific enough to act on and restrained enough to preserve design ownership. It names the evidence, user impact, and smallest fix. It does not turn one observation into a full redesign.

For the checkout case, a useful finding might say that the mobile payment step hides the plan summary below the form, so the user must enter card details without seeing the final price. The recommended fix is to keep a compact sticky plan summary above the payment form or place price confirmation immediately before the payment fields.

Sample critique finding
Important - Payment context is hidden on mobile

Evidence:
At 390px width, the payment form appears before the plan summary. The desktop version keeps price and plan details visible beside the form.

User impact:
A buyer may enter payment details without confirming the plan, price, renewal timing, or included seats.

Recommended fix:
Move a compact plan summary above the payment form on mobile. Include plan name, price, renewal period, and edit link.

Owner:
Design approval for hierarchy; engineering implementation after approval.

Section 8

Good vs bad critique requests

The difference between useful and useless critique is usually visible in the prompt. The weak version asks for taste. The strong version asks for inspection against a job.

Table: Good vs bad critique requests

1. Bad: make this better
   Good: inspect whether the buyer can understand plan, price, and recovery path
2. Bad: review the design
   Good: compare screenshots against DESIGN.md and checkout user job
3. Bad: tell me your thoughts
   Good: return blocker, important, polish, and question findings
4. Bad: fix everything
   Good: propose the smallest next revision for each approved finding

A good critique prompt gives the agent a standard, evidence, and a reporting format.

Section 9

Separate critique from revision

Agents are eager to fix. That is useful after approval and dangerous before it. If the agent critiques and revises in the same pass, the designer loses the chance to decide which findings matter.

Keep the first pass read-only. Review the findings. Reject subjective or low-value comments. Approve the blockers and important issues. Then ask for a revision plan with a narrow scope.

Revision pass prompt
Use only the approved critique findings below.

Approved findings:
- [finding 1]
- [finding 2]

Rules:
- Do not address rejected findings.
- Do not introduce a new visual direction.
- Keep changes scoped to the affected files.
- Preserve the original design intent.
- After changes, list exactly what changed and what still needs review.

Then capture screenshots for the same viewports used in critique.
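
One way to keep rejected findings out of the revision pass is to build the prompt from the approved list only. The sketch below assumes each finding has an `id` and a `title`. Both field names and the function `build_revision_prompt` are hypothetical.

```python
RULES = [
    "Do not address rejected findings.",
    "Do not introduce a new visual direction.",
    "Keep changes scoped to the affected files.",
    "Preserve the original design intent.",
    "After changes, list exactly what changed and what still needs review.",
]


def build_revision_prompt(findings: list[dict], approved_ids: set[str]) -> str:
    """Assemble the revision prompt from approved findings only."""
    approved = [f for f in findings if f["id"] in approved_ids]
    if not approved:
        # Nothing was approved, so there is no revision to request.
        raise ValueError("no approved findings; nothing to revise")
    lines = ["Use only the approved critique findings below.", "", "Approved findings:"]
    lines += [f"- {f['title']}" for f in approved]
    lines += ["", "Rules:"]
    lines += [f"- {rule}" for rule in RULES]
    lines += ["", "Then capture screenshots for the same viewports used in critique."]
    return "\n".join(lines)
```

Because rejected findings never reach the prompt, the agent has nothing to "helpfully" fix beyond the approved scope.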

Section 10

Use multiple critique roles carefully

For substantial work, one agent can inspect accessibility, another can inspect visual hierarchy, and another can inspect implementation consistency. This can improve coverage, but only if the roles are independent and the output is merged by a human or a lead agent.

Do not let multiple agents rewrite the same UI at once. Critique can be parallel. Revision needs ownership.

  • Accessibility reviewer: labels, focus order, contrast, keyboard path, landmarks.
  • Visual reviewer: hierarchy, density, spacing, responsive task order, visual rhythm.
  • System reviewer: component reuse, token usage, file ownership, duplicated primitives.
  • Lead reviewer: merges findings, removes duplicates, assigns severity, asks human questions.
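
The lead reviewer's merge step can be sketched as follows. This is a deliberately naive illustration: it treats two findings as duplicates only when their evidence and issue text match after trimming and lowercasing, which real reports rarely do, so a human or lead agent would still need to catch paraphrased duplicates. The function name `merge_findings` and the dictionary keys are assumptions.

```python
SEVERITY_RANK = {"blocker": 1, "important": 2, "polish": 3, "question": 4}


def merge_findings(reports: dict[str, list[dict]]) -> list[dict]:
    """Merge per-role reports, drop exact duplicates, and keep the most severe rating."""
    merged: dict[tuple[str, str], dict] = {}
    for role, findings in reports.items():
        for finding in findings:
            key = (finding["evidence"].strip().lower(), finding["issue"].strip().lower())
            entry = merged.get(key)
            if entry is None:
                # First report of this issue: copy it and record who raised it.
                merged[key] = {**finding, "reported_by": [role]}
            else:
                entry["reported_by"].append(role)
                # A lower rank number means a more severe finding.
                if SEVERITY_RANK[finding["severity"]] < SEVERITY_RANK[entry["severity"]]:
                    entry["severity"] = finding["severity"]
    return sorted(merged.values(), key=lambda f: SEVERITY_RANK[f["severity"]])
```

Recording which roles reported each finding preserves a useful signal: an issue raised independently by two reviewers usually deserves attention first.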

Section 11

Reusable critique workflow

Use this workflow whenever the agent has produced a screen, article, flow, prototype, or implementation that needs design review.

The key is to separate critique from fixing. First collect findings. Then choose which findings matter. Then revise. Then recapture evidence.

Design critique workflow prompt
Run a design critique loop.

Artifact:
[link, file path, screenshots, or route]

User job:
[what the user needs to accomplish]

Design context:
- DESIGN.md
- relevant screenshots
- relevant route or component files
- known constraints or product rules

Review dimensions:
1. user job clarity
2. hierarchy and scan path
3. interaction states
4. copy and labels
5. design-system consistency
6. accessibility risks
7. responsive task order

Output:
- top 5 findings only
- severity for each finding
- evidence for each finding
- user impact
- smallest recommended revision
- questions requiring human judgment

Do not edit files during critique.

Further reading

For deeper reading, see The Agentic Designer and Claude Code for Designers.

Curriculum
The Agentic Designer
How AI agents are transforming product design.

The operating model for product designers, design leads, and builders who need to understand what changes when agents join design work.

Curriculum
Claude Code for Designers
A designer's guide to AI-assisted workflows.

A practical guide for designers who want to work directly with coding agents without turning it into a programming manual.