Section 1
Why landing page reviews drift into opinion
Most landing page reviews happen in a meeting, in front of the live page, with the loudest reasonable-sounding opinion winning. The hero gets debated for forty minutes, the form with eleven required fields gets thirty seconds, and nobody checks whether the page even fires a conversion event the team can measure against.
The structural problem is that a landing page is several different jobs stitched together — promise, proof, price, ask — and a single walkthrough reviews them all with the same fading attention. By the footer, everyone wants lunch.
This workflow gives each section its own reviewer. Section agents for the hero, social proof, pricing, the form, and the footer each work from the same screenshots and DOM capture, against checklists for message clarity, hierarchy, trust signals, friction, form quality, page speed signals, accessibility, and analytics gaps. The output is not a verdict on the page; it is a prioritized backlog of hypotheses with proposed variants, framed as things to test because that is all an audit can honestly claim.
Section 2
When an agency or marketing team should reach for it
Run it before increasing paid spend onto a page, before a redesign that someone is about to justify with taste, when a page's conversion rate has drifted and nobody can say why, and as the standard first deliverable on a new CRO or performance retainer. At 45 to 90 minutes per page it is cheap enough to run on every page that takes meaningful traffic.
It is a Foundation-level workflow on purpose: one capture script, a handful of section agents, one merge pass. If your team has never run an agent fan-out before, this is a good first one — the inputs are screenshots and a DOM file, and the output is a document a marketer can read without explanation.
- Before scaling paid traffic to a page that has never been audited.
- When a client asks for a redesign and nobody has evidence the current page is the problem.
- As a recurring quarterly check on the pages that carry the most spend.
- After a CMS migration or template change that may have broken analytics or accessibility quietly.
Section 3
The orchestration pattern: one capture, many section agents
This runs as a Claude Code dynamic workflow. Including the word workflow in the prompt — or running /effort ultracode — makes Claude write a JavaScript orchestration script that runs in the background: it ingests the captures, fans out one subagent per page section plus two cross-cutting reviewers (accessibility and analytics), waits for all of them, then runs a merge pass that deduplicates findings and assembles the backlog. The section reports stay in script variables and on disk, not in Claude's context; only the merged backlog comes back to the conversation.
The runtime supports up to 16 concurrent agents and 1,000 per run, so even auditing several pages in one go is well inside the budget, and an interrupted run resumes rather than restarting. Save the working script to .claude/workflows/ in the project (or ~/.claude/workflows/ for personal audits) and the audit becomes a slash command; the section reviewers are markdown definitions in .claude/agents/ that you tune as your checklists improve.
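As a back-of-the-envelope check on those limits, the arithmetic can be sketched as below. The 16 and 1,000 figures come from the text above; the assumption that each page uses seven reviewers plus one merge pass, and the idea of counting rounds, are illustrative only and say nothing about how the runtime actually schedules agents.

```javascript
// Limits quoted in the text; everything else here is an assumption for illustration.
const MAX_CONCURRENT = 16
const MAX_AGENTS_PER_RUN = 1000
const REVIEWERS_PER_PAGE = 7 // five section reviewers plus accessibility and analytics
const MERGE_PASSES_PER_PAGE = 1 // assumed: one merge pass per audited page

// planRun is a hypothetical helper, not part of Claude Code.
function planRun(pageCount) {
  const reviewers = pageCount * REVIEWERS_PER_PAGE
  const totalAgents = reviewers + pageCount * MERGE_PASSES_PER_PAGE
  return {
    reviewers,
    totalAgents,
    // Lower bound on sequential rounds if reviewers fill every concurrent slot.
    minReviewerRounds: Math.ceil(reviewers / MAX_CONCURRENT),
    withinRunLimit: totalAgents <= MAX_AGENTS_PER_RUN,
  }
}

console.log(planRun(1)) // 8 agents in total, 1 round of reviewers
console.log(planRun(3)) // 24 agents in total, at least 2 rounds of reviewers
```

Under these assumptions a three-page audit uses 24 of the 1,000 agents available to a run, which is why auditing several pages at once is not a budget concern.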
Fanning out per section is not just a speed trick. Each agent reads a small, relevant slice of evidence with a checklist built for that slice, which is what keeps the form review as sharp as the hero review.
Figure: workflow stages. Capture screenshots and DOM; add analytics and speed evidence; fan out section agents; accessibility and analytics passes; merge and deduplicate findings; prioritize the hypothesis backlog; human picks what to test.
One evidence capture feeds every section agent; humans prioritize the merged backlog.
Section 4
Step 1: capture the evidence once
The audit works from artifacts, not from an agent browsing the live page mid-conversation. Capture full-page screenshots at viewport widths of 390 and 1440 pixels, save the rendered DOM to a file, run an axe-core pass, and pull a Lighthouse run for the page speed signals. If the team has GA4 access, export the page's funnel basics (sessions, conversion events, and form abandonment if it is instrumented) as a CSV; if the page is not instrumented, that absence is itself a finding.
A small Playwright script does all of this in a couple of minutes and, more importantly, makes the audit repeatable: the re-audit after the first round of tests uses the same commands against the same widths.
import { chromium } from "playwright"
import { mkdir, writeFile } from "node:fs/promises"

// The page URL is the first command-line argument.
const url = process.argv[2]
if (!url) {
  console.error("Pass the page URL as the first argument")
  process.exit(1)
}

const outDir = "audit/evidence"
await mkdir(outDir, { recursive: true })

const browser = await chromium.launch()
// One full-page screenshot per width; the DOM is saved once, from the desktop render.
for (const width of [390, 1440]) {
  const page = await browser.newPage({ viewport: { width, height: 900 } })
  await page.goto(url, { waitUntil: "networkidle" })
  await page.screenshot({ path: outDir + "/page-" + width + ".png", fullPage: true })
  if (width === 1440) {
    await writeFile(outDir + "/dom.html", await page.content())
  }
  await page.close()
}
await browser.close()
console.log("Captured screenshots and DOM into " + outDir)

// Then, separately:
// npx @axe-core/cli <url> --save audit/evidence/axe.json
// npx lighthouse <url> --output=json --output-path=audit/evidence/lighthouse.json
// Export GA4 page data to audit/evidence/ga4-page.csv (or note that it is missing)
Section 5
Step 2: define the section agents
Each section agent gets the same evidence folder and a checklist scoped to its section. The hero agent checks message clarity, the primary call to action, and what is visible before scrolling at both widths. The social proof agent checks whether the proof is specific, recent, and relevant to the audience. The pricing agent checks comprehension and anxiety points. The form agent counts fields, checks labels, error handling, and what happens after submission. The footer agent checks trust, legal, and the escape hatches people actually use.
Two cross-cutting agents run alongside: an accessibility reviewer working from the DOM and the axe report, and an analytics reviewer checking that the events a test program would need actually exist.
---
name: form-section-reviewer
description: Reviews the form or signup section of a landing page from screenshots and the DOM. Produces hypotheses, not verdicts.
tools: Read, Glob, Grep
---

You review only the form / conversion section of the page. Work from audit/evidence/ (screenshots at 390 and 1440, dom.html, axe.json, ga4-page.csv if present). Do not browse the live page or assume content you cannot see.

Check:
- Number of fields, which are required, and whether each is justified by the offer.
- Labels, placeholder misuse, error states, and keyboard accessibility (cross-check axe.json).
- Friction: account creation before value, surprise fields (phone, company size), captcha placement.
- The promise above the button: does it match what actually happens after submission?
- Whether form interactions and submissions are measurable (cross-check the analytics evidence).

Output findings in the shared format: section, observation (with the evidence file it rests on), why it matters, a hypothesis phrased as "We believe that [change] will [expected effect] because [evidence]", a proposed A/B variant, and an effort estimate (S/M/L). Never promise a conversion lift; every item is a hypothesis to test.
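Part of the form checklist can be pre-computed before any agent runs. The sketch below is a hypothetical helper, not part of the workflow described here: it counts form controls and required controls in a saved DOM string using regular expressions. That keeps it dependency-free but fragile, and a real HTML parser would be the more robust choice. The function names and the sample markup are assumptions for illustration.

```javascript
// True when the tag carries a bare `required` attribute or aria-required="true".
// Quoted attribute values are blanked first so text such as placeholder="not required"
// does not count.
function isRequired(tag) {
  const withoutValues = tag.replace(/"[^"]*"|'[^']*'/g, '""')
  return /\srequired\b/i.test(withoutValues) || /\saria-required\s*=\s*["']?true\b/i.test(tag)
}

// Counts user-facing form controls in an HTML string and how many are required.
function countFormFields(html) {
  const controls = html.match(/<(input|select|textarea)\b[^>]*>/gi) ?? []
  // Hidden inputs and button-like inputs are not fields the visitor fills in.
  const visible = controls.filter(
    (tag) => !/\stype\s*=\s*["']?(hidden|submit|button|reset|image)\b/i.test(tag)
  )
  return { total: visible.length, required: visible.filter(isRequired).length }
}

// Hypothetical markup standing in for audit/evidence/dom.html.
const sampleDom = `
<form id="trial-signup">
  <input type="hidden" name="csrf" value="abc">
  <input type="email" name="email" required>
  <input type="tel" name="phone" placeholder="not required for EU" required>
  <select name="company_size" aria-required="true"><option>1-10</option></select>
  <textarea name="notes"></textarea>
  <button type="submit">Start trial</button>
</form>`

console.log(countFormFields(sampleDom)) // { total: 4, required: 3 }
```

A count like this is evidence for the form reviewer to cite, not a finding in itself; whether each required field is justified by the offer remains a judgment call.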
Section 6
Step 3: the workflow prompt and the orchestration sketch
The prompt below is what a designer or marketer pastes into Claude Code after running the capture script. The sketch underneath shows the shape of the script Claude writes in response — fan out, wait, merge — and is labeled a sketch because the real script is generated per run.
Run a landing page audit workflow for the page captured in audit/evidence/.

Inputs:
- audit/evidence/page-390.png and page-1440.png, dom.html, axe.json, lighthouse.json, ga4-page.csv (if missing, treat the missing analytics as a finding)
- audit/page-context.md (who the traffic is, what the page is supposed to get them to do, and the offer's real constraints)
- Section reviewer definitions in .claude/agents/: hero, social-proof, pricing, form, footer, accessibility, analytics

Orchestration:
1. Fan out the seven reviewer agents in parallel; each reads only the evidence folder and its own checklist, and writes its findings to audit/sections/<name>.md.
2. Run a merge pass: deduplicate overlapping findings, keep the strongest evidence for each, and assemble audit/hypothesis-backlog.md using the backlog template.
3. Prioritize by expected impact on the page's stated job, weighted by effort, but mark priority as a starting point for the human owner, not a final ranking.
4. Return only the backlog to the conversation.

Every item must be phrased as a hypothesis with a proposed variant. Do not claim guaranteed conversion improvements. Do not propose changes to the offer itself; flag offer problems as questions for the owner.
Section 7
The orchestration sketch
The merge pass matters more than it looks. Section agents overlap on purpose — the form reviewer and the accessibility reviewer will both flag unlabeled fields — and the merge keeps one finding with the strongest evidence rather than padding the backlog with echoes.
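The deduplication rule in that paragraph can be illustrated deterministically. In the workflow itself the merge is done by an agent, so the sketch below is not how the merge pass works; it only shows the rule "one finding per overlap, keeping the strongest evidence". The finding shape, the `target` key used to detect overlaps, and the use of evidence-file count as a proxy for strength are all assumptions.

```javascript
// Hypothetical finding shape: { target, observation, evidence: [file, ...], raisedBy }
// `target` is an assumed key (for example a CSS selector) that identifies what a finding is about.
function dedupeFindings(findings) {
  const byTarget = new Map()
  for (const finding of findings) {
    const current = byTarget.get(finding.target)
    if (!current) {
      byTarget.set(finding.target, { ...finding, raisedBy: [finding.raisedBy] })
      continue
    }
    // Keep whichever copy cites more evidence files, and remember every reviewer that raised it.
    const stronger = finding.evidence.length > current.evidence.length ? finding : current
    byTarget.set(finding.target, {
      ...stronger,
      raisedBy: [...current.raisedBy, finding.raisedBy],
    })
  }
  return [...byTarget.values()]
}

const merged = dedupeFindings([
  {
    target: "form#trial-signup input[name=phone]",
    observation: "Phone field has no label",
    evidence: ["dom.html"],
    raisedBy: "form",
  },
  {
    target: "form#trial-signup input[name=phone]",
    observation: "Input lacks an accessible name",
    evidence: ["dom.html", "axe.json"],
    raisedBy: "accessibility",
  },
])
console.log(merged.length, merged[0].raisedBy) // 1 [ 'form', 'accessibility' ]
```

An agent can judge overlap far more loosely than an exact key match, which is the reason the workflow hands the merge to a model rather than to a function like this one.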
// Sketch of the orchestration script Claude Code generates for this workflow.
// agent(prompt, options) runs a subagent and resolves with its final answer.
import { writeFile } from "node:fs/promises"

const sections = ["hero", "social-proof", "pricing", "form", "footer", "accessibility", "analytics"]

const reports = await Promise.all(
  sections.map((section) =>
    agent(
      "Use the " + section + "-section-reviewer agent rules. Review the evidence in audit/evidence/ " +
        "with audit/page-context.md as the intent. Write your findings to audit/sections/" + section + ".md " +
        "and return them.",
      { model: "sonnet" }
    )
  )
)

const backlog = await agent(
  "Merge these section reports into one hypothesis backlog using audit/hypothesis-backlog-template.md. " +
    "Deduplicate overlapping findings, keep the strongest evidence for each, and order by expected impact " +
    "on the page's stated job weighted by effort. Every item stays phrased as a hypothesis with a proposed " +
    "variant.\n\n" + reports.join("\n\n---\n\n"),
  { model: "opus" }
)

await writeFile("audit/hypothesis-backlog.md", backlog)
Section 8
Step 4: the hypothesis backlog format
Every finding leaves the audit in the same shape: the observation and its evidence, why it matters for the page's job, a hypothesis in we-believe-that form, a proposed variant concrete enough to brief, and an effort estimate. The format keeps the audit honest — an item that cannot name its evidence or its proposed variant is not ready for the backlog — and it keeps the next step cheap, because the variant is already half a test brief.
## H-03: Form, reduce required fields
- Section: Form
- Observation: The trial signup form requires 11 fields including phone and company size. Evidence: page-1440.png, dom.html (form#trial-signup), ga4-page.csv shows 68% of form starters do not submit.
- Why it matters: The page's job is trial starts; every unjustified field adds friction at the moment of highest intent.
- Hypothesis: We believe that reducing required fields to email, name, and password will increase completed signups, because the dropped fields are not used in onboarding and the abandonment data shows starters quitting mid-form.
- Proposed variant: 3 required fields; move phone and company size to an optional post-signup step.
- Effort: S (form config + one analytics event)
- Status: hypothesis, to be validated by an A/B test, not by this audit.
Figure: backlog entry chain. Observation with evidence; why it matters; we-believe-that hypothesis; proposed variant; effort estimate; A/B test validates.
Every backlog entry moves through the same shape, so a finding that cannot complete the chain never reaches the backlog.
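The "not ready for the backlog" rule lends itself to a mechanical check. The sketch below is one possible gate, assuming findings have been parsed into plain objects; the field names, the function name, and the loose pattern used to recognize the we-believe-that form are hypothetical and not taken from the workflow's template.

```javascript
// Assumed field names for a parsed finding; the real template may differ.
const REQUIRED_FIELDS = ["section", "observation", "evidence", "whyItMatters", "hypothesis", "variant", "effort"]
const EFFORT_SIZES = ["S", "M", "L"]

// Returns a list of reasons a finding is not ready for the backlog (empty when it is ready).
function backlogProblems(finding) {
  const problems = []
  for (const field of REQUIRED_FIELDS) {
    const value = finding[field]
    const empty = Array.isArray(value) ? value.length === 0 : !value || String(value).trim() === ""
    if (empty) problems.push("missing " + field)
  }
  // Loose shape check: "We believe (that) <change> will <effect> because <evidence>".
  if (finding.hypothesis && !/^We believe (that )?.+ will .+ because .+/i.test(finding.hypothesis)) {
    problems.push("hypothesis is not in we-believe-that form")
  }
  if (finding.effort && !EFFORT_SIZES.includes(finding.effort)) {
    problems.push("effort must be S, M, or L")
  }
  return problems
}

const ready = {
  section: "Form",
  observation: "The trial signup form requires 11 fields including phone and company size.",
  evidence: ["page-1440.png", "dom.html", "ga4-page.csv"],
  whyItMatters: "The page's job is trial starts; every unjustified field adds friction.",
  hypothesis:
    "We believe that reducing required fields to email, name, and password will increase " +
    "completed signups, because the dropped fields are not used in onboarding.",
  variant: "3 required fields; move phone and company size to an optional post-signup step.",
  effort: "S",
}
const notReady = {
  section: "Hero",
  observation: "The hero doesn't pop",
  evidence: [],
  hypothesis: "This change will increase conversions by 20%",
}

console.log(backlogProblems(ready)) // []
console.log(backlogProblems(notReady))
```

A gate like this only checks that the chain is complete; whether the evidence actually supports the hypothesis is still for the merge pass and the page owner to judge.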
Section 9
Case study: the SaaS trial form with 11 fields
A SaaS team ran the audit on its trial signup page before a planned redesign. The full run took just over an hour including capture. The backlog held 17 hypotheses; the top of it was not the hero everyone had been arguing about but the form: 11 required fields, three of them unused anywhere in onboarding, with GA4 showing roughly two-thirds of form starters abandoning.
The form-field hypothesis was tested first because it was small. The redesign conversation was postponed; two of the hero hypotheses went into the same testing quarter as variants instead of as a rebuild. The team's own observation afterward was that the audit's main effect was sequencing: it moved the cheap, evidenced change ahead of the expensive, debated one.
Section 10
Case study: three paid landing pages before a spend increase
An agency audited a client's three paid-traffic landing pages in one afternoon before the client doubled its media budget. The fan-out structure made the cross-page patterns obvious in the merge: none of the three pages carried the certification badges and customer logos that the client's sales team led with in every call, and only one page had a working conversion event — the other two had been silently broken since a CMS migration, which meant a quarter of reported campaign performance was unmeasurable.
The analytics gap went to the top of the backlog ahead of every creative hypothesis, with the audit explicit that fixing measurement is not a conversion improvement but the precondition for knowing whether anything else is. The spend increase was delayed two weeks; the agency kept the client.
Section 11
Case study: an ecommerce product detail page
A retailer expected the audit of a key product detail page to argue for better photography. Instead, the trust and friction findings outranked the visual ones: shipping cost and delivery time were not visible until checkout, the returns policy was a footer link to a PDF, and the review module showed a star average with no recent reviews surfaced.
The backlog proposed three small variants — a delivery estimate near the price, a one-line returns summary above the add-to-cart button, and recent reviews pulled above the fold on mobile — each marked small effort, ahead of a photography refresh marked large. The team tested the delivery estimate first. The audit did not promise it would win; it argued it was the cheapest credible thing to learn from, which is the correct shape of the claim.
Section 12
Good vs bad audit output
A weak audit reads like a teardown thread: confident, vivid, and impossible to act on without redesigning the page. A strong audit is duller and more useful — every line names its evidence, proposes a testable variant, and admits it is a hypothesis.
Weak: The hero doesn't pop and the value prop is weak
Strong: H-01: the headline names the product category but not the outcome; evidence page-1440.png; variant: outcome-led headline matching the ad copy driving 70% of traffic

Weak: This change will increase conversions by 20%
Strong: We believe reducing required fields from 11 to 3 will increase completed signups, because abandonment data shows starters quitting mid-form; to be validated by test

Weak: Add more social proof
Strong: H-07: the only proof is a logo wall of companies the audience won't recognize; variant: two customer quotes naming the measurable result, sourced from existing case studies

Weak: Track everything
Strong: H-12: no event fires on form submission, so no test on this page is currently measurable; fix before any other hypothesis is tested
The test is whether someone could brief a test from the line alone, and whether the claim survives contact with a skeptic.
Section 13
Limits: what an audit cannot prove
The audit can find friction, missing trust signals, broken measurement, accessibility failures, and hierarchy that fights the page's own job. It cannot know what will convert — only a test against real traffic can — and it cannot fix an offer that the audience does not want, a price the market will not pay, or traffic that was never going to buy. When the evidence points at the offer rather than the page, the honest output is a question for the owner, not a page hypothesis.
Prioritization is also a starting point, not a verdict. The merge pass orders by expected impact and effort as argued from the evidence, but the human who owns the page knows the roadmap, the politics, and the appetite for risk; they reorder the backlog, and that is the system working as intended.
- It cannot guarantee conversion improvements; every item is a hypothesis until tested.
- It cannot evaluate the offer, the price, or the traffic quality — only flag when they look like the real problem.
- Automated accessibility checks are a floor, not a ceiling; keyboard and assistive-technology passes remain human work.
- Findings are only as current as the capture; re-run the same commands after the page changes.
Section 14
Reusable audit workflow
Save the script to .claude/workflows/ and the section reviewers to .claude/agents/, keep the capture script and templates in the repo, and the audit becomes the standard first move on any page about to receive money or a redesign.
1. Write page-context.md: who the traffic is, what the page must get them to do, and the offer's real constraints.
2. Capture once: screenshots at 390 and 1440, the rendered DOM, an axe run, a Lighthouse run, and the GA4 export (or note its absence).
3. Run as a dynamic workflow: fan out section agents (hero, social proof, pricing, form, footer) plus accessibility and analytics reviewers.
4. Require every finding to name its evidence and propose a concrete variant.
5. Merge and deduplicate into one hypothesis backlog in we-believe-that form, with effort estimates.
6. Order by expected impact weighted by effort, and put measurement gaps first when they exist.
7. The page owner reprioritizes and picks the first tests; offer-level questions go to them, not into the backlog.
8. After changes ship, recapture with the same commands and re-run the audit against the same context file.
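The ordering rule (impact weighted by effort, measurement gaps first) can be sketched as a sort. This is an illustration only: the 1 to 5 impact scale, the effort weights, and the `blocksMeasurement` flag are assumptions, and the result is a starting order for the page owner rather than a ranking to follow.

```javascript
// Assumed weights: a large item has to promise four times the impact of a small one to tie.
const EFFORT_WEIGHT = { S: 1, M: 2, L: 4 }

// Returns a new array: measurement gaps first, then impact divided by effort weight, descending.
function prioritize(backlog) {
  return [...backlog].sort((a, b) => {
    const aBlocks = Boolean(a.blocksMeasurement)
    const bBlocks = Boolean(b.blocksMeasurement)
    if (aBlocks !== bBlocks) return aBlocks ? -1 : 1
    const scoreA = a.impact / EFFORT_WEIGHT[a.effort]
    const scoreB = b.impact / EFFORT_WEIGHT[b.effort]
    return scoreB - scoreA
  })
}

// Hypothetical scores for three items; the IDs echo the examples in this article.
const draftBacklog = [
  { id: "H-01", impact: 4, effort: "M", blocksMeasurement: false },
  { id: "H-03", impact: 4, effort: "S", blocksMeasurement: false },
  { id: "H-12", impact: 2, effort: "S", blocksMeasurement: true },
]

console.log(prioritize(draftBacklog).map((item) => item.id)) // [ 'H-12', 'H-03', 'H-01' ]
```

The broken-measurement item leads despite its low impact score, because no other hypothesis can be tested until it is fixed.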