Section 1
Why most journey maps are fiction with good typography
Journey maps fail in a predictable way: a workshop produces a beautiful poster, the emotions row is filled in from memory, and six months later nobody can say which parts were observed and which were invented to fill an empty cell. The map gets quoted in decks long after anyone could check it, and decisions inherit its guesses.
The fix is not more workshops; it is provenance. A journey map is trustworthy exactly to the degree that every claim on it can be traced to something a user actually did or said — a funnel number, a ticket, a session note, an interview quote. Maps built that way are less symmetrical and less complete than the workshop version, and that is the point: a stage with no evidence should look empty, not confident.
The reason teams skip provenance is volume. The evidence for one journey is scattered across an analytics export, a few thousand support tickets, a folder of session-recording notes, and a research repository, and reading all of it is weeks of work. That reading-and-extracting work is what this workflow hands to agents, while the interpretation — what the journey means and what to do about it — stays with people.
Section 2
When to reach for this workflow
Use it when a journey is contested or expensive: when teams disagree about where users struggle, when a leadership group is about to fund work based on an assumed journey, or when an existing map is old enough that nobody trusts it but nobody has time to redo it. It is an Advanced workflow because the hard part is not the orchestration — it is being honest about what each data source can and cannot support.
It assumes you can export the data: an analytics funnel or event export, a support ticket export filtered to the relevant topics, notes or transcripts from session recordings, and interview themes if research exists. Anonymize before anything reaches the workflow; the agents should see behavior and language, not identities.
- An onboarding or activation journey where the funnel numbers and the team's story disagree.
- A journey spanning teams — product, support, sales — where each team only sees its own slice.
- A renewal, returns, or cancellation journey where the cost of guessing wrong is concrete revenue.
- Refreshing a stale journey map before a planning cycle, with evidence instead of recollection.
Section 3
Each source answers a different question
The four common sources do not substitute for each other, and treating them as interchangeable is how maps go wrong. Analytics says what happened and how often, but not why. Tickets say what hurt enough to make someone write, which over-represents anger and under-represents silent abandonment. Session notes show behavior in context but for a tiny sample. Interviews give language and motivation, filtered through memory and politeness.
The workflow keeps the sources separate during extraction precisely so the synthesis step can say which kind of evidence supports each claim. A pain point backed by a funnel drop, forty tickets, and three interview quotes is a different object from a pain point backed by one memorable quote, and the map should make that difference visible.
- Analytics: volume, sequence, drop-off, and timing; what happened, how often, and where
- Support tickets: pain severe enough to report, in the user's own words, with a skew toward anger
- Session notes: observed behavior, hesitation, and workarounds, for a small sample
- Interviews: motivation, expectation, and emotion, filtered through memory
- Sales and CS notes: account-level context and stakes, with a skew toward the deals at risk
Match the claim to the source that can actually support it.
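One way to make that matching mechanical is a small lookup from source type to the kinds of signal it can support. The sketch below is an illustration only, not part of the workflow: `SUPPORTS` and `can_support` are hypothetical names, the signal types are the ones the workflow prompt later in this guide lists, and the exact assignments are an assumption read off the descriptions above.

```python
# Hypothetical lookup: which signal types each source type can credibly support.
# The assignments are an assumption based on the source descriptions in this section.
SUPPORTS = {
    "analytics": {"behavior"},
    "tickets": {"pain", "emotion"},          # verbatim language, skewed toward anger
    "session-notes": {"behavior", "workaround"},
    "interviews": {"emotion", "expectation", "pain"},
}


def can_support(source_type, signal_type):
    """Return True if this kind of source can back this kind of claim."""
    return signal_type in SUPPORTS.get(source_type, set())


def unsupported(items):
    """Return the evidence items whose source cannot support their signal type."""
    return [i for i in items if not can_support(i["source_type"], i["signal_type"])]


# Illustrative items only; they are not drawn from any real export.
items = [
    {"source_type": "analytics", "signal_type": "behavior"},
    {"source_type": "analytics", "signal_type": "emotion"},
]
print(unsupported(items))
```

Here the second item is returned, because an emotion claim resting only on analytics is exactly the inference the workflow forbids.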
Section 4
The orchestration pattern: evidence agents, a synthesizer, and a challenger
The mapping run is a Claude Code dynamic workflow. Putting the word workflow in the prompt, or running with /effort ultracode, makes Claude write a JavaScript orchestration script that runs in the background. The script holds the source file lists and all extracted evidence in its own variables, so thousands of tickets and a long event export never enter Claude's conversation context — only the structured extractions and the final map do. Up to sixteen agents run concurrently and a run can dispatch up to a thousand, which matters here because ticket extraction alone often fans out into dozens of batches.
Three kinds of agents do the work. Evidence agents are per-source: one per ticket batch, one per analytics export, one per folder of session notes, one per interview set, each extracting moments and signals into a shared schema with a citation back to the source row, ticket ID, or note. A synthesis agent then assembles stages, actions, thoughts, emotions, and pain points, and is required to attach evidence IDs to every cell. Finally a challenge agent audits the draft: it lists every claim whose evidence is thin, single-source, or contradicted elsewhere, and marks those stages as unverified instead of letting the map silently fill itself in.
The run is resumable, which matters when an export turns out to be malformed halfway through. Once the prompt and agents are stable, save the run from /workflows with the s key into .claude/workflows/ — project-level so the team shares it, or ~/.claude/workflows/ for a personal copy — and the next journey is a /journey-map command with different inputs. Agent definitions live in .claude/agents/*.md.
1. Export and anonymize sources
2. Per-source evidence agents
3. Merge into evidence ledger
4. Synthesize stages and pain points
5. Challenge thin or conflicting claims
6. Team review and decisions
Per-source extraction feeds synthesis; the challenge step runs before any human reads the draft.
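The fan-out itself is a familiar pattern: split a large source into batches, then run one extractor per batch under a concurrency cap. The sketch below shows that pattern in Python with `asyncio`. It is not the JavaScript script Claude generates; `extract_batch` is a hypothetical stand-in for an evidence agent, and the batch size of 100 is an assumption.

```python
import asyncio

MAX_CONCURRENT = 16  # concurrency ceiling quoted in the text


def make_batches(rows, size):
    """Split a list of rows into consecutive batches of at most `size`."""
    return [rows[i:i + size] for i in range(0, len(rows), size)]


async def extract_batch(batch_id, batch):
    """Stand-in for one evidence agent; a real agent would read and cite rows."""
    await asyncio.sleep(0)  # yield control, as a real agent call would
    return {"batch": batch_id, "items": len(batch)}


async def fan_out(batches, limit=MAX_CONCURRENT):
    """Run one extractor per batch, never more than `limit` at a time."""
    gate = asyncio.Semaphore(limit)

    async def guarded(batch_id, batch):
        async with gate:
            return await extract_batch(batch_id, batch)

    return await asyncio.gather(*(guarded(i, b) for i, b in enumerate(batches)))


tickets = list(range(2340))           # placeholder rows, not real tickets
batches = make_batches(tickets, 100)  # batch size is an assumption
results = asyncio.run(fan_out(batches))
print(len(batches), sum(r["items"] for r in results))
```

Because results come back in batch order, the merge step can rely on a stable sequence regardless of which extractor finished first.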
Section 5
Step 1: stage the data and define the journey's edges
Before any agent runs, decide where the journey starts and ends, and write it down in one paragraph. A journey from signup to activation is a different map from a journey from first marketing touch to renewal, and an undefined edge is the first place a map starts inventing.
Then stage the exports in one folder with a manifest that says what each file is, what period it covers, and what it cannot tell you. The manifest's known-gaps line is the challenge agent's best friend: if mobile events are not instrumented, the map should say so rather than reading their absence as calm.
{
  "journey": "Self-serve signup to activation (first successful report shared)",
  "period": "2026-01-01 to 2026-04-30",
  "sources": [
    { "id": "amp", "file": "data/amplitude-funnel-export.csv", "type": "analytics", "notes": "Signup -> verify -> create workspace -> connect data -> first report -> share. Mobile web not instrumented." },
    { "id": "zd", "file": "data/zendesk-onboarding-tickets.csv", "type": "tickets", "notes": "2,340 tickets tagged onboarding or getting-started, anonymized." },
    { "id": "sess", "file": "data/session-notes/", "type": "session-notes", "notes": "Notes from 38 recorded onboarding sessions, written by two researchers." },
    { "id": "int", "file": "data/interview-themes.md", "type": "interviews", "notes": "Themes from 12 activation interviews, Q1 research round." }
  ],
  "known_gaps": ["No data on users who bounced before creating an account", "Mobile web behavior not instrumented"]
}
Section 6
Step 2: the workflow prompt
The prompt fixes the schema the evidence agents write into and the rule the whole run lives by: no claim without a citation, no stage without evidence or an explicit unverified flag.
Run this as a workflow. Build an evidence-based journey map for the journey defined in journey/onboarding/manifest.json.
Steps:
1. For each source in the manifest, dispatch evidence agents (batching tickets and session notes as needed). Each agent extracts moments and signals into journey/onboarding/evidence/<source>-<batch>.json using the schema in templates/evidence-item.json: stage_guess, observation, signal_type (behavior, pain, emotion, workaround, expectation), strength, and citation (file plus row, ticket ID, or note ID).
2. Merge all evidence files into an evidence ledger with stable IDs.
3. Dispatch a synthesis agent that assembles journey/onboarding/journey-map.md: stages, user actions, thoughts, emotions, pain points, and opportunities, where every cell lists the evidence IDs that support it. Quantify where the analytics allows it.
4. Dispatch a challenge agent that audits the draft: flag every claim supported by a single source, fewer than three evidence items, or contradicted by another source; mark affected cells as UNVERIFIED and list them in journey-map-gaps.md along with the manifest's known gaps.
Rules: do not infer emotions from analytics alone; do not fill empty cells for symmetry; quotes must be verbatim from the source. Do not contact any external system; work only from the staged files.
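Step 2 of the prompt asks for a ledger with stable IDs, and the no-citation rule applies at the same point. A minimal sketch of that merge follows, assuming each ID is derived by hashing the source, citation, and observation so that re-runs agree. `stable_id` and `merge_ledger` are hypothetical names, the numeric `strength` values are an assumption, and the example rows are invented for illustration.

```python
import hashlib


def stable_id(source_id, item):
    """Derive an ID from source, citation, and observation so re-runs agree."""
    key = "|".join([source_id, item["citation"], item["observation"]])
    return f"{source_id}-{hashlib.sha1(key.encode('utf-8')).hexdigest()[:8]}"


def merge_ledger(batches):
    """Merge per-source batches into one ledger keyed by stable ID.

    `batches` maps a source ID (such as "zd") to a list of evidence items.
    Items without a citation are rejected rather than silently kept.
    """
    ledger, rejected = {}, []
    for source_id, items in batches.items():
        for item in items:
            if not item.get("citation"):
                rejected.append(item)
                continue
            ledger[stable_id(source_id, item)] = {**item, "source": source_id}
    return ledger, rejected


# Invented rows for illustration; citations and wording are not from real data.
batches = {
    "zd": [
        {"stage_guess": "connect-data", "observation": "Cannot find API key",
         "signal_type": "pain", "strength": 2,
         "citation": "zendesk-onboarding-tickets.csv#row-17"},
        {"stage_guess": "connect-data", "observation": "Gave up on setup",
         "signal_type": "pain", "strength": 1, "citation": ""},
    ],
    "sess": [
        {"stage_guess": "connect-data", "observation": "Tabbed out to look for credentials",
         "signal_type": "behavior", "strength": 3, "citation": "session-notes/note-04"},
    ],
}
ledger, rejected = merge_ledger(batches)
print(len(ledger), "kept,", len(rejected), "rejected")
```

Hashing the content rather than numbering rows means an evidence ID cited in the map still resolves after the batches are re-extracted in a different order.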
Section 7
Step 3: the evidence and challenge subagents
Two definitions carry the discipline. The evidence extractor refuses to interpret beyond its source; the challenger exists to make weak evidence visible before a stakeholder ever sees the map.
---
name: evidence-extractor
description: Extracts journey evidence items from one source batch (analytics rows, ticket batch, session notes, or interview themes) into the shared evidence schema with citations. Never interprets beyond its own source.
tools: Read, Grep, Glob, Write
---
You read one source batch and emit evidence items in the schema you are given. Every item must cite its origin (file plus row, ticket ID, or note ID) and quote verbatim where language matters. Record only what this source can support: analytics gives behavior and volume, not feelings; tickets give reported pain, not prevalence. If a batch contains nothing relevant to the journey, return an empty list.

---
name: challenge-reviewer
description: Audits a draft journey map for claims with thin, single-source, or contradicted evidence and marks them UNVERIFIED. Does not add new claims.
tools: Read, Grep, Glob, Write
---
You receive the draft map and the evidence ledger. For every cell, check the cited evidence: flag cells with fewer than three items, only one source type, or items that conflict with another source. Mark them UNVERIFIED in place and list them in journey-map-gaps.md with what evidence would resolve them. Never soften a flag because the surrounding map looks complete.
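The challenger's three checks are simple enough to state as code, which also makes them easy to spot-check by hand after a run. The sketch below is a hedged illustration of those rules, not the agent's implementation: `audit_cell` is a hypothetical helper, the `source_type` field on ledger items is assumed, and `conflicts` stands in for however contradictions end up being recorded.

```python
MIN_ITEMS = 3  # threshold named in the challenge rule


def audit_cell(evidence_ids, ledger, conflicts=None):
    """Return the reasons a cell should be marked UNVERIFIED (empty if none).

    `ledger` maps evidence ID -> item with a "source_type" field (assumed).
    `conflicts` is a hypothetical set of evidence IDs another source contradicts.
    """
    conflicts = conflicts or set()
    items = [ledger[eid] for eid in evidence_ids if eid in ledger]
    reasons = []
    if len(items) < MIN_ITEMS:
        reasons.append("fewer than three evidence items")
    if len({item["source_type"] for item in items}) == 1:
        reasons.append("only one source type")
    if conflicts & set(evidence_ids):
        reasons.append("contradicted by another source")
    return reasons


# Toy ledger for illustration only.
ledger = {
    "a": {"source_type": "tickets"},
    "b": {"source_type": "analytics"},
    "c": {"source_type": "interviews"},
    "d": {"source_type": "tickets"},
}
print(audit_cell(["a", "b", "c"], ledger))
print(audit_cell(["a", "d"], ledger))
```

The first cell passes with no reasons; the second is flagged twice, for having two items and for drawing on tickets alone.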
Section 8
Step 4: the review session is part of the workflow
The output of the run is a draft and a gaps list, not a finished map. The half-day budget includes a review session where the team walks the stages, accepts or disputes the synthesis, decides what to do about each unverified cell — gather more evidence, run research, or accept the uncertainty explicitly — and assigns owners to the top pain points.
Keep the evidence ledger next to the map permanently. When someone challenges a claim in a planning meeting four months later, the answer is a lookup, not a memory contest, and when the data changes the same saved workflow re-runs against fresh exports.
- Walk each stage: does the synthesis match what the people closest to users see?
- For each UNVERIFIED cell: gather evidence, schedule research, or accept the gap in writing.
- Pick the top three pain points by combined evidence strength and assign owners.
- Date the map and record which exports produced it, so staleness is visible later.
1. Walk the draft stages
2. Accept or dispute claims
3. Resolve unverified cells
4. Rank top pain points
5. Assign owners
6. Date and archive the map
The run ends with a draft and a gaps list; the review session is where the team turns evidence into accepted claims, owned pain points, and a dated map.
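Ranking pain points by combined evidence strength can be done directly from the ledger during the review session. The sketch below assumes one possible definition, which the guide itself does not fix: sum the `strength` of a pain point's evidence items (treated here as numbers) and break ties by how many distinct source types back it. `rank_pain_points` and the field names are hypothetical.

```python
def rank_pain_points(pain_points, ledger, top_n=3):
    """Rank pain points by summed evidence strength, then by source breadth."""

    def score(point):
        items = [ledger[eid] for eid in point["evidence_ids"] if eid in ledger]
        total = sum(item["strength"] for item in items)
        breadth = len({item["source_type"] for item in items})
        return (total, breadth)

    return sorted(pain_points, key=score, reverse=True)[:top_n]


# Toy data for illustration only; strengths and IDs are invented.
ledger = {
    "e1": {"strength": 3, "source_type": "tickets"},
    "e2": {"strength": 2, "source_type": "analytics"},
    "e3": {"strength": 5, "source_type": "interviews"},
    "e4": {"strength": 1, "source_type": "tickets"},
}
pain_points = [
    {"name": "credentials screen", "evidence_ids": ["e1", "e2"]},
    {"name": "verify email", "evidence_ids": ["e4"]},
    {"name": "first report", "evidence_ids": ["e3"]},
]
for point in rank_pain_points(pain_points, ledger):
    print(point["name"])
```

With these toy numbers the first two pain points tie on strength, and the one backed by two source types ranks ahead of the one backed by a single interview item. Whether breadth should outrank raw strength is a judgment for the team, not the script.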
Section 9
Case study: a SaaS onboarding journey from signup to activation
A product-led SaaS team mapped the journey from signup to activation, defined as sharing a first report. The staged data covered four months: an Amplitude funnel export, 2,340 onboarding-tagged tickets, notes from 38 session recordings, and themes from 12 interviews. The run took just under three hours of agent time; the review session took two more.
The funnel made the shape clear: of users who verified their email, 71 percent created a workspace, 38 percent connected a data source, and 19 percent shared a report within 14 days. The evidence agents put the pain where the team had not been looking: the connect-data stage accounted for 61 percent of the onboarding tickets, and the session notes showed a median of 9 minutes spent on the credentials screen, with users repeatedly tabbing out to find where their API key lived. The interviews added the emotional register: "this is where I find out whether this tool is for people like me."
The challenge agent flagged two cells the workshop version would have shipped: the emotions assigned to the verify-email stage rested on a single interview aside, and the apparent calm of the workspace-creation stage was partly an artifact of the uninstrumented mobile web flow listed in the manifest. The team funded a guided-connection project for the data stage, and the evidence ledger — particularly the 61 percent ticket concentration — is what got it prioritized over a competing homepage redesign.
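The funnel figures in this case study are shares of email-verified users, so the step-to-step conversion has to be derived. A small worked calculation, using only the percentages quoted above and assuming they are cumulative shares of the verified cohort:

```python
# Share of email-verified users reaching each stage, from the case study.
funnel = [
    ("verified email", 100.0),
    ("created workspace", 71.0),
    ("connected data source", 38.0),
    ("shared report within 14 days", 19.0),
]


def step_conversion(stages):
    """Return (from_stage, to_stage, percent of the previous stage retained)."""
    return [
        (prev_name, name, round(100.0 * share / prev_share, 1))
        for (prev_name, prev_share), (name, share) in zip(stages, stages[1:])
    ]


for prev_name, name, pct in step_conversion(funnel):
    print(f"{prev_name} -> {name}: {pct}% continue")
```

Read this way, about 53.5 percent of workspace creators go on to connect a data source, and 50 percent of those go on to share a report. The funnel alone locates where users leave; the tickets and session notes are what explain why.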
Section 10
Case study: an ecommerce returns journey from tickets and reviews
A mid-size ecommerce retailer mapped the returns journey using almost no analytics: the return flow was partly handled by a third-party logistics portal the team could not instrument. The evidence was 4,800 support tickets tagged returns or refund, 1,900 product reviews mentioning returns, and the carrier's status export. The per-source agents ran ticket extraction in 24 batches.
Synthesis showed the journey's worst moment was not requesting the return but the silence after the parcel was dropped off: 44 percent of return-related tickets were "where is my refund" messages sent while the parcel was still in transit, and the median gap between drop-off and refund confirmation was 11 days, of which only 4 were processing. Reviews told the same story in harsher language, and the challenge agent confirmed the claim was multi-source and strong.
It also flagged what the data could not show: nothing in tickets or reviews represents the customers who returned an item smoothly and said nothing, so the map's emotion row was explicitly labeled as the journey of the unhappy minority. The team shipped proactive status emails at carrier-scan and refund-issued, and where-is-my-refund tickets dropped 31 percent the following quarter — a number the next run of the same workflow picked up from the fresh ticket export.
Section 11
Case study: a B2B renewal journey where sales notes and analytics disagree
An enterprise software company mapped the renewal journey across customer success notes, CRM opportunity notes, product usage analytics, and QBR meeting summaries. The motivating problem was a contradiction: sales notes consistently described renewals as relationship-driven and at risk over support responsiveness, while the product analytics showed renewal probability tracking most closely with whether more than five users in an account were active in the final quarter.
The workflow did not resolve the contradiction; it located it. The synthesis agent built two strands through the late-journey stages — what champions say in meetings versus what teams do in the product — and the challenge agent marked the stage where they diverged as the map's central open question rather than averaging the two stories into something nobody had observed. The evidence showed the sales narrative drew almost entirely on the 14 accounts that had escalated, while the usage signal covered all 220 renewing accounts.
The review session turned that into a decision: renewal health scoring would weight active-user breadth, and CS would keep the relationship narrative for the escalated accounts where it demonstrably applied. Six months later the team credited the map less for the score change than for ending a year-long argument, because both sides could finally see which evidence their story stood on.
Section 12
Good vs bad journey map claims
The unit of quality is the individual claim. A map full of plausible, uncited statements reads well and decays fast; a map of cited claims with visible gaps reads rougher and stays useful. Audit a few cells by hand after every run before anyone presents the map.
- Bad: Users feel overwhelmed during setup
- Good: Connect-data stage: 61% of onboarding tickets, median 9 minutes on the credentials screen across 38 session notes, drop-off from 71% to 38% in the funnel (amp, zd, sess)
- Bad: Customers are frustrated waiting for refunds
- Good: 44% of returns tickets arrive between drop-off and refund, median 11-day gap, sentiment echoed in 312 reviews; smooth returns are not represented in this data (zd, reviews)
- Bad: Renewal depends on the champion relationship
- Good: Escalated accounts (14) cite support responsiveness in CS notes; across all 220 accounts, active-user breadth in the final quarter tracks renewal more closely; divergence flagged UNVERIFIED pending churn interviews
A trustworthy claim names its evidence and its limits.
Section 13
What this workflow cannot prove
An evidence-based map is still a map of the evidence you happened to have. It cannot speak for the users who never showed up in any export, it cannot turn correlation in a funnel into cause, and it cannot read emotion off behavioral data without a human deciding the inference is fair. The challenge agent makes those limits visible; it does not remove them.
The map also does not decide anything. Which pain point to fund, whether an unverified stage is worth a research study, and how much to trust a source the team knows is biased are judgment calls that belong to the people in the review session. The workflow's contribution is that those calls are now made looking at evidence instead of at the loudest recollection in the room.
- It cannot represent users absent from the data, including those who left before generating any.
- It cannot establish causation from funnels and tickets; it can only locate where to investigate.
- It cannot assign emotions from behavior alone; emotion claims need interview or verbatim support.
- It cannot prioritize the roadmap; the review session is where humans decide.
Section 14
The reusable journey mapping workflow
Save the run as /journey-map, keep the evidence schema and agents in version control, and re-run it on fresh exports each quarter so the map stays a living document instead of a poster.
1. Define the journey's start and end in one paragraph; agree on it before any extraction.
2. Stage anonymized exports with a manifest that records periods, source types, and known gaps.
3. Fan out per-source evidence agents that extract cited moments and signals into the shared schema.
4. Merge everything into an evidence ledger with stable IDs.
5. Synthesize stages, actions, emotions, pain points, and opportunities, every cell citing its evidence.
6. Run the challenge agent: mark thin, single-source, or contradicted claims UNVERIFIED.
7. Hold the review session: accept or dispute claims, decide what to do about each gap, assign owners.
8. Date the map, keep the ledger beside it, and re-run the saved workflow on fresh exports next quarter.
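Before each quarterly re-run, a short pre-flight check can confirm that the manifest from step 2 still records the period, the source types, and the known gaps. The sketch below assumes the manifest fields shown in the Step 1 example. `check_manifest` is a hypothetical helper, not something the workflow provides.

```python
import json

REQUIRED_SOURCE_FIELDS = {"id", "file", "type", "notes"}


def check_manifest(manifest):
    """Return a list of problems; an empty list means the manifest looks usable."""
    problems = []
    for field in ("journey", "period", "sources", "known_gaps"):
        if not manifest.get(field):
            problems.append(f"missing or empty: {field}")
    seen = set()
    for source in manifest.get("sources") or []:
        missing = REQUIRED_SOURCE_FIELDS - source.keys()
        if missing:
            problems.append(f"source {source.get('id', '?')} lacks {sorted(missing)}")
        if source.get("id") in seen:
            problems.append(f"duplicate source id: {source.get('id')}")
        seen.add(source.get("id"))
    return problems


# Trimmed copy of the Step 1 manifest, with known_gaps deliberately left empty.
manifest = json.loads("""
{
  "journey": "Self-serve signup to activation (first successful report shared)",
  "period": "2026-01-01 to 2026-04-30",
  "sources": [
    {"id": "amp", "file": "data/amplitude-funnel-export.csv", "type": "analytics",
     "notes": "Mobile web not instrumented."},
    {"id": "zd", "file": "data/zendesk-onboarding-tickets.csv", "type": "tickets"}
  ],
  "known_gaps": []
}
""")
for problem in check_manifest(manifest):
    print(problem)
```

Treating an empty `known_gaps` list as a problem is a deliberate choice in this sketch: a manifest that claims no gaps is more likely to be unfinished than complete.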