Slide 1 — Canvas-to-Production Pipelines
Welcome to Module 5. This is the module about the workflow everyone asks for: how an approved design on a canvas becomes production code, with agents doing the production work and humans holding the gates. The short version of the argument is that the pipeline replaces the handoff. Instead of a designer finishing a picture and a developer rebuilding it from scratch — losing detail at every step — the artifact feeds an agent, the agent builds on a branch, and a sequence of gates checks the result against the design and the intent. Generation is the easy part. The pipeline exists for everything after the demo looks right.
Slide 2 — Six stages, two owners
Here are the six stages. An approved canvas, owned by a human, carrying real structure — components, tokens, and behaviour written down. Spec extraction, where canvas values get resolved against the actual token and component library. Implementation, where the agent builds on a branch against the harness. Then the checks: types, tokens, accessibility, and breakpoints, compared against the canvas. Then review, where a human judges what the checks cannot see. And finally merge, which is always a human decision. Notice the ownership split: the agent owns the mechanical middle, and you own the artifact going in and the judgment coming out. There is no stage called handoff — that is the point.
Slide 3 — The pipeline, end to end
Here is the whole pipeline in one picture. The approved canvas on the left is yours: it carries the structure and the intent. Then the agent takes over the middle: it resolves tokens and components against your real library, generates code on a branch, and runs the executable checks and visual QA against the canvas. Then the gates hand control back to you: the PR review judges what the scans cannot see, and the merge is always a human decision. The most important line on this diagram is the dashed one. Work that fails a gate goes back to the agent and re-enters at the gate it failed. It never moves forward with a note attached.
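If it helps to see the dashed line as control flow, here is a minimal sketch of the rework loop. The gate names, the `checks` and `rework` callables, and the `max_attempts` limit are all hypothetical, chosen for illustration rather than taken from the course repository.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical gate order for illustration; adapt to your own pipeline.
GATE_ORDER: List[str] = ["types", "tokens", "responsive", "accessibility", "review"]


def run_gates(
    checks: Dict[str, Callable[[dict], bool]],
    rework: Callable[[dict, str], dict],
    build: dict,
    max_attempts: int = 5,
) -> Tuple[bool, List[str]]:
    """Run gates in order; on failure, send the work back and resume at the failed gate."""
    history: List[str] = []
    index = 0
    attempts = 0
    while index < len(GATE_ORDER):
        gate = GATE_ORDER[index]
        if checks[gate](build):
            history.append(f"pass:{gate}")
            index += 1  # only a passing gate moves the work forward
            continue
        history.append(f"fail:{gate}")
        attempts += 1
        if attempts >= max_attempts:
            return False, history  # stop and report instead of looping forever
        build = rework(build, gate)  # back to the agent; index stays on the failed gate
    return True, history
```

The index only advances on a pass, so a failed build cannot slip forward. A stricter variant would reset the index to zero after each rework so that earlier gates re-run as well.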
Slide 4 — Gates: owner, evidence, and what each one exists to catch
Let's define a gate properly. A gate has three properties: a tool or procedure that runs it, an owner who acts on the result, and a named defect class it exists to catch. The type and token gates are agent-owned — the agent runs them and fixes what they find. The responsive gate splits the work: the agent captures screenshots at three widths, and you judge whether the task order survived. Accessibility is shared, because the automated scan only sees part of the standard. And the final review is yours. Here is the test of whether you really have gates: when one fails, does anything change? If not, you have theatre with logging.
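One way to keep the three properties honest is to write each gate down as a record and refuse blanks. This is a sketch under assumptions: the `Gate` type, the owner labels, and the tool descriptions are placeholders, not the commands used in the course.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Gate:
    name: str
    tool: str          # the tool or procedure that runs the gate
    owner: str         # who acts on the result: "agent", "human", or "shared"
    defect_class: str  # the named defect class the gate exists to catch


def missing_properties(gate: Gate) -> List[str]:
    """Return the names of any of the three properties left blank."""
    return [
        field
        for field in ("tool", "owner", "defect_class")
        if not getattr(gate, field).strip()
    ]


# Illustrative entries; tool names are placeholders, not the course's commands.
GATES = [
    Gate("types", "type checker", "agent", "code that fails at runtime"),
    Gate("tokens", "token audit", "agent", "hardcoded values that bypass tokens"),
    Gate("responsive", "screenshots at three widths", "shared", "task order lost on small screens"),
    Gate("accessibility", "automated scan plus manual pass", "shared", "markup that looks right and reads wrong"),
    Gate("review", "PR review", "human", "intent the checks cannot see"),
]
```

A gate that comes back from `missing_properties` with anything in the list is a check with no owner or no purpose, which is the gap the exercise later asks you to mark.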
Slide 5 — Token and component sync are pipeline steps, not assumptions
Here is the silent failure that wrecks most canvas-to-code attempts: assuming the canvas and the codebase already agree about tokens and components. They rarely do, and when they disagree, the agent resolves it silently, usually by hardcoding whatever the canvas shows. So make the sync an explicit step. Resolve every canvas value to a named token before generation, and flag what does not map instead of inventing a substitute. Map frames to your real component library: with Figma that means Code Connect; with connected canvases the mapping is closer to free. The evidence keeps landing in the same place: the mapping work, not the generator, decides whether you keep the code or rewrite it.
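The resolve-and-flag step can be sketched in a few lines. The token table and token names below are invented for illustration; in practice the table would be loaded from wherever your tokens actually live.

```python
from typing import Dict, List, Tuple

# Hypothetical token table; in practice this would be read from your token source.
TOKENS: Dict[str, str] = {
    "#1a1a1a": "color.text.primary",
    "#ffffff": "color.surface.default",
    "16px": "space.4",
}


def resolve_values(canvas_values: List[str]) -> Tuple[Dict[str, str], List[str]]:
    """Map each canvas value to a named token; collect values with no token."""
    resolved: Dict[str, str] = {}
    unmapped: List[str] = []
    for raw in canvas_values:
        key = raw.strip().lower()
        if key in TOKENS:
            resolved[raw] = TOKENS[key]
        else:
            unmapped.append(raw)  # flag it; never pick a "close enough" token
    return resolved, unmapped
```

The design choice that matters is the second return value: an unmapped value is reported, not rounded to the nearest token, so the disagreement surfaces before generation instead of inside the generated code.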
Slide 6 — Parity checks: comparing the build against the canvas, with evidence
Parity is not a feeling, it is evidence. The agent captures screenshots of the built page at widths of 360, 768, and 1280 pixels, and pairs each one with the approved canvas frame. Then you judge, and you are judging intent, not pixels. Did the hierarchy survive? Did the task order survive on mobile? Do the states the canvas showed actually exist? Expect the documented gaps: drifted colours, placeholder data, unfinished interactions. That is what this gate is for. The agent gathers the evidence and can flag obvious differences; you decide which differences matter, and every finding gets a severity and a screenshot attached.
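To make "evidence" concrete, here is a sketch of the records the parity gate could leave behind. The file layout, the severity scale, and the type names are assumptions for illustration; only the three widths come from the narration.

```python
from dataclasses import dataclass
from typing import List

WIDTHS = (360, 768, 1280)  # the three capture widths named above
SEVERITIES = ("blocker", "major", "minor")  # assumed scale for illustration


@dataclass(frozen=True)
class ParityPair:
    width: int
    canvas_frame: str  # path or id of the approved frame
    screenshot: str    # path of the captured build


@dataclass(frozen=True)
class Finding:
    summary: str
    severity: str
    screenshot: str

    def __post_init__(self) -> None:
        if self.severity not in SEVERITIES:
            raise ValueError(f"unknown severity: {self.severity}")
        if not self.screenshot:
            raise ValueError("a finding needs a screenshot attached")


def build_pairs(section: str) -> List[ParityPair]:
    """Pair one canvas frame with one screenshot per width (file layout is assumed)."""
    return [
        ParityPair(w, f"canvas/{section}-{w}.png", f"build/{section}-{w}.png")
        for w in WIDTHS
    ]
```

The agent can fill in the pairs; the `Finding` records are where your judgment goes, and the constructor rejects one that arrives without a severity from the scale or without a screenshot.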
Slide 7 — Plausible but wrong: the defect classes you can name in advance
The reason gates are worth building is that the defects are predictable. Code that compiles but ignores your tokens. Markup that looks right and reads wrong. Output that passes the automated scan and still traps the keyboard — because the scanners only see roughly a third of the standard. Responsive variants that just stack things instead of preserving the task order. And product rules quietly invented from whatever the static frame happened to show. Generated UI fails accessibility by default — not from carelessness, but because nothing in 'make it look like this frame' asks for semantics. Each of these classes maps to a gate. That is not a coincidence; it is the design of the pipeline.
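The first class, code that compiles but ignores your tokens, is the easiest to turn into an executable check. Below is a deliberately simple sketch that scans source text for hex colour literals. It is not the audit used in the course, and the function name is hypothetical.

```python
import re
from typing import List, Tuple

# Matches 3-, 4-, 6-, or 8-digit hex colours such as #fff or #1a2b3c.
HEX_COLOUR = re.compile(r"#(?:[0-9a-fA-F]{8}|[0-9a-fA-F]{6}|[0-9a-fA-F]{3,4})\b")


def find_hardcoded_colours(source: str) -> List[Tuple[int, str]]:
    """Return (line number, literal) for every hex colour literal in the source."""
    hits: List[Tuple[int, str]] = []
    for number, line in enumerate(source.splitlines(), start=1):
        for match in HEX_COLOUR.finditer(line):
            hits.append((number, match.group(0)))
    return hits
```

A pattern this blunt will also flag fragment links that happen to look like hex, such as `#add`, and it says nothing about `rgb()` values or hardcoded spacing. Treat it as the shape of a token gate, not a complete one.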
Slide 8 — Where defects enter: a real defect log, by stage
Let's look at a real defect log. This is from an executed run in this school's own repository: one section, three iterations. The type check caught a real runtime bug. The token audit caught eleven hardcoded colours. Then the human checks took over: a fixed-width wrapper that would break on phones, a link with no accessible name, a heading-level problem — and the most interesting one, card titles that were not headings at all, which no automated check flagged. Automated gates caught two of seven defect classes. The point of logging by stage is what you do next: recurring colour violations mean tighten the harness, not review harder. Measurement tells you which stage owns the fix.
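Logging by stage needs very little machinery. Here is a sketch that tallies a log by the stage that caught each defect. The entries are limited to the items named in this narration, so the totals may not match the full log on the slide; the stage labels and function names are assumptions.

```python
from collections import Counter
from typing import List, Tuple

# (stage that caught it, defect) pairs, limited to the items named in the narration.
DEFECT_LOG: List[Tuple[str, str]] = [
    ("type check", "runtime bug"),
    ("token audit", "hardcoded colours"),
    ("human check", "fixed-width wrapper"),
    ("human check", "link with no accessible name"),
    ("human check", "heading-level problem"),
    ("human check", "card titles that are not headings"),
]

AUTOMATED_STAGES = {"type check", "token audit"}


def defects_by_stage(log: List[Tuple[str, str]]) -> Counter:
    """Count how many logged defects each stage caught."""
    return Counter(stage for stage, _ in log)


def automated_share(log: List[Tuple[str, str]]) -> Tuple[int, int]:
    """Return (defects caught by automated gates, total defects logged)."""
    caught = sum(1 for stage, _ in log if stage in AUTOMATED_STAGES)
    return caught, len(log)
```

Run over several pieces of work, the same tally shows which defect keeps arriving at which stage, and that is the signal for tightening the harness instead of reviewing harder.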
Slide 9 — Partial automation: which stages run unattended
So how much of this can run without you? Use one dividing line: automate the stages whose failures are cheap and detectable, keep humans on the stages whose failures are expensive and silent. Resolution, generation on a branch, the executable checks, and screenshot capture can all run unattended — ideally in CI, as assertions that fail the build. The artifact going in, the severity calls, the accepted findings, and the merge stay human. And every unattended stage has to leave evidence behind: outputs, screenshots, and a report of what it flagged. One more filter: simple surfaces are good candidates for the fast path. Checkout flows and permission screens are not.
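The dividing line can be written as a two-question rule. This is a sketch only: the `StageRisk` type and the true or false ratings are illustrative judgments, not measurements from the run.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StageRisk:
    name: str
    failure_is_cheap: bool       # a failure costs little to undo
    failure_is_detectable: bool  # a failure shows up in a check, not in production


def can_run_unattended(stage: StageRisk) -> bool:
    """Unattended only when failures are both cheap and detectable."""
    return stage.failure_is_cheap and stage.failure_is_detectable


# Ratings are illustrative judgments, not measured values.
STAGES = [
    StageRisk("token resolution", True, True),
    StageRisk("generation on a branch", True, True),
    StageRisk("executable checks", True, True),
    StageRisk("screenshot capture", True, True),
    StageRisk("severity calls", False, False),
    StageRisk("merge", False, False),
]
```

Both answers have to be yes. A stage whose failures are cheap but silent still needs a human, because nobody will know to undo them.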
Slide 10 — Worked example: one section through the full pipeline
Here is the same run traced as a pipeline with timings. The brief took about ten minutes. The first generation took minutes — and failed four gates: a type error, eleven token violations, a fixed-width layout, and accessibility problems. The rebuild on real primitives took about twenty minutes and passed both automated gates. Then human review caught the one that matters: card titles that were no longer headings at all. Every command had exited zero. The fix took minutes, the gates re-ran, and one finding was accepted and written down. Three iterations, one working session — and most of it was gates and review, not generation. Report your time that way, or you are booking the review hours as free.
Slide 11 — The pipeline as a checklist the agent can run
To make this a team capability rather than a personal habit, the gate sequence has to live in the repository. Here is the shape: entry conditions, generation outside production paths, the type and design-system gate the agent runs and fixes, the responsive gate the agent captures and you judge, the shared accessibility gate, the performance budget, the human review, and the after-merge notes. Two rules make it work. The checklist goes into the brief, so the agent runs every gate it can run itself. And the agent stops and reports at the decision points — it never fixes and merges in one motion. Adapt the commands to your stack; keep the order.
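As a sketch of what "lives in the repository" could look like, here is the checklist as data, plus a function that renders the part the agent can run into a brief. The step order follows the narration; the `runner` and `stop_and_report` values are assumptions to adapt, and the names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Step:
    name: str
    runner: str            # "agent", "human", or "shared"
    stop_and_report: bool  # True where the agent must pause for a human decision


# Order follows the narration; runner and stop flags are assumptions to adapt.
CHECKLIST: List[Step] = [
    Step("entry conditions", "human", True),
    Step("generation outside production paths", "agent", False),
    Step("type and design-system gate", "agent", False),
    Step("responsive gate", "shared", True),
    Step("accessibility gate", "shared", True),
    Step("performance budget", "agent", True),
    Step("human review", "human", True),
    Step("after-merge notes", "human", False),
]


def brief_section(steps: List[Step]) -> str:
    """Render the steps the agent can run, in order, marking where it must stop."""
    lines = []
    for number, step in enumerate(steps, start=1):
        if step.runner == "human":
            continue
        suffix = " (stop and report)" if step.stop_and_report else ""
        lines.append(f"{number}. {step.name}{suffix}")
    return "\n".join(lines)
```

The numbering is kept from the full checklist, so the gaps in the rendered brief show the agent where a human step sits between its own.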
Slide 12 — Exercise: draw your current path from canvas to production
Time to map your own pipeline. Pick one design that recently shipped and trace the path it actually took — every step, and who did each one. Mark the gates that existed, and be honest about what each one required and what happened when it failed. Then mark the gaps: transitions with no check, checks with no owner, findings that were waved through. Now the key question: where were the defects in that work actually found? Anything found at the end probably belonged to an earlier gate. Finish by choosing the one stage you would tighten first, and write down what evidence its gate would require. One stage. Then run the next piece of work through it.
Slide 13 — Summary, and what comes next
Let's close the module. The pipeline replaces the handoff: the agent owns the mechanical middle, you own the artifact going in and the judgment coming out. A gate is a tool, an owner, and a named defect class — and when it fails, the work goes back, not forward. Token sync and parity checks are explicit steps with evidence, never assumptions. The automated checks catch the cheap defects; the expensive ones need a checklist and a human with the intent in mind. And measure defects by stage, so you tighten the stage responsible instead of piling review onto the end. Module 6 zooms out to the whole operation — cost, permissions, audit trails, and review capacity. See you there.