Slide 1 — MCP Chaining Across Design Tools
Welcome to Module 3. So far this course has talked about agents and patterns. This module is about the plumbing that lets those patterns reach your actual tools: MCP, and specifically what happens when one agent session has several MCP servers active at once — a research board, a design canvas, the repository, the browser. A single connection is convenient. A chain is something more: it lets the agent compare sources of truth and move work between systems without you ferrying screenshots and hex codes by hand. It also fails in new ways, costs more context, and raises real questions about credentials and data boundaries. That is what we cover here.
Slide 2 — Why chains, not just connections
Why chain at all? Because the workflows that matter need two sources of truth at once. Token sync compares the design tool's variables against the tokens in your repo. Content sync pulls real research off a board and into your mockups. An implementation review reads intent from the canvas and reality from the browser, and reports the difference. None of that works with a single connection. The mechanism is simple: several MCP servers active in one session, and the agent calls tools across all of them, with every result landing in the same context window. The chain is just the order of those calls. The cost is real too — more credentials, more failure points, more context burned — and managing that cost is most of this module.
Slide 3 — The design-relevant servers, mapped to roles
Here is the landscape, organised by role rather than logo. Design canvases — Figma's official server, Pencil, Paper — hold the intent: variables, structure, screenshots, and increasingly the ability to write back. Research tools like Miro and Notion hold the content and the findings. Browser servers — Playwright, Chrome DevTools — hold the evidence: screenshots, console errors, what the page actually does. Ticketing servers connect the chain to where your team tracks work. And the repository is in every chain without needing a server at all, which is exactly why it is the safest place to write: everything lands as a diff you can review and revert. Pick one server per role. You almost never need them all.
Slide 4 — Chain design: which tool feeds which step
Before you connect anything, design the chain on paper. List the steps of the workflow, then assign one server to each step. Mark every link as read or write — and notice that most links should be reads. Decide which system is the source of truth for each kind of fact: who owns the tokens, who owns the content, who owns layout. That single decision prevents the worst chain failure, where two systems both think they are authoritative and the agent happily overwrites one with the other. Then place your human gates, usually between the last read and the first write. And finally, write the chain down in the harness, so it runs the same way in every session — not just the one where you remembered the magic prompt.
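If it helps to see the paper design as data, here is a minimal sketch. Everything in it is an assumption for illustration: the role names, the example token-sync steps, and the helper functions are hypothetical and are not part of any MCP server or harness. It shows two checks from the paragraph above: finding where the human gate belongs (just before the first write) and spotting a fact that two systems both claim to own.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    step: str    # what this step of the workflow does
    server: str  # hypothetical role name, not a real server identifier
    mode: str    # "read" or "write"


# Hypothetical chain for a token sync; most links are reads.
CHAIN = [
    Link("read design variables", "design-canvas", "read"),
    Link("read repo tokens", "repository", "read"),
    Link("write token diff to a branch", "repository", "write"),
]


def gate_index(chain):
    """Return the index of the first write link; the human gate sits just before it.

    Returns None when the chain contains no writes.
    """
    for i, link in enumerate(chain):
        if link.mode == "write":
            return i
    return None


def conflicting_owners(claims):
    """Given (fact, system) ownership claims, return facts claimed by more than one system."""
    owners = {}
    for fact, system in claims:
        owners.setdefault(fact, set()).add(system)
    return sorted(fact for fact, systems in owners.items() if len(systems) > 1)
```

Calling `gate_index(CHAIN)` on this example points at the third link, and `conflicting_owners` flags any fact with two owners before the agent gets a chance to overwrite one with the other.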
Slide 5 — One session, several servers
Here is the whole module in one picture. One agent session, four links. The research board feeds content and findings — read-only. The design canvas holds the intent — the agent reads variables, structure, and screenshots, and if it writes, it writes behind a prompt to a clearly named page. The repository is where the artifact lives, and its writes are branches and diffs, which is why it is the safest place to write. The browser closes the loop with screenshots and console output — evidence, never edits. And along the bottom, the four places chains actually break: credentials rejected on remote servers, local apps that simply are not running, connections that go stale in long sessions, and context burned on screenshots before the reasoning starts. None of that is exotic. All of it is plannable.
Slide 6 — Scopes and credentials: least access that still works
Every server in your chain is a dependency with access to real tools, so scope it the way you would scope any dependency. Connect servers per project, in a checked-in file, so each project only sees what it needs. Authenticate with an account that can reach this project's files — not the whole workspace. And split the tools: pre-approve the read tools the chain actually uses, and leave every write tool on ask. That little approve click before a canvas edit or a file write feels like friction, but it is the cheapest gate in the whole chain — and it is also your defence against content from a board or a file smuggling instructions into the agent. The posture in this snippet is itself reviewable, which is the point: it lives in the repo, not in someone's head.
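As a rough illustration of the read/write split, here is a sketch of the decision in Python. The tool names are invented for the example, and real harnesses express this posture in their own configuration format, so treat this as the logic only, not as any product's actual settings.

```python
# Hypothetical tool names; a real chain would list the read tools it actually uses.
PRE_APPROVED_READS = {
    "canvas.get_variables",
    "canvas.get_screenshot",
    "board.read_items",
}


def decide(tool_name: str) -> str:
    """Return "allow" for pre-approved read tools and "ask" for everything else.

    Defaulting to "ask" means any write tool, and any tool nobody has
    reviewed yet, stops for a human click before it runs.
    """
    return "allow" if tool_name in PRE_APPROVED_READS else "ask"
```

The design choice worth noticing is the default: the list names what is allowed, and anything not on it falls through to a prompt.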
Slide 7 — Data boundaries: what may leave which system
A chain exists to move data between systems, so somebody has to decide which moves are allowed — and it should be you, before the run, not the agent, during it. Map what each server can see. Know which links are remote, because remote means the data leaves your machine. Be careful with research: participant quotes were collected under consent that probably did not mention pushing them onto a shared canvas. And in client work, make the boundary structural — per-project servers and per-project credentials, so the chain simply cannot reach the wrong client's files. Then write the allowed flows into the harness in plain language. The chain will move whatever it can reach; data boundaries are how you decide what it can reach.
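One way to make "allowed flows" concrete is a small allowlist check, sketched below. The system names, the set of allowed flows, and the rule about participant quotes are all assumptions chosen to mirror the paragraph above; your own consent terms and client boundaries decide the real list.

```python
# Hypothetical allowlist of (source, destination) flows for one project.
ALLOWED_FLOWS = {
    ("research-board", "repository"),
    ("design-canvas", "repository"),
    ("browser", "repository"),
    ("repository", "design-canvas"),
}

# Systems reached over a remote link, where data leaves the machine (assumed for this example).
REMOTE_SYSTEMS = {"research-board", "design-canvas"}


def check_flow(source, dest, contains_participant_quotes=False):
    """Return "allowed" or a "blocked: ..." reason for moving data from source to dest."""
    if (source, dest) not in ALLOWED_FLOWS:
        return "blocked: flow not in allowlist"
    if contains_participant_quotes and dest in REMOTE_SYSTEMS:
        return "blocked: participant quotes may not go to a remote system"
    return "allowed"
```

The second rule is deliberately separate from the allowlist: a flow can be fine in general and still be wrong for a particular kind of data.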
Slide 8 — Failure handling: loud and partial beats silent and complete
Now the part that separates a demo chain from one you can rely on: what happens when a link fails. The principle is simple — fail loudly and partially, never silently and completely. The most expensive thing a chain can produce is a polished-looking result built around a link that quietly broke: a token report that never actually read the variables, a review against a stale screenshot. So decide the behaviour per failure class. Credentials rejected: report the exact error and fall back to the local path if there is one. App not running: stop and ask, because the fix is human. Stale connection: retry once, then surface it. Partial reads: carry on, but label the gap. And a failed write stops the chain, full stop — nothing runs downstream of a write you cannot confirm.
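The per-class behaviour can be written down as a small policy function, sketched here. The enum and the action strings are hypothetical names for illustration. One branch is an assumption beyond what the slide states: when credentials are rejected and there is no local path, this sketch reports the error and stops.

```python
from enum import Enum


class Failure(Enum):
    CREDENTIALS_REJECTED = "credentials_rejected"
    APP_NOT_RUNNING = "app_not_running"
    STALE_CONNECTION = "stale_connection"
    PARTIAL_READ = "partial_read"
    WRITE_FAILED = "write_failed"


def next_action(failure, has_local_fallback=False, retries_so_far=0):
    """Map a failure class to the chain's next action."""
    if failure is Failure.CREDENTIALS_REJECTED:
        # Always report the exact error; fall back only if a local path exists.
        # Stopping when there is no fallback is an assumption of this sketch.
        return "report_error_then_fallback" if has_local_fallback else "report_error_and_stop"
    if failure is Failure.APP_NOT_RUNNING:
        return "stop_and_ask_human"  # the fix is human
    if failure is Failure.STALE_CONNECTION:
        # Retry once, then surface it.
        return "retry" if retries_so_far == 0 else "surface_to_human"
    if failure is Failure.PARTIAL_READ:
        return "continue_and_label_gap"
    if failure is Failure.WRITE_FAILED:
        return "stop_chain"  # nothing runs downstream of an unconfirmed write
    raise ValueError(f"unknown failure class: {failure!r}")
```

Note that no branch returns a plain "continue": every failure either stops, asks, retries within a limit, or carries a label forward.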
Slide 9 — Observability: knowing what the chain actually did
Observability for a design chain is not dashboards — it is being able to answer, afterwards, what did this chain read, what did it write, and what did it skip. Keep the tool-call trail instead of letting it vanish with the terminal. Have the chain write a small dated status record: what succeeded, what failed, what was skipped — our own sync pipeline does exactly this, and the day the remote write path was blocked, that status file was the difference between evidence and guesswork. Aim writes at reviewable surfaces: branches, named canvas pages, drafts. And watch the context spend, because a chain drowning in screenshots degrades quietly. Then review the trail, not just the deliverable — the trail is where silent failures show up.
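A dated status record can be very small. The sketch below assumes a JSON shape with three lists (succeeded, failed, skipped); the field names and the function names are hypothetical and are not the format of the pipeline mentioned above.

```python
import json
from datetime import date

OUTCOMES = ("succeeded", "failed", "skipped")


def build_status_record(run_date, results):
    """Build a status record from (link_name, outcome, note) tuples.

    Raises ValueError for an outcome outside OUTCOMES, so a typo cannot
    silently drop a failed link from the record.
    """
    record = {"date": run_date.isoformat()}
    for outcome in OUTCOMES:
        record[outcome] = []
    for link, outcome, note in results:
        if outcome not in OUTCOMES:
            raise ValueError(f"unknown outcome: {outcome!r}")
        record[outcome].append({"link": link, "note": note})
    return record


def to_json(record):
    """Serialise the record so it can be committed next to the work it describes."""
    return json.dumps(record, indent=2)
```

A run would call `build_status_record(date.today(), results)` at the end of the chain and write the JSON to a dated file on the same branch as the rest of the output.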
Slide 10 — Worked example: research to canvas to code to review
Let's trace one chain end to end. The job: take researched content, build it in the product's real materials, and verify the result against the design source. Step one, read the research board — read-only — and have a human confirm scope while changing course is still cheap. Step two, read the design canvas: variables, structure, a reference screenshot. That is the design contract. Step three, build — writing to the repository, on a branch, never to a live surface. Step four, point the browser at the result: screenshots at two breakpoints, console output, compared against the contract and reported as P0 to P3 findings. Step five, write the status record. Eight to twelve tool calls, a few minutes of agent time — and expect the first attempt to need one retry, usually because something was not running.
Slide 11 — Field evidence: where our own chain broke
This is not theoretical: we run a chain like this against this school's own design system, and it broke in instructive places. Syncing code to an OpenPencil canvas works reliably, but in one direction only; the other direction, where the agent writes back to the canvas, still depends on a desktop bridge that does not always connect. And pushing the same design system to Figma over the remote MCP server failed at OAuth registration, with a 403 before a single tool was even listed. The payload was fine; the credential boundary was the problem, and that is where these workflows most often break. Both runs still shipped, because fallbacks existed (a file-backed CLI build, a local plugin) and because every run writes a dated status file, so the next attempt started from evidence, not memory.
Slide 12 — Exercise: map the chain for one of your workflows
Your turn. Pick one workflow you already run by hand — a token sync, an implementation review, refreshing mockups with real content — and design its chain on a single page. List the steps and assign one server, or the repo, to each, marked read or write. Name the source of truth for every kind of fact. Write the permission posture you would actually check in. Mark what may leave which system. And for every link, one line of failure handling — including the fallback for the link most likely to be blocked, which is usually the remote one. Then apply the test: more than a couple of write links, or no fallback for the remote ones, and the chain is not ready to run yet. Keep the page — you will reuse this workflow in the next module.
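The readiness test at the end of the exercise can be sketched as a check over your one-page map. The dictionary keys are hypothetical, and reading "a couple" as a limit of two write links is an assumption you can change through `max_writes`.

```python
def chain_ready(links, max_writes=2):
    """Apply the exercise's test to a list of link descriptions.

    Each link is a dict with the keys "name", "mode" ("read" or "write"),
    "remote" (bool), and optionally "fallback" (a short description, or None).
    Returns (ready, reasons), where reasons lists what is still missing.
    """
    reasons = []
    writes = sum(1 for link in links if link["mode"] == "write")
    if writes > max_writes:
        reasons.append(f"too many write links: {writes} > {max_writes}")
    for link in links:
        if link["remote"] and not link.get("fallback"):
            reasons.append(f"no fallback for remote link: {link['name']}")
    return (not reasons, reasons)
```

Run it against your own map; an empty list of reasons means the chain passes this particular test, not that it is finished.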
Slide 13 — Summary, and the shared canvas ahead
Let's close. Chain when the workflow needs a comparison or a handover between systems, and use one server per role — not everything you own. Design the chain on paper first: roles, direction, a single source of truth per fact, and gates before the writes. Scope the credentials per project, pre-approve only the reads, keep writes behind a prompt. Build for failure: loud and partial, with a fallback for the credential boundary, and a hard stop on any unconfirmed write. And keep the trail — tool calls, a dated status record, writes on reviewable surfaces. In Module 4 we flip the picture: instead of one agent across many tools, several agents on one shared canvas at the same time — and the conventions and review rhythm that stop that from becoming chaos. See you there.