Slide 1 — Video as an Agent Output
Welcome to Motion and Storytelling with Agents. This first module is about a single shift: what happens when video stops being something you assemble on a timeline and becomes something you define in code. That sounds like an engineering detail. It is not. It decides whether an agent can produce video for you at all, what a revision costs, and whether your motion work stays current as the product changes. We will look at where this approach genuinely beats traditional tools, where it clearly does not, and what your role becomes when the keyframes are no longer yours to drag. Let's start with the tools most teams reach for today.
Slide 2 — Timeline tools vs code-defined video: what each is for
Here is the comparison that frames everything else. Timeline tools produce a binary project file and an exported video. Both are fine objects — until the product changes, at which point someone has to find the file, reopen it, re-edit, and re-export. Code-defined video produces a different artifact: a text composition that lives in your repository, uses your design tokens, and re-renders from one terminal command. The MP4 becomes a build output, not the thing you maintain. And notice the row about who can edit it. A timeline needs hands on a GUI. A composition is text — which means a coding agent can write it, and a reviewer can diff it.
Slide 3 — Why agents and code-defined video fit
So why do agents and code-defined video fit so well together? Because the workflow is the one you already run. The agent reads — your tokens, your theme files, your existing compositions. It generates — a composition that is just text, either HTML with timing attributes or a React component driven by the frame number. You give feedback, and it revises — by editing the composition, never by touching a rendered file. Then it re-renders, and the same composition produces the same frames every time. Compare that with a timeline tool, where the work happens through clicks and drags an agent simply cannot perform. This is not text-to-video generation. It is the agent writing reviewable code against your design system — and that is exactly what it is already good at.
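To make "driven by the frame number" concrete, here is a minimal sketch of the idea. It is not the API of any particular framework. The names `fadeInOpacity` and `titleStyleAt` and the fifteen-frame fade are assumptions made up for illustration. What matters is that the style is a pure function of the frame, which is why the same composition produces the same frames on every render.

```typescript
// Hypothetical helper: map a frame number to an opacity for a fade-in.
// It depends only on its arguments, so a given frame always yields the same value.
function fadeInOpacity(
  frame: number,
  startFrame: number,
  durationFrames: number
): number {
  if (durationFrames <= 0) {
    throw new Error("durationFrames must be positive");
  }
  const progress = (frame - startFrame) / durationFrames;
  // Clamp to the 0..1 range so frames before or after the fade stay stable.
  return Math.min(1, Math.max(0, progress));
}

// Hypothetical title style: fade in over the first 15 frames while
// sliding up by 20 pixels. Both numbers are placeholders.
function titleStyleAt(frame: number): { opacity: number; translateY: number } {
  const opacity = fadeInOpacity(frame, 0, 15);
  const translateY = (1 - opacity) * 20;
  return { opacity, translateY };
}
```

A revision such as "make the fade slower" is then a change to one number in text, which is the kind of edit an agent can make and a reviewer can diff.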
Slide 4 — Two paths to the same MP4
Here are the two paths side by side. The top lane is the familiar one: a binary project file, a hand-edited timeline, an exported video that starts going stale on the day it ships. The agent is blocked at every step — there is nothing it can read, nothing it can change, no check it can run. The bottom lane is the code-defined path. You write the brief and the script. The agent writes the composition — text, in your repository, using your tokens. It passes a review gate: diffs, lint, frame checks, a pacing pass. Then a deterministic render produces the MP4 as a build output. And the dashed line is the rule that makes the economics work: when something is wrong, you edit the composition. You never edit the video.
Slide 5 — The formats that benefit
So where does this approach genuinely win? Wherever video recurs and tracks a moving product. Feature clips and release-note videos — twenty to forty seconds, shipped every cycle, mostly on-screen text and product UI. Data stories, where the chart is drawn from a data file, so updating the video means updating the data and re-rendering. Social cuts, where one composition renders out at several aspect ratios instead of being re-laid-out by hand. And animated specs — short clips that show engineering exactly how a transition should feel, built from the same tokens the product uses. The common thread is reuse. A composition that renders once is a curiosity. One that re-renders every release is an asset.
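The social-cuts case can be sketched as a small render plan: one composition, several output sizes. Everything here is an assumption for illustration, including the preset names, the 1080-pixel short side, and the rounding to even dimensions, which some encoder settings expect. Your framework will have its own way of passing width and height to a render.

```typescript
// Hypothetical aspect-ratio presets for social cuts; names are illustrative.
interface AspectRatio {
  name: string;
  w: number;
  h: number;
}

const SOCIAL_RATIOS: AspectRatio[] = [
  { name: "landscape", w: 16, h: 9 },
  { name: "portrait", w: 9, h: 16 },
  { name: "square", w: 1, h: 1 },
  { name: "feed", w: 4, h: 5 },
];

// Round to the nearest even integer (assumed encoder-friendly).
function roundToEven(value: number): number {
  return Math.round(value / 2) * 2;
}

// Fix the shorter side and derive the longer side from the ratio.
function renderSize(
  ratio: AspectRatio,
  shortSide: number
): { width: number; height: number } {
  if (ratio.w >= ratio.h) {
    return {
      width: roundToEven((shortSide * ratio.w) / ratio.h),
      height: shortSide,
    };
  }
  return {
    width: shortSide,
    height: roundToEven((shortSide * ratio.h) / ratio.w),
  };
}

// One entry per output file, all rendered from the same composition.
const renderPlan = SOCIAL_RATIOS.map((ratio) => ({
  name: ratio.name,
  ...renderSize(ratio, 1080),
}));
```

The layout inside the composition still has to respond to those dimensions. The plan only removes the manual re-export step, not the design decisions.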
Slide 6 — The formats that do not benefit
Now the honest half of the argument. Some video should not go through this pipeline. Brand films and campaign hero pieces are taste-led, watched once, and judged on craft no lint rule can measure. Character animation and heavy 3D will technically run in these frameworks, but you will fight the tools and the result will show it. Edit-driven storytelling — rhythm cut to music — is an editor's craft. And two cases are about honesty rather than craft: a one-off internal demo is better as a screen recording, and if the point is to show how the real product behaves, record the real product. The sorting question is simple. If the value is reuse and accuracy, use the pipeline. If the value is craft or a one-time moment, brief a specialist.
Slide 7 — Cost and iteration: what a revision actually costs
Let's talk about cost honestly, because the headline saving is not where people expect it. The first version is not the big win: writing a good brief, setting up tokens, and reviewing pacing take real time, and a motion specialist may well produce a better first cut. The difference is every revision after that. In the timeline model, a copy change, a price update, or a brand refresh means reopening the project and re-exporting, every single time, by whoever owns the file. In the composition model, those same changes are a one-line edit and a render, and anyone on the team can ask the agent to make them. So the timeline model carries a roughly flat cost per revision, while the composition model front-loads its cost into setup and makes each later revision nearly free. That is the real comparison.
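If you want to test that comparison against your own numbers, a back-of-envelope model is enough. This is a sketch under stated assumptions: the function name and every figure you pass in are yours to supply, the hours in the usage note are placeholders rather than measurements, and the model ignores quality differences between the two first cuts.

```typescript
// Hypothetical break-even model. All inputs are in the same unit (for
// example hours) and are estimates you provide, not measured values.
// extraSetup:          additional up-front cost of the composition path
// timelinePerRevision: cost of one revision in the timeline model
// compPerRevision:     cost of one revision in the composition model
// Returns the smallest number of revisions at which the composition path
// is no more expensive in total, or Infinity if it never catches up.
function breakEvenRevisions(
  extraSetup: number,
  timelinePerRevision: number,
  compPerRevision: number
): number {
  const savingPerRevision = timelinePerRevision - compPerRevision;
  if (savingPerRevision <= 0) {
    return Infinity;
  }
  return Math.ceil(extraSetup / savingPerRevision);
}
```

With placeholder inputs of 8 extra setup hours, 2 hours per timeline revision, and 0.25 hours per composition revision, `breakEvenRevisions(8, 2, 0.25)` returns 5. A clip that will be revised fewer times than that is probably not a pipeline candidate.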
Slide 8 — The designer's role: script, structure, and judgment
So what is your job, if you are not the one dragging keyframes? Four things. The script — the exact words on screen and in narration, written before any composition exists. That is the single biggest quality lever in this whole workflow. The structure — shots, durations, and what each second is for. The system — the tokens, motion durations, and templates the agent has to draw from instead of inventing its own. And judgment — the pass no automated check can do: is the pacing readable, is the emphasis on the right claim, is the clip honest? Ask an agent to make it feel premium and you get placeholder enthusiasm. Give it the words, the shots, and the tokens, and you get something you can genuinely review.
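The structure you hand over can be as plain as a typed shot list. The sketch below assumes a hypothetical `Shot` shape and a helper that turns durations in seconds into back-to-back frame ranges. The field names, the 30 frames per second, and the placeholder text are illustrative and do not come from any specific tool.

```typescript
// Hypothetical shot-list shape: what the designer writes before any
// composition exists.
interface Shot {
  id: string;
  durationSeconds: number;
  onScreenText: string; // the exact words, decided up front
}

interface FrameRange {
  id: string;
  from: number; // first frame of the shot
  durationInFrames: number;
}

// Lay shots end to end and convert seconds to frames at the given rate.
function toFrameRanges(shots: Shot[], fps: number): FrameRange[] {
  let cursor = 0;
  return shots.map((shot) => {
    const durationInFrames = Math.round(shot.durationSeconds * fps);
    const range = { id: shot.id, from: cursor, durationInFrames };
    cursor += durationInFrames;
    return range;
  });
}

// Placeholder shot list; the text and durations are examples only.
const exampleShots: Shot[] = [
  { id: "title", durationSeconds: 3, onScreenText: "Placeholder headline" },
  { id: "claim", durationSeconds: 4.5, onScreenText: "Placeholder claim" },
];

const exampleRanges = toFrameRanges(exampleShots, 30);
```

Because the words and durations live in data, changing a line of copy or trimming a shot does not require touching any animation code.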
Slide 9 — The artifact discipline: the composition is the deliverable
One more idea before the worked example, and it is a discipline rather than a tool. The composition is the deliverable. It lives in version control next to the product, it gets reviewed like any other artifact, and the MP4 it produces is a build output — replaceable, re-renderable, never hand-edited. This is the same pattern the wider field is converging on: HTML artifacts are already becoming standard design deliverables, and a video composition is that pattern with a timeline attached. The failure mode is tempting and quiet: someone tweaks an exported MP4 by hand because it feels faster, the source and the published video diverge, and the next re-render silently undoes the fix. Hold the line. Edit the composition.
Slide 10 — Worked example: one feature clip, two ways
Let's make this concrete with a real trace. The clip: a twenty-seven-second feature announcement with a title card, two claim shots, and a closing card. The brief gave the agent four shots with durations, the exact on-screen words, and the token source, and told it to run the lint and inspect gates before calling it ready. The agent produced four shot files and a root timeline using the project's theme tokens. The gates earned their keep: lint caught a timing overlap, and a frame check caught a dead animation, a timeline that was registered but never advanced, so the shot was secretly a still image. Both were fixed by editing the composition. Brief to approval took roughly forty-five to sixty minutes, a figure that is an estimate and labelled as such. Compare that with the days to weeks a commissioned edit usually takes, most of it spent waiting.
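The two catches in that trace can be approximated with very little code. This is a sketch of the idea, not the lint or inspect tooling used in the example: `findOverlaps` and `isDeadAnimation` are hypothetical names, and a real gate would need to allow intentional overlaps such as crossfades.

```typescript
interface TimedShot {
  id: string;
  from: number; // first frame of the shot
  durationInFrames: number;
}

// Timing lint: after sorting by start frame, flag each shot that begins
// before the shot ahead of it has ended.
function findOverlaps(shots: TimedShot[]): [string, string][] {
  const sorted = [...shots].sort((a, b) => a.from - b.from);
  const overlaps: [string, string][] = [];
  for (let i = 1; i < sorted.length; i++) {
    const prev = sorted[i - 1];
    const curr = sorted[i];
    if (curr.from < prev.from + prev.durationInFrames) {
      overlaps.push([prev.id, curr.id]);
    }
  }
  return overlaps;
}

// Frame check: sample an animated value across the shot. If every sample
// equals the first frame's value, the animation never advanced.
function isDeadAnimation(
  valueAt: (frame: number) => number,
  from: number,
  durationInFrames: number,
  samples: number = 5
): boolean {
  const first = valueAt(from);
  for (let i = 1; i <= samples; i++) {
    const frame = from + Math.floor((i * (durationInFrames - 1)) / samples);
    if (valueAt(frame) !== first) {
      return false;
    }
  }
  return true;
}
```

Neither check judges whether the clip is good. They only catch the mechanical failures, which leaves the human pass free for pacing and emphasis.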
Slide 11 — Exercise: the video work you keep postponing
Time to make this yours, and you will not need any tools. Write down three recurring video needs your team postpones or quietly skips — the feature walkthrough that never gets made, the release video that happened once, the data story that lives in a static chart. For each one, note how often it would recur and what makes it go stale. Then sort them: pipeline candidate, specialist brief, or honest screen recording. Take the strongest pipeline candidate and draft its shot list — durations and the exact words on screen. Keep that page. It becomes your working example for the next two modules. And if nothing on your list recurs, that is a real finding too: you may not need this pipeline yet.
Slide 12 — Summary, and what comes next
Let's close the module. Code-defined video turns motion into a text artifact — diffable, built from your tokens, re-renderable from the terminal — and that is what lets an agent produce and revise it inside the loop you already run. The approach wins on recurring, structured formats that track a changing product; brand films, character work, and one-off demos still belong elsewhere. The cost is front-loaded into the brief, the tokens, and the gates, and then revisions become an edit and a render. Your job is the script, the structure, the system, and the judgment — and the composition, not the MP4, is the deliverable. In Module 2 we open the tools: Remotion and hyperframe-style sequences, and how to brief them. See you there.