
For years, turning a web page into a professional MP4 video meant one of three things: screen recording with OBS (and praying for no dropped frames), wrangling a timeline in After Effects, or wrestling with cloud rendering services that charged by the second. None of those were developer-friendly, and none of them worked for AI agents.
HyperFrames changes all of that.
Released by HeyGen and open-sourced under Apache 2.0, HyperFrames is an HTML-to-video rendering engine that's already racked up over 33,000 GitHub stars and 2,100+ commits in a matter of months. The pitch is deceptively simple: write HTML, render video. But the implications are massive — especially in a world where AI coding agents are increasingly doing the writing.
At its core, HyperFrames treats every HTML element as a video "clip." You define a timeline directly in your markup using data-* attributes:
```html
<div id="stage" data-composition-id="launch"
     data-width="1920" data-height="1080">
  <video class="clip" data-start="0" data-duration="6"
         data-track-index="0" src="intro.mp4" muted></video>
  <h1 id="title" class="clip" data-start="1" data-duration="4"
      data-track-index="1">Launch day</h1>
  <audio data-start="0" data-duration="6"
         data-track-index="2" src="music.wav"></audio>
</div>
```
The "Rule of Three" governs every composition: a root element with composition metadata, timed elements with class="clip" and timing attributes, and animations registered on window.__timelines with { paused: true }.
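To make the timing attributes concrete, here is a minimal sketch of how start and duration values could decide which clips are on screen at a given moment. It is not HyperFrames' own code: the clip objects, their field names, the `intro-video` and `music` labels, and the half-open visibility interval are all assumptions made for illustration.

```javascript
// Hypothetical in-memory mirror of the markup above: one object per timed
// element. Field names and the "intro-video"/"music" labels are illustrative.
const clips = [
  { id: "intro-video", start: 0, duration: 6, trackIndex: 0 },
  { id: "title", start: 1, duration: 4, trackIndex: 1 },
  { id: "music", start: 0, duration: 6, trackIndex: 2 },
];

// Assumption: a clip is active on the half-open interval
// [start, start + duration), so it is gone exactly at its end time.
function activeClips(allClips, timeSeconds) {
  return allClips
    .filter((c) => timeSeconds >= c.start && timeSeconds < c.start + c.duration)
    .sort((a, b) => a.trackIndex - b.trackIndex)
    .map((c) => c.id);
}

// Composition length: the latest end time across all clips.
function compositionDuration(allClips) {
  return allClips.reduce((end, c) => Math.max(end, c.start + c.duration), 0);
}

console.log(activeClips(clips, 0.5)); // [ 'intro-video', 'music' ]
console.log(activeClips(clips, 2));   // [ 'intro-video', 'title', 'music' ]
console.log(compositionDuration(clips)); // 6
```

Under these assumptions the title appears one second in and disappears at the five-second mark, while the video and audio tracks run for the full six seconds.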
Under the hood, the rendering pipeline is refreshingly straightforward: your HTML loads in headless Chrome, the engine seeks each frame precisely (frame = floor(time * fps)), captures it via Chrome's beginFrame API, and pipes everything through FFmpeg for encoding. Same input, identical output — every single time. This determinism is a game-changer for CI/CD pipelines, automated testing, and any workflow where reproducibility matters.
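The frame formula and the determinism claim can be illustrated with a small sketch. Only `frame = floor(time * fps)` comes from the text above. The frame-sampling rule, the fade animation, and every function name here are hypothetical stand-ins, not the engine's implementation.

```javascript
// Frame index for a timestamp, using the formula quoted above.
function frameIndex(timeSeconds, fps) {
  return Math.floor(timeSeconds * fps);
}

// Assumption: frame n is sampled at n / fps seconds, and a composition of
// the given duration needs ceil(duration * fps) frames.
function frameTimes(durationSeconds, fps) {
  const total = Math.ceil(durationSeconds * fps);
  return Array.from({ length: total }, (_, n) => n / fps);
}

// Hypothetical seekable animation: opacity is a pure function of time
// (a 0.5 s fade-in starting at 1 s), so it never depends on wall-clock
// playback or on how long a frame took to capture.
function titleOpacityAt(timeSeconds) {
  const fadeStart = 1;
  const fadeLength = 0.5;
  const progress = (timeSeconds - fadeStart) / fadeLength;
  return Math.min(1, Math.max(0, progress));
}

// One "render pass": evaluate the animation at every frame time.
function renderPass(durationSeconds, fps) {
  return frameTimes(durationSeconds, fps).map(titleOpacityAt);
}

console.log(frameIndex(2.5, 24));        // 60
console.log(frameTimes(6, 30).length);   // 180
console.log(titleOpacityAt(1.25));       // 0.5
```

Because every value is derived from the frame time alone, two passes over the same input produce the same sequence, which is the property the pipeline relies on when it seeks frame by frame instead of recording live playback.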
Run npx hyperframes preview and you get live-reload previews in your browser. Run npx hyperframes render --output demo.mp4 and you've got a finished video. No build step. No proprietary format. Just HTML.

Here's the insight that makes HyperFrames genuinely special: LLMs are already excellent at writing HTML, arguably the format they generate most proficiently. Most video tools require complex APIs or drag-and-drop interfaces that agents can't operate. HyperFrames compositions are plain HTML documents, so an agent can write one directly.
The framework ships 21 specialized skills that coding agents load on demand. Install them with a single command:
```bash
npx skills add heygen-com/hyperframes
```
Then you can literally tell your coding agent:
"Using /hyperframes, create a 10-second product intro with a fade-in title, a background video, and subtle background music."
The agent plans the video, writes valid HTML, wires seekable animations, adds media, lints, previews, and renders — all autonomously. The CLI is non-interactive by default (flag-driven with plain text output), so agents can drive every command without prompts or parsing.
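As a rough illustration of what "flag-driven with plain text output" allows, the sketch below shows how a script or agent harness might invoke the render command from Node.js. Only `render` and `--output` are taken from this article. The helper names are hypothetical, and no other flags, exit codes, or output formats are assumed.

```javascript
const { spawnSync } = require("node:child_process");

// Argument list for the render command quoted in the article.
// Only `render` and `--output` come from the text; nothing else is assumed.
function renderArgs(outputPath) {
  return ["hyperframes", "render", "--output", outputPath];
}

// Hypothetical helper: run the render with stdin closed, as an unattended
// agent would, and hand back the plain-text output for inspection.
// On Windows, `npx` may need the `shell: true` option.
function runRender(outputPath, projectDir) {
  const result = spawnSync("npx", renderArgs(outputPath), {
    cwd: projectDir,
    encoding: "utf8",
    stdio: ["ignore", "pipe", "pipe"],
  });
  return {
    ok: result.status === 0,
    stdout: result.stdout ?? "",
    stderr: result.stderr ?? "",
  };
}

// Example call (not executed here):
// const outcome = runRender("demo.mp4", "./my-video");
// if (!outcome.ok) console.error(outcome.stderr);
```

Closing stdin is the point of the sketch: a command that never prompts can be driven this way without the caller having to answer questions mid-run.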
The skill ecosystem covers the full production spectrum: /product-launch-video, /website-to-video, /faceless-explainer, /pr-to-video, /talking-head-recut, /motion-graphics, /music-to-video, and more. Need animated captions on an existing talking-head video? /embedded-captions. Want to port a Remotion project? /remotion-to-hyperframes.
Plugins exist for Claude Code, Cursor, Codex, and Gemini CLI. This isn't a side project — it's a full-fledged platform with cloud rendering via AWS Lambda (lambda deploy / render / progress) and Google Cloud Run adapters.

If you've been in the programmatic video space, you've almost certainly encountered Remotion — the React-based video creation library that's been the gold standard for years. Both tools render video with headless Chrome and FFmpeg. So what's the difference?
Remotion is built for human developers. It requires a Node.js/React setup, familiarity with React component patterns, and a non-trivial amount of boilerplate. It's excellent for teams building video tooling into applications or automating production pipelines — but the scaffolding is heavy for quick, agent-driven tasks.
HyperFrames strips the abstraction layer away. No React. No component system. No build step. The agent writes a plain HTML file — the kind any capable LLM can produce in seconds — and the renderer does the rest.
| Criteria | Remotion | HyperFrames |
|---|---|---|
| Primary users | React developers | AI agents, developers |
| Language | React (JSX) | Plain HTML/CSS/JS |
| Setup complexity | Moderate | Low |
| AI agent suitability | Medium | High |
| Best for | Production pipelines | Agentic, lightweight video |

The verdict? If you're a team building a long-term video product, Remotion is likely the better investment. If you're building AI agents that need to produce video content without complex toolchains, or you just want to go from idea to MP4 in 60 seconds, HyperFrames is the clear winner.
The use cases span far beyond simple text animations. The catalog includes 50+ ready-to-use blocks and components: shader transitions (flash-through-white), social overlays (instagram-follow), animated charts (data-chart), and cinematic effects. Install any of them with npx hyperframes add <block-name>.
Built-in design templates range from the professional swiss-grid (corporate/technical demos) to the editorial nyt-graph (data storytelling) and the energetic play-mode (social media).
Requirements: Node.js 22+ and FFmpeg. That's it.
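Both requirements can be checked up front with a few lines of Node.js. This is a hedged sketch, not part of HyperFrames: the function names are made up for illustration, and it assumes FFmpeg is installed as an `ffmpeg` executable on the PATH.

```javascript
const { spawnSync } = require("node:child_process");

// Major version from a string such as "22.3.0".
function nodeMajor(version) {
  return Number.parseInt(version.split(".")[0], 10);
}

// The article states Node.js 22+ as the minimum.
function meetsNodeRequirement(version, minimumMajor = 22) {
  return nodeMajor(version) >= minimumMajor;
}

// Assumption: FFmpeg is reachable as `ffmpeg` on the PATH.
// `ffmpeg -version` prints version info and exits with status 0.
function hasFfmpeg() {
  const result = spawnSync("ffmpeg", ["-version"], { stdio: "ignore" });
  return result.error === undefined && result.status === 0;
}

console.log("Node.js 22+:", meetsNodeRequirement(process.versions.node));
console.log("FFmpeg on PATH:", hasFfmpeg());
```

If either line prints `false`, install or upgrade that dependency before scaffolding a project.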
```bash
# Scaffold a new project
npx hyperframes init my-video

# Jump in
cd my-video

# Preview with live reload
npx hyperframes preview

# Render to MP4
npx hyperframes render --output final.mp4
```
For AI agent workflows, install the skills:
```bash
npx skills add heygen-com/hyperframes
```
Then describe what you want. No timeline editors. No proprietary formats. Just code.
HyperFrames represents something more significant than just another video tool. It validates a thesis that's been brewing for a while: as AI agents become first-class participants in software development, the tools that win will be the ones agents can actually use.
HTML is the universal language of the web — and now, thanks to HyperFrames, it's also the universal language of programmatic video. The barrier between "building for the browser" and "producing video content" has effectively collapsed. With 33K stars on GitHub, 21 agent skills, cloud rendering, and a release cadence that's pushing updates every few days (v0.7.25 dropped on July 2, 2026), this is a project worth paying attention to.
Stop dragging clips. Start writing them.