
Introduction

What is Montaj, how it works, and who it's for.

Montaj

Montaj is an open-source video editing toolkit for AI agents. It is CLI-first, agent-native, and runs entirely on your local machine.

What Montaj Is

Montaj is a CLIP — a CLI Program for agents. It clips onto your existing AI agent (Claude Code, Cursor, or any harness) and gives it the specialized tools to edit video. Built-in steps cover the full editing pipeline. The agent decides what to run, in what order, and with what parameters.

The fundamental dependency is an agent. Montaj does not edit on its own. It provides the tools; the agent makes the creative decisions.

Who It's For

  • AI agent developers who want to give their agents video editing capabilities
  • Content creators who use AI coding assistants and want to automate post-production
  • Developers building video processing pipelines with agent orchestration

How It Works

1. You upload clips and write an editing prompt.
2. Montaj creates project.json with status [pending].
3. The agent picks up the project, reads the workflow, and calls steps as tools.
4. The agent writes project.json as it works; the UI updates live via Server-Sent Events (SSE).
5. The agent marks the project [draft].
6. Optionally, a human reviews the draft in the browser, makes tweaks, and marks it [final].
7. The render engine produces the final MP4.
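
The status labels in the steps above behave like a small state machine. The following is a minimal sketch of that lifecycle, not Montaj's actual code: the `status`, `prompt`, and `clips` field names are assumptions, and so is the rule that a project must pass through draft before it can become final.

```python
import json

# Hypothetical transition table built from the bracketed labels above.
# Assumption: a project cannot jump from pending straight to final.
ALLOWED = {"pending": {"draft"}, "draft": {"final"}, "final": set()}


def advance(project: dict, new_status: str) -> dict:
    """Return a copy of the project with its status moved to new_status."""
    current = project["status"]
    if new_status not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current!r} to {new_status!r}")
    return {**project, "status": new_status}


# Illustrative project.json contents; the real schema may differ.
project = {"status": "pending", "prompt": "Cut a 30-second reel", "clips": ["a.mp4"]}
project = advance(project, "draft")  # the agent finishes its pass
project = advance(project, "final")  # a human approves the draft
print(json.dumps(project, indent=2))
```

Returning a copy instead of mutating in place keeps each write to project.json an explicit step, which fits a UI that reacts to file changes.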

What's Inside

steps/              Step executables + JSON schemas (probe, trim, transcribe, etc.)
workflows/          Suggested editing plans (overlays.json, tight-reel.json, etc.)
render/             React + Puppeteer + ffmpeg render engine
serve/              Local HTTP + SSE server (montaj serve)
ui/                 Browser UI (Vite + React + Tailwind)
connectors/         External API wrappers (Kling, Gemini, OpenAI)
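
The serve/ component pushes updates to the browser over SSE. As a rough illustration of what a client of such a stream has to do, here is a small parser for the standard `text/event-stream` wire format. The event name `project` and the JSON payload are made up for the example; Montaj's real event names and endpoint are not specified here.

```python
import json
from typing import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per event from the lines of a text/event-stream body."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the event; events without data are dropped.
            if data:
                yield {"event": event, "data": "\n".join(data)}
            event, data = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # the format strips one leading space
        if field == "event":
            event = value
        elif field == "data":
            data.append(value)


# Hypothetical stream: one keep-alive comment, then one project update.
sample = [
    ": keep-alive\n",
    "event: project\n",
    'data: {"status": "draft"}\n',
    "\n",
]
for evt in parse_sse(sample):
    print(evt["event"], json.loads(evt["data"])["status"])
```

The function takes any iterable of lines, so the same code can be fed from a list in a test or from a streaming HTTP response.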

Key Principles

  • Agent-native interface — CLI, HTTP, and MCP; steps are callable from any harness without writing code
  • Editing existing footage — trim, cut, transcribe, composite against source clips
  • Animation generation — agent can generate React overlay components rendered frame-by-frame via headless Chrome
  • Local-first — ffmpeg + whisper.cpp, no external APIs required (just an agent)
  • Open source — MIT, self-hosted, no vendor lock-in
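
To make the local-first point concrete, here is a sketch of the kind of ffmpeg invocation a trim step could build. It is an illustration under assumptions, not Montaj's trim implementation: the function name and file names are hypothetical, and only standard ffmpeg options are used (`-ss`, `-i`, `-t`, `-c copy`, `-y`).

```python
import shlex


def trim_command(src: str, dst: str, start: float, end: float) -> list[str]:
    """Build an ffmpeg argv that copies the [start, end) range of src to dst."""
    if start < 0 or end <= start:
        raise ValueError("need 0 <= start < end")
    return [
        "ffmpeg", "-y",              # overwrite dst without prompting
        "-ss", f"{start:.3f}",       # seek the input to the start time
        "-i", src,
        "-t", f"{end - start:.3f}",  # keep this many seconds
        "-c", "copy",                # stream copy, no re-encode
        dst,
    ]


cmd = trim_command("in.mp4", "out.mp4", start=2.0, end=5.5)
print(shlex.join(cmd))
# prints: ffmpeg -y -ss 2.000 -i in.mp4 -t 3.500 -c copy out.mp4
```

Stream copy is fast but cuts land on keyframes, so a frame-accurate trim would need to re-encode instead. Passing the result to `subprocess.run(cmd, check=True)` would execute it on a machine with ffmpeg installed.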

Agent Agnostic

Montaj exposes three interfaces for agents to call steps: CLI, MCP, and HTTP API. None is mandatory; the agent uses whichever it has access to. All three wrap the same underlying executables.
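
One way to picture "all three wrap the same underlying executables" is three thin adapters that normalize their input and hand it to one shared function. The sketch below only builds the argument list and runs nothing. The `steps/<name>` path, the `--key value` flag convention, and the HTTP body shape are all assumptions made for illustration; the MCP adapter relies only on the `name` and `arguments` fields of a standard tool call.

```python
import json
import shlex


def step_argv(step: str, params: dict) -> list[str]:
    """Build the argv for a step executable (hypothetical path and flags)."""
    argv = [f"steps/{step}"]
    for key in sorted(params):
        argv += [f"--{key}", str(params[key])]
    return argv


def from_cli(line: str) -> list[str]:
    """CLI-style input, e.g. 'trim --start 2 --end 5'."""
    step, *rest = shlex.split(line)
    params = {rest[i].lstrip("-"): rest[i + 1] for i in range(0, len(rest), 2)}
    return step_argv(step, params)


def from_http(body: str) -> list[str]:
    """HTTP-style input: a JSON body with 'step' and 'params' (assumed shape)."""
    request = json.loads(body)
    return step_argv(request["step"], request["params"])


def from_mcp(tool_call: dict) -> list[str]:
    """MCP-style input: a tool call carrying 'name' and 'arguments'."""
    return step_argv(tool_call["name"], tool_call["arguments"])


cli = from_cli("trim --start 2 --end 5")
http = from_http('{"step": "trim", "params": {"start": 2, "end": 5}}')
mcp = from_mcp({"name": "trim", "arguments": {"start": 2, "end": 5}})
print(cli == http == mcp)
# prints: True
```

Because every adapter funnels into `step_argv`, a step behaves the same no matter which interface the agent happens to use.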