Connectors
External API integrations — Kling, Gemini, OpenAI — and credential management.
Connectors are Python modules in connectors/ that wrap external vendor APIs. They turn a vendor's SDK or HTTP endpoints into clean Python functions that Montaj steps can call.
Overview
The Layering Rule
Connectors are organized by vendor. Steps are organized by use case.
| Layer | Organized by | Example |
|---|---|---|
| connectors/<vendor>.py | Vendor: one file per API key | connectors/gemini.py handles video analysis + image gen |
| steps/<verb>_<noun>.py | Use case: one file per action | steps/generate_image.py dispatches to gemini or openai |
A vendor like Gemini unlocks multiple use cases (video analysis, image generation) through one API key and one SDK. The connector owns auth, request shape, polling, and response normalization. The step owns the agent-facing interface.
Architecture
cli/commands/<step>.py # thin argparse wrapper (agent-facing)
serve/server.py # /api/steps/{name} dispatch (agent-facing)
mcp/server.js # introspects CLI parsers (agent-facing)
│
▼
steps/<verb>_<noun>.py # argparse + fail() + stdout (one per use case)
│
▼
connectors/<vendor>.py # SDK/HTTP calls (one per vendor)
│
▼
lib/credentials.py # ~/.montaj/credentials.json + env override
Key Rules
- Connectors are never agent-callable directly: workflows, CLI, HTTP API, and MCP all dispatch to steps
- No vendor SDK at import time: imports are lazy, inside functions
- Credentials only from lib.credentials: never read env vars directly
- Errors via ConnectorError: step code catches and translates to fail()
- Long operations block: connectors return when done, they don't return job IDs
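The rules above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual connector code: `load_sdk`, `wait_for_job`, the `poll` callback, and the status strings are hypothetical names chosen for the example; only `ConnectorError` and the `montaj install connectors` command come from this page.

```python
import importlib
import time


class ConnectorError(Exception):
    """Single error type that step code catches and translates to fail()."""


def load_sdk(module_name):
    """Import a vendor SDK lazily, so importing the connector never requires it.

    Hypothetical helper: a real connector might simply write the import
    statement inside the function that needs it.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ConnectorError(
            f"{module_name} is not installed; run: montaj install connectors"
        ) from exc


def wait_for_job(poll, job_id, interval=2.0, timeout=600.0):
    """Block until the vendor job finishes and return its result.

    `poll` is an assumed callback returning (status, result). The caller
    receives the finished result, never the job ID.
    """
    deadline = time.monotonic() + timeout
    while True:
        status, result = poll(job_id)
        if status == "succeeded":
            return result
        if status == "failed":
            raise ConnectorError(f"job {job_id} failed")
        if time.monotonic() >= deadline:
            raise ConnectorError(f"job {job_id} timed out after {timeout}s")
        time.sleep(interval)
```

Keeping polling inside the connector means a step only ever sees a finished result or a `ConnectorError`, which keeps the agent-facing interface synchronous.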
Installing
montaj install connectors # installs pyjwt, requests, google-genai, openai
montaj install credentials # interactive setup for API keys
Current Connectors
| Vendor | Functions | Steps | Model(s) | Credentials |
|---|---|---|---|---|
| kling.py | generate | kling_generate | kling-v3-omni | kling.access_key, kling.secret_key |
| gemini.py | analyze_media, generate_image | analyze_media, generate_image | gemini-2.5-flash, gemini-3-pro-image-preview | gemini.api_key |
| openai.py | generate_image | generate_image | gpt-image-1 | openai.api_key |
Multi-Provider Steps
A single step can dispatch to multiple connectors. For example, steps/generate_image.py dispatches to either connectors/gemini.py or connectors/openai.py based on a --provider flag:
montaj generate-image --prompt "portrait" --provider gemini --out portrait.png
montaj generate-image --prompt "portrait" --provider openai --out portrait.png
Same step name, same CLI interface, different backend.
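A rough sketch of how such a step might dispatch on --provider follows. It is an assumption-laden illustration: the two `_..._generate_image` functions are stand-ins for the real connector calls, the `PROVIDERS` table and the return shape are invented for the example, and the actual steps/generate_image.py may be organized differently.

```python
import argparse
import sys


class ConnectorError(Exception):
    """Stand-in for the error type raised by connectors."""


def _gemini_generate_image(prompt, out):
    # Stand-in for connectors.gemini.generate_image (hypothetical signature).
    return {"provider": "gemini", "path": out}


def _openai_generate_image(prompt, out):
    # Stand-in for connectors.openai.generate_image (hypothetical signature).
    return {"provider": "openai", "path": out}


# One step, several backends: the flag value selects the connector function.
PROVIDERS = {
    "gemini": _gemini_generate_image,
    "openai": _openai_generate_image,
}


def fail(message):
    """Report an error on stderr and exit non-zero (assumed behaviour of fail())."""
    print(f"error: {message}", file=sys.stderr)
    raise SystemExit(1)


def build_parser():
    parser = argparse.ArgumentParser(prog="generate-image")
    parser.add_argument("--prompt", required=True)
    parser.add_argument("--out", required=True)
    parser.add_argument("--provider", choices=sorted(PROVIDERS), default="gemini")
    return parser


def main(argv=None):
    args = build_parser().parse_args(argv)
    try:
        result = PROVIDERS[args.provider](args.prompt, args.out)
    except ConnectorError as exc:
        fail(str(exc))
    print(result["path"])  # the step reports its output on stdout
    return result
```

With this shape, adding a provider means adding one connector function and one table entry; the CLI surface stays the same.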
Kling v3
Kling v3 Omni is the video generation connector. It supports text-to-video, image-to-video (first frame, last frame), and reference-guided generation.
Setup
montaj install connectors
montaj install credentials --provider kling --key access_key --value YOUR_ACCESS_KEY
montaj install credentials --provider kling --key secret_key --value YOUR_SECRET_KEY
Or interactively:
montaj install credentials # select kling, enter keys
Credentials are stored in ~/.montaj/credentials.json (0600 permissions).
Credentials
| Key | Description |
|---|---|
| kling.access_key | Kling API access key |
| kling.secret_key | Kling API secret key |
Get your keys at app.klingai.com.
Usage
Text-to-Video
montaj kling-generate \
--prompt "a calico cat walking through a sunlit kitchen, cinematic" \
--out /tmp/cat.mp4
Image-to-Video (First Frame)
montaj kling-generate \
--prompt "slow zoom in" \
--first-frame frame.png \
--out /tmp/zoom.mp4
Image-to-Video (First + Last Frame)
montaj kling-generate \
--prompt "character walks left" \
--first-frame start.png \
--last-frame end.png \
--out /tmp/walk.mp4
Style Reference
montaj kling-generate \
--prompt "same style" \
--ref-image style1.png \
--ref-image style2.png \
--out /tmp/styled.mp4
Pro Mode
montaj kling-generate \
--prompt "cinematic scene" \
--out /tmp/pro.mp4 \
--mode pro \
--duration 10 \
--aspect-ratio 9:16
Parameters
| Param | Default | Description |
|---|---|---|
| --prompt | required | Generation prompt |
| --out | required | Output file path |
| --first-frame <img> | - | Starting image for image-to-video |
| --last-frame <img> | - | Ending image (requires --first-frame) |
| --ref-image <img> | - | Style reference image (repeatable, max 3) |
| --duration <3-15> | - | Video duration in seconds |
| --negative-prompt | - | What to avoid in generation |
| --sound <on\|off> | - | Enable/disable sound |
| --aspect-ratio | - | 16:9, 9:16, 1:1 |
| --mode <std\|pro> | std | Standard (cheaper/faster) or Pro (higher quality) |
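The constraints in the table (last frame needs a first frame, at most three reference images, duration between 3 and 15) could be enforced along these lines. This is a sketch only: the `validate` helper is hypothetical, and treating --duration as an integer number of seconds is an assumption, not something this page states.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="kling-generate")
    parser.add_argument("--prompt", required=True)
    parser.add_argument("--out", required=True)
    parser.add_argument("--first-frame")
    parser.add_argument("--last-frame")
    parser.add_argument("--ref-image", action="append", default=[])  # repeatable
    parser.add_argument("--duration", type=int)  # assumed to be whole seconds
    parser.add_argument("--negative-prompt")
    parser.add_argument("--sound", choices=["on", "off"])
    parser.add_argument("--aspect-ratio", choices=["16:9", "9:16", "1:1"])
    parser.add_argument("--mode", choices=["std", "pro"], default="std")
    return parser


def validate(args):
    """Return a list of problems; an empty list means the arguments are consistent."""
    problems = []
    if args.last_frame and not args.first_frame:
        problems.append("--last-frame requires --first-frame")
    if len(args.ref_image) > 3:
        problems.append("--ref-image may be given at most 3 times")
    if args.duration is not None and not 3 <= args.duration <= 15:
        problems.append("--duration must be between 3 and 15 seconds")
    return problems
```

Checking cross-flag rules before any network call lets the step fail fast, without spending a generation request on arguments the API would reject.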
Model
Uses kling-v3-omni (hardcoded). The connector handles JWT authentication, request construction, polling for completion, and downloading the result.
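To make the JWT step concrete, here is one way a short-lived token could be built from the two credentials. Everything specific in it is an assumption for illustration: the HS256 algorithm, the `iss`/`exp`/`nbf` claims, the 30-minute lifetime, and the `make_token` name are not confirmed by this page, so check Kling's API documentation before relying on them. The sketch uses only the standard library; with pyjwt installed, `jwt.encode(payload, secret_key, algorithm="HS256")` would produce an equivalent token.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_token(access_key, secret_key, now=None, ttl=1800):
    """Build an HS256-signed JWT (assumed claim layout, see the note above)."""
    now = int(time.time()) if now is None else int(now)
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {
        "iss": access_key,   # assumed: the access key identifies the caller
        "exp": now + ttl,    # assumed: token expires after `ttl` seconds
        "nbf": now - 5,      # assumed: small allowance for clock skew
    }
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    signature = hmac.new(
        secret_key.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256
    ).digest()
    return f"{signing_input}.{_b64url(signature)}"
```

The secret key never leaves the machine in this scheme; only the signed token is sent with each request.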
Gemini
The Gemini connector wraps Google's Gemini API for two use cases: media analysis (video, audio, image) and image generation.
Setup
montaj install connectors
montaj install credentials --provider gemini --key api_key --value YOUR_API_KEY
Or interactively:
montaj install credentials # select gemini, enter key
Credentials
| Key | Description |
|---|---|
| gemini.api_key | Google Gemini API key |
Get your key at ai.google.dev.
Media Analysis (analyze_media)
Analyze any media file — video, audio, or image — with a natural language prompt.
montaj analyze-media clip.mp4 --prompt "Describe the scene in 2 sentences."
montaj analyze-media song.mp3 --prompt "Transcribe with timestamps."
montaj analyze-media photo.jpg --prompt "Return JSON: {subject, mood, dominant_colors}" --json-output
montaj analyze-media clip.mp4 --prompt "..." --model gemini-2.5-pro
| Param | Default | Description |
|---|---|---|
| <input> | required | Media file (video, audio, image) |
| --prompt | required | Analysis prompt |
| --model | gemini-2.5-flash | Model override |
| --json-output | - | Request structured JSON from the model |
| --out | - | Write output to file |
Note: Images under approximately 18 MB take a fast inline path — no Files API round-trip needed.
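The path choice described in the note might look roughly like this. It is a guess at the shape of the logic, not the connector's real code: the exact byte threshold, the extension list, the `choose_upload_path` name, and the assumption that everything else goes through the Files API are all illustrative.

```python
import os

# Assumed threshold; the note above only says "approximately 18 MB".
INLINE_LIMIT_BYTES = 18 * 1024 * 1024

# Assumed set of extensions treated as images for this sketch.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp", ".gif"}


def choose_upload_path(path, size_bytes=None):
    """Return "inline" for small images, "files_api" for everything else."""
    extension = os.path.splitext(path)[1].lower()
    size = os.path.getsize(path) if size_bytes is None else size_bytes
    if extension in IMAGE_EXTENSIONS and size < INLINE_LIMIT_BYTES:
        return "inline"      # send the bytes in the request itself
    return "files_api"       # upload first, then reference the uploaded file
```

For example, `choose_upload_path("photo.jpg", size_bytes=1024)` selects the inline path, while a video of any size falls through to the upload path under these assumptions.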
Image Generation (generate_image)
Generate images from text prompts, optionally conditioned on reference images.
montaj generate-image --prompt "portrait, studio lighting" --out portrait.png
montaj generate-image --prompt "same character, profile view" --ref-image portrait.png --out profile.png
montaj generate-image --prompt "..." --provider gemini --aspect-ratio 9:16 --out tall.png
| Param | Default | Description |
|---|---|---|
| --prompt | required | Generation prompt |
| --out | required | Output file path |
| --ref-image | - | Reference image (repeatable) |
| --aspect-ratio | - | Aspect ratio (Gemini-specific) |
| --model | gemini-3-pro-image-preview | Model override |
Models
| Use Case | Default Model |
|---|---|
| Media analysis | gemini-2.5-flash |
| Image generation | gemini-3-pro-image-preview |
OpenAI
The OpenAI connector wraps OpenAI's image generation API.
Setup
montaj install connectors
montaj install credentials --provider openai --key api_key --value YOUR_API_KEY
Or interactively:
montaj install credentials # select openai, enter key
Credentials
| Key | Description |
|---|---|
| openai.api_key | OpenAI API key |
Get your key at platform.openai.com.
Image Generation (generate_image)
Generate images from text prompts, optionally with reference images.
montaj generate-image --prompt "red apple on white table" --provider openai --out apple.png
montaj generate-image --prompt "same scene, sunset" --provider openai --ref-image scene.png --out sunset.png
| Param | Default | Description |
|---|---|---|
| --prompt | required | Generation prompt |
| --out | required | Output file path |
| --provider | - | Must be openai to use this connector |
| --ref-image | - | Reference image (repeatable) |
| --size <WxH> | - | Image dimensions |
| --model | gpt-image-1 | Model override |
Gemini vs. OpenAI
The generate_image step supports both Gemini and OpenAI as providers. Choose based on:
- Gemini: supports the --aspect-ratio flag, useful when you need a specific aspect ratio
- OpenAI: supports the --size WxH flag for explicit dimensions, and produces a different artistic style
# Gemini (default)
montaj generate-image --prompt "portrait" --out portrait.png
# OpenAI
montaj generate-image --prompt "portrait" --provider openai --out portrait.png
Credentials
API credentials for external connectors live in ~/.montaj/credentials.json with 0600 permissions.
Installation Methods
Interactive
montaj install credentials
# Prompts for provider selection and key entry
Scripted (CI/Automation)
montaj install credentials --provider kling --key access_key --value YOUR_KEY
montaj install credentials --provider kling --key secret_key --value YOUR_KEY
montaj install credentials --provider gemini --key api_key --value YOUR_KEY
montaj install credentials --provider openai --key api_key --value YOUR_KEY
Check Status
montaj install credentials --list
# Shows set/unset status per provider
Credential Precedence
Each connector reads credentials via lib.credentials.get_credential(provider, key). The precedence order:
- Environment variable, e.g., KLING_ACCESS_KEY, GEMINI_API_KEY
- Credentials file: ~/.montaj/credentials.json
- Fail, with install instructions
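The precedence order can be sketched as a small lookup function. This illustrates the described behaviour and is not the real lib.credentials source: the `env` and `path` parameters, the `CredentialError` type, and the wording of the error message are assumptions added so the example is self-contained and testable. The environment variable name is derived as PROVIDER_KEY in upper case, which matches the names listed on this page.

```python
import json
import os
from pathlib import Path


class CredentialError(Exception):
    """Raised when a credential is found in neither the environment nor the file."""


def get_credential(provider, key, env=None, path=None):
    """Look up a credential: environment variable first, then the credentials file."""
    env = os.environ if env is None else env
    path = Path.home() / ".montaj" / "credentials.json" if path is None else Path(path)

    # 1. Environment variable, e.g. ("kling", "access_key") -> KLING_ACCESS_KEY
    value = env.get(f"{provider}_{key}".upper())
    if value:
        return value

    # 2. Credentials file, keyed by provider and then by key name
    if path.exists():
        data = json.loads(path.read_text(encoding="utf-8"))
        value = data.get(provider, {}).get(key)
        if value:
            return value

    # 3. Fail with a pointer to the install command
    raise CredentialError(
        f"missing {provider}.{key}; run: montaj install credentials "
        f"--provider {provider} --key {key}"
    )
```

Because the environment is consulted first, a CI job can override a developer's stored key without touching the file.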
Supported Providers
| Provider | Keys | Environment Variables |
|---|---|---|
| kling | access_key, secret_key | KLING_ACCESS_KEY, KLING_SECRET_KEY |
| gemini | api_key | GEMINI_API_KEY |
| openai | api_key | OPENAI_API_KEY |
Credentials File Format
{
"kling": {
"access_key": "...",
"secret_key": "..."
},
"gemini": {
"api_key": "..."
},
"openai": {
"api_key": "..."
}
}
The file is stored at ~/.montaj/credentials.json with 0600 permissions (owner read/write only).
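As an illustration of what writing this file with owner-only permissions involves, consider the sketch below. The `save_credential` helper is hypothetical and is not the code behind `montaj install credentials`; it assumes a POSIX system, where file modes such as 0600 are meaningful.

```python
import json
import os
from pathlib import Path


def save_credential(provider, key, value, path):
    """Merge one credential into the JSON file and keep the file at mode 0600."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)

    # Preserve credentials that are already stored for other providers or keys.
    data = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    data.setdefault(provider, {})[key] = value

    # Create the file with 0600 from the start, so the secret is never
    # briefly readable by other users.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(data, handle, indent=2)

    # The mode passed to os.open only applies on creation; tighten existing files too.
    os.chmod(path, 0o600)
    return data
```

Calling it twice for kling (once per key) yields the nested structure shown in the file format above.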
Using in CI/CD
For CI environments, use environment variables instead of the credentials file:
export KLING_ACCESS_KEY=...
export KLING_SECRET_KEY=...
export GEMINI_API_KEY=...
export OPENAI_API_KEY=...
Environment variables take precedence over the credentials file, so this works without any additional configuration.