Connectors

External API integrations — Kling, Gemini, OpenAI — and credential management.

Connectors are Python modules in connectors/ that wrap external vendor APIs. They turn a vendor's SDK or HTTP endpoints into clean Python functions that Montaj steps can call.


Overview

The Layering Rule

Connectors are organized by vendor. Steps are organized by use case.

| Layer | Organized by | Example |
| --- | --- | --- |
| connectors/<vendor>.py | Vendor (one file per API key) | connectors/gemini.py handles video analysis + image gen |
| steps/<verb>_<noun>.py | Use case (one file per action) | steps/generate_image.py dispatches to gemini or openai |

A vendor like Gemini unlocks multiple use cases (video analysis, image generation) through one API key and one SDK. The connector owns auth, request shape, polling, and response normalization. The step owns the agent-facing interface.

Architecture

cli/commands/<step>.py     # thin argparse wrapper (agent-facing)
serve/server.py            # /api/steps/{name} dispatch (agent-facing)
mcp/server.js              # introspects CLI parsers (agent-facing)
        |
        v
steps/<verb>_<noun>.py     # argparse + fail() + stdout (one per use case)
        |
        v
connectors/<vendor>.py     # SDK/HTTP calls (one per vendor)
        |
        v
lib/credentials.py         # ~/.montaj/credentials.json + env override

Key Rules

  • Connectors are never agent-callable directly — workflows, CLI, HTTP API, and MCP all dispatch to steps
  • No vendor SDK at import time — imports are lazy, inside functions
  • Credentials only from lib.credentials — never read env vars directly
  • Errors via ConnectorError — step code catches and translates to fail()
  • Long operations block — connectors return when done, they don't return job IDs
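
The rules above can be sketched roughly as follows. This is an illustration only, not Montaj's actual code: `generate`, `run_step`, the `submit`/`check` callables, the shape of `fail()`, and the stubbed `get_credential` are all assumptions made so the example runs on its own.

```python
import json
import sys
import time


class ConnectorError(Exception):
    """Raised by a connector; the calling step turns it into fail()."""


def get_credential(provider, key):
    # Stand-in for lib.credentials.get_credential so the sketch runs on its own.
    return "dummy-key"


def generate(prompt, submit, check, poll_interval=0.01):
    """Hypothetical connector function that blocks until the job is finished."""
    # A real connector would import its vendor SDK here, inside the function,
    # so that importing the connector module never requires the SDK.
    api_key = get_credential("examplevendor", "api_key")
    job_id = submit(prompt, api_key)
    while True:
        status, result = check(job_id, api_key)
        if status == "succeeded":
            return result  # the caller never sees the job ID
        if status == "failed":
            raise ConnectorError(f"generation failed: {result}")
        time.sleep(poll_interval)


def fail(code, message):
    # Assumed shape of the step-level helper: report the error and exit non-zero.
    print(json.dumps({"error": code, "message": message}), file=sys.stderr)
    sys.exit(1)


def run_step(prompt, submit, check):
    """Hypothetical step body: call the connector and translate its errors."""
    try:
        return generate(prompt, submit, check)
    except ConnectorError as exc:
        fail("connector_error", str(exc))
```

The polling loop here has no timeout, which keeps the sketch short; a production connector would likely want a deadline so a stuck job cannot block a step forever.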

Installing

montaj install connectors        # installs pyjwt, requests, google-genai, openai
montaj install credentials       # interactive setup for API keys

Current Connectors

| Vendor | Functions | Steps | Model(s) | Credentials |
| --- | --- | --- | --- | --- |
| kling.py | generate | kling_generate | kling-v3-omni | kling.access_key, kling.secret_key |
| gemini.py | analyze_media, generate_image | analyze_media, generate_image | gemini-2.5-flash, gemini-3-pro-image-preview | gemini.api_key |
| openai.py | generate_image | generate_image | gpt-image-1 | openai.api_key |

Multi-Provider Steps

A single step can dispatch to multiple connectors. For example, steps/generate_image.py dispatches to either connectors/gemini.py or connectors/openai.py based on a --provider flag:

montaj generate-image --prompt "portrait" --provider gemini --out portrait.png
montaj generate-image --prompt "portrait" --provider openai --out portrait.png

Same step name, same CLI interface, different backend.
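
One way such a dispatch could look is a small provider table keyed by the --provider value. The sketch below is an assumption about the shape, not the real steps/generate_image.py: the two connector functions are local stand-ins, and the default of gemini follows the "Gemini (default)" example elsewhere in these docs.

```python
import argparse


def gemini_generate_image(prompt, out):
    # Stand-in for the generate_image function in connectors/gemini.py.
    return f"gemini:{prompt}->{out}"


def openai_generate_image(prompt, out):
    # Stand-in for the generate_image function in connectors/openai.py.
    return f"openai:{prompt}->{out}"


# One entry per backend; adding a provider means adding one line here.
PROVIDERS = {
    "gemini": gemini_generate_image,
    "openai": openai_generate_image,
}


def build_parser():
    parser = argparse.ArgumentParser(prog="generate-image")
    parser.add_argument("--prompt", required=True)
    parser.add_argument("--out", required=True)
    parser.add_argument("--provider", choices=sorted(PROVIDERS), default="gemini")
    return parser


def main(argv=None):
    args = build_parser().parse_args(argv)
    # Look up the connector function for the chosen provider and call it.
    return PROVIDERS[args.provider](args.prompt, args.out)
```

Keeping the mapping in one dictionary means the CLI surface stays identical for every backend, which is the point of organizing steps by use case.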


Kling v3

Kling v3 Omni is the video generation connector. It supports text-to-video, image-to-video (first frame, last frame), and reference-guided generation.

Setup

montaj install connectors
montaj install credentials --provider kling --key access_key --value YOUR_ACCESS_KEY
montaj install credentials --provider kling --key secret_key --value YOUR_SECRET_KEY

Or interactively:

montaj install credentials   # select kling, enter keys

Credentials are stored in ~/.montaj/credentials.json (0600 permissions).

Credentials

| Key | Description |
| --- | --- |
| kling.access_key | Kling API access key |
| kling.secret_key | Kling API secret key |

Get your keys at app.klingai.com.

Usage

Text-to-Video

montaj kling-generate \
  --prompt "a calico cat walking through a sunlit kitchen, cinematic" \
  --out /tmp/cat.mp4

Image-to-Video (First Frame)

montaj kling-generate \
  --prompt "slow zoom in" \
  --first-frame frame.png \
  --out /tmp/zoom.mp4

Image-to-Video (First + Last Frame)

montaj kling-generate \
  --prompt "character walks left" \
  --first-frame start.png \
  --last-frame end.png \
  --out /tmp/walk.mp4

Style Reference

montaj kling-generate \
  --prompt "same style" \
  --ref-image style1.png \
  --ref-image style2.png \
  --out /tmp/styled.mp4

Pro Mode

montaj kling-generate \
  --prompt "cinematic scene" \
  --out /tmp/pro.mp4 \
  --mode pro \
  --duration 10 \
  --aspect-ratio 9:16

Parameters

| Param | Default | Description |
| --- | --- | --- |
| --prompt | required | Generation prompt |
| --out | required | Output file path |
| --first-frame <img> | | Starting image for image-to-video |
| --last-frame <img> | | Ending image (requires --first-frame) |
| --ref-image <img> | | Style reference image (repeatable, max 3) |
| --duration <3-15> | | Video duration in seconds |
| --negative-prompt | | What to avoid in generation |
| --sound <on\|off> | | Enable/disable sound |
| --aspect-ratio | | 16:9, 9:16, 1:1 |
| --mode <std\|pro> | std | Standard (cheaper/faster) or Pro (higher quality) |
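
The constraints in the table (--last-frame needs --first-frame, at most three --ref-image values, duration between 3 and 15) could be enforced with plain argparse along these lines. This is a hedged sketch covering a subset of the flags; `parse_kling_args` is a hypothetical name and the real step's parser may differ.

```python
import argparse


def parse_kling_args(argv):
    """Parse and validate a subset of the kling-generate flags (illustrative)."""
    parser = argparse.ArgumentParser(prog="kling-generate")
    parser.add_argument("--prompt", required=True)
    parser.add_argument("--out", required=True)
    parser.add_argument("--first-frame")
    parser.add_argument("--last-frame")
    parser.add_argument("--ref-image", action="append", default=[])
    parser.add_argument("--duration", type=int)
    parser.add_argument("--mode", choices=["std", "pro"], default="std")
    args = parser.parse_args(argv)

    # Cross-flag rules that argparse cannot express declaratively.
    if args.last_frame and not args.first_frame:
        parser.error("--last-frame requires --first-frame")
    if len(args.ref_image) > 3:
        parser.error("--ref-image can be given at most 3 times")
    if args.duration is not None and not 3 <= args.duration <= 15:
        parser.error("--duration must be between 3 and 15 seconds")
    return args
```

`parser.error` prints a usage message and exits with status 2, so invalid combinations are rejected before any API call is made.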

Model

Uses kling-v3-omni (hardcoded). The connector handles JWT authentication, request construction, polling for completion, and downloading the result.
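
For readers curious what the JWT step might involve, here is a minimal sketch of an HS256 token built from the access key and secret key. The claim set (iss, exp, nbf), the 30-minute lifetime, and the function name `make_token` are assumptions for illustration, not a description of the connector's code. The sketch uses only the standard library so it runs without dependencies; the real connector presumably relies on pyjwt, which `montaj install connectors` installs.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    # JWTs use URL-safe base64 without trailing padding.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_token(access_key, secret_key, ttl=1800, now=None):
    """Build an HS256-signed JWT (assumed claims: iss, exp, nbf)."""
    now = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"iss": access_key, "exp": now + ttl, "nbf": now - 5}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    signature = hmac.new(
        secret_key.encode("utf-8"),
        signing_input.encode("ascii"),
        hashlib.sha256,
    ).digest()
    return f"{signing_input}.{_b64url(signature)}"
```

A token like this would typically be sent as a bearer token in the Authorization header and regenerated once it expires.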


Gemini

The Gemini connector wraps Google's Gemini API for two use cases: media analysis (video, audio, image) and image generation.

Setup

montaj install connectors
montaj install credentials --provider gemini --key api_key --value YOUR_API_KEY

Or interactively:

montaj install credentials   # select gemini, enter key

Credentials

| Key | Description |
| --- | --- |
| gemini.api_key | Google Gemini API key |

Get your key at ai.google.dev.

Media Analysis (analyze_media)

Analyze any media file — video, audio, or image — with a natural language prompt.

montaj analyze-media clip.mp4 --prompt "Describe the scene in 2 sentences."

montaj analyze-media song.mp3 --prompt "Transcribe with timestamps."

montaj analyze-media photo.jpg --prompt "Return JSON: {subject, mood, dominant_colors}" --json-output

montaj analyze-media clip.mp4 --prompt "..." --model gemini-2.5-pro

| Param | Default | Description |
| --- | --- | --- |
| <input> | required | Media file (video, audio, image) |
| --prompt | required | Analysis prompt |
| --model | gemini-2.5-flash | Model override |
| --json-output | | Request structured JSON from the model |
| --out | | Write output to file |

Note: Images under approximately 18 MB take a fast inline path — no Files API round-trip needed.
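
The routing decision described in the note might look something like the sketch below. The exact cutoff, the function name `choose_upload_path`, and the assumption that everything other than a small image goes through the Files API are all illustrative guesses; the docs only say "approximately 18 MB".

```python
import mimetypes
import os

# Assumed cutoff; the documentation only says "approximately 18 MB".
INLINE_LIMIT_BYTES = 18 * 1024 * 1024


def choose_upload_path(path, size=None):
    """Return "inline" for small images and "files_api" for everything else."""
    mime, _ = mimetypes.guess_type(path)
    # Allow the size to be passed in so the decision can be checked without a file.
    size = os.path.getsize(path) if size is None else size
    is_image = mime is not None and mime.startswith("image/")
    if is_image and size < INLINE_LIMIT_BYTES:
        return "inline"
    return "files_api"
```

For example, `choose_upload_path("photo.jpg", size=2_000_000)` picks the inline path, while a video of the same size falls through to the upload path.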

Image Generation (generate_image)

Generate images from text prompts, optionally conditioned on reference images.

montaj generate-image --prompt "portrait, studio lighting" --out portrait.png

montaj generate-image --prompt "same character, profile view" --ref-image portrait.png --out profile.png

montaj generate-image --prompt "..." --provider gemini --aspect-ratio 9:16 --out tall.png

| Param | Default | Description |
| --- | --- | --- |
| --prompt | required | Generation prompt |
| --out | required | Output file path |
| --ref-image | | Reference image (repeatable) |
| --aspect-ratio | | Aspect ratio (Gemini-specific) |
| --model | gemini-3-pro-image-preview | Model override |

Models

| Use Case | Default Model |
| --- | --- |
| Media analysis | gemini-2.5-flash |
| Image generation | gemini-3-pro-image-preview |

OpenAI

The OpenAI connector wraps OpenAI's image generation API.

Setup

montaj install connectors
montaj install credentials --provider openai --key api_key --value YOUR_API_KEY

Or interactively:

montaj install credentials   # select openai, enter key

Credentials

| Key | Description |
| --- | --- |
| openai.api_key | OpenAI API key |

Get your key at platform.openai.com.

Image Generation (generate_image)

Generate images from text prompts, optionally with reference images.

montaj generate-image --prompt "red apple on white table" --provider openai --out apple.png

montaj generate-image --prompt "same scene, sunset" --provider openai --ref-image scene.png --out sunset.png

| Param | Default | Description |
| --- | --- | --- |
| --prompt | required | Generation prompt |
| --out | required | Output file path |
| --provider | | Must be openai to use this connector |
| --ref-image | | Reference image (repeatable) |
| --size <WxH> | | Image dimensions |
| --model | gpt-image-1 | Model override |

Gemini vs. OpenAI

The generate_image step supports both Gemini and OpenAI as providers. Choose based on:

  • Gemini: supports the --aspect-ratio flag, useful when you need a specific aspect ratio
  • OpenAI: supports the --size WxH flag for exact pixel dimensions, and renders in a different artistic style

# Gemini (default)
montaj generate-image --prompt "portrait" --out portrait.png

# OpenAI
montaj generate-image --prompt "portrait" --provider openai --out portrait.png

Credentials

API credentials for external connectors live in ~/.montaj/credentials.json with 0600 permissions.

Installation Methods

Interactive

montaj install credentials
# Prompts for provider selection and key entry

Scripted (CI/Automation)

montaj install credentials --provider kling --key access_key --value YOUR_KEY
montaj install credentials --provider kling --key secret_key --value YOUR_KEY
montaj install credentials --provider gemini --key api_key --value YOUR_KEY
montaj install credentials --provider openai --key api_key --value YOUR_KEY

Check Status

montaj install credentials --list
# Shows set/unset status per provider

Credential Precedence

Each connector reads credentials via lib.credentials.get_credential(provider, key). The precedence order:

  1. Environment variable (e.g., KLING_ACCESS_KEY, GEMINI_API_KEY)
  2. Credentials file (~/.montaj/credentials.json)
  3. Failure, with an error that includes install instructions
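
The precedence order can be illustrated with a small stand-alone lookup. This is a sketch of the described behaviour, not the code in lib/credentials.py: the `path` and `environ` parameters, the `CredentialError` class, and the rule that the variable name is the upper-cased `PROVIDER_KEY` (inferred from the provider table) are assumptions.

```python
import json
import os
from pathlib import Path


class CredentialError(Exception):
    """Raised when a credential is found in neither the environment nor the file."""


def get_credential(provider, key, path=None, environ=None):
    environ = os.environ if environ is None else environ
    if path is None:
        path = Path.home() / ".montaj" / "credentials.json"
    path = Path(path)

    # 1. Environment variable, e.g. ("kling", "access_key") -> KLING_ACCESS_KEY.
    env_name = f"{provider}_{key}".upper()
    if environ.get(env_name):
        return environ[env_name]

    # 2. Credentials file, laid out as {"provider": {"key": "value"}}.
    if path.exists():
        data = json.loads(path.read_text())
        value = data.get(provider, {}).get(key)
        if value:
            return value

    # 3. Fail with a hint on how to fix it.
    raise CredentialError(
        f"Missing {provider}.{key}: set {env_name} or run 'montaj install credentials'"
    )
```

Passing `environ` and `path` explicitly is only there to make the lookup easy to exercise in isolation; callers would normally rely on the defaults.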

Supported Providers

| Provider | Keys | Environment Variables |
| --- | --- | --- |
| kling | access_key, secret_key | KLING_ACCESS_KEY, KLING_SECRET_KEY |
| gemini | api_key | GEMINI_API_KEY |
| openai | api_key | OPENAI_API_KEY |

Credentials File Format

{
  "kling": {
    "access_key": "...",
    "secret_key": "..."
  },
  "gemini": {
    "api_key": "..."
  },
  "openai": {
    "api_key": "..."
  }
}

The file is stored at ~/.montaj/credentials.json with 0600 permissions (owner read/write only).
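
As an illustration of how a file can be created with owner-only permissions from the start, the sketch below writes the JSON through `os.open` with mode 0o600. The helper names `save_credentials` and `is_private` are hypothetical, and this is not necessarily how `montaj install credentials` writes the file; it assumes a POSIX system.

```python
import json
import os
import stat
from pathlib import Path


def save_credentials(data, path):
    """Write credentials as JSON, readable and writable by the owner only."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    # Creating the file with mode 0o600 avoids a window where it is world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        json.dump(data, handle, indent=2)
    # Tighten permissions in case the file already existed with a looser mode.
    os.chmod(path, 0o600)


def is_private(path):
    """Return True if the permission bits are exactly 0600."""
    return stat.S_IMODE(os.stat(path).st_mode) == 0o600
```

Checking the mode before reading, as `is_private` does, is one way a tool could warn when the file has been made readable by other users.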

Using in CI/CD

For CI environments, use environment variables instead of the credentials file:

export KLING_ACCESS_KEY=...
export KLING_SECRET_KEY=...
export GEMINI_API_KEY=...
export OPENAI_API_KEY=...

Environment variables take precedence over the credentials file, so this works without any additional configuration.