API Reference

Complete reference for all 19 tools. Use these via the AI assistant, CLI, or workflow templates.

composition

Add, update, and delete scenes in your video composition

Add Scene

CLI, Agent
addScene

Add a new code scene containing React/Remotion component code to the composition. All visual scenes are of type code_scene.

Parameters

title (string, required)

Scene title for identification in timeline

code (string, required)

Complete React/Remotion component code

duration (number, required)

Scene duration in seconds

start (number, optional)

Start time in seconds (defaults to 0)

Request

{
  "title": "Intro",
  "code": "export default function Intro() {\n  return (\n    <AbsoluteFill style={{ backgroundColor: '#000' }}>\n      <h1>Welcome</h1>\n    </AbsoluteFill>\n  );\n}",
  "duration": 5
}

Response

{
  "success": true,
  "sceneId": "scene_abc123"
}
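
As a rough illustration, the request body above could be assembled and checked on the client before it is sent. This is only a sketch in TypeScript: the `AddSceneRequest` type and the `buildAddSceneRequest` helper are hypothetical names, not part of the documented API, and the validation rules (non-empty title and code, positive duration, non-negative start) are assumptions inferred from the parameter list.

```typescript
// Hypothetical request shape, mirroring the addScene parameter list.
interface AddSceneRequest {
  title: string;
  code: string;
  duration: number; // seconds
  start?: number; // seconds; the reference says this defaults to 0
}

// Hypothetical helper: checks the required fields before the request is sent.
// The specific checks are assumptions, not documented server-side rules.
function buildAddSceneRequest(
  title: string,
  code: string,
  duration: number,
  start?: number,
): AddSceneRequest {
  if (title.trim() === "") throw new Error("title is required");
  if (code.trim() === "") throw new Error("code is required");
  if (!(duration > 0)) throw new Error("duration must be a positive number of seconds");
  if (start !== undefined && start < 0) throw new Error("start must not be negative");
  const request: AddSceneRequest = { title, code, duration };
  if (start !== undefined) request.start = start; // omit start to use the default
  return request;
}

// Same component source as the request example above, built line by line
// so the newlines do not have to be escaped by hand.
const introCode = [
  "export default function Intro() {",
  "  return (",
  "    <AbsoluteFill style={{ backgroundColor: '#000' }}>",
  "      <h1>Welcome</h1>",
  "    </AbsoluteFill>",
  "  );",
  "}",
].join("\n");

const addSceneBody = JSON.stringify(buildAddSceneRequest("Intro", introCode, 5));
```

Joining the component source from an array keeps the code readable in the caller while `JSON.stringify` takes care of the escaping.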

Delete Scene

CLI, Agent
deleteScene

Delete a specific scene from the composition by its ID

Parameters

sceneId (string, required)

ID of the scene to delete

Request

{
  "sceneId": "scene_abc123"
}

Response

{
  "success": true
}

Delete All Scenes

CLI, Agent
deleteAllScenes

Delete ALL scenes from the composition to start fresh

No parameters required.

Request

{}

Response

{
  "success": true,
  "deletedCount": 5
}

Set Metadata

CLI, Agent
setMetadata

Update the composition's metadata, such as title, description, duration, frame rate, and dimensions

Parameters

title (string, optional)

Video title

description (string, optional)

Video description

script (string, optional)

Video script or narrative

duration (number, optional)

Total duration in seconds

fps (number, optional)

Frames per second

width (number, optional)

Width in pixels

height (number, optional)

Height in pixels

Request

{
  "title": "Product Demo",
  "fps": 30,
  "width": 1920,
  "height": 1080
}

Response

{
  "success": true
}
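
Durations in this reference are given in seconds, while Remotion itself measures composition length in frames, so `duration` and `fps` together imply a frame count. The sketch below shows that arithmetic. The `secondsToFrames` helper is a hypothetical name, and rounding to the nearest frame is an assumption, since the reference does not say how fractional frames are handled.

```typescript
// Hypothetical helper: converts a duration in seconds to a whole number of frames.
// Rounding to the nearest frame is an assumption, not documented behavior.
function secondsToFrames(durationSeconds: number, fps: number): number {
  if (!(durationSeconds > 0)) throw new Error("duration must be positive");
  if (!(fps > 0)) throw new Error("fps must be positive");
  return Math.round(durationSeconds * fps);
}

// A 45-second composition at 30 frames per second spans 1350 frames.
const totalFrames = secondsToFrames(45, 30);
```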

Update Scene

CLI, Agent
updateScene

Update an existing scene's code, either by providing new code directly or by giving instructions for the AI to apply

Parameters

sceneId (string, required)

ID of the scene to update

code (string, optional)

Updated React/Remotion component code

updateInstructions (string, optional)

Instructions for AI to modify the scene

title (string, optional)

New title for the scene

Request

{
  "sceneId": "scene_abc123",
  "updateInstructions": "Make text larger"
}

Response

{
  "success": true
}
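
Because `code`, `updateInstructions`, and `title` are all optional, a caller may want to guard against sending an empty or ambiguous update. The following is a minimal sketch under stated assumptions: `UpdateSceneRequest`, `SceneChanges`, and `buildUpdateSceneRequest` are hypothetical names, and both checks (at least one change, and not `code` together with `updateInstructions`) are guesses at sensible client-side rules rather than documented constraints.

```typescript
// Hypothetical request shape, mirroring the updateScene parameter list.
interface UpdateSceneRequest {
  sceneId: string;
  code?: string;
  updateInstructions?: string;
  title?: string;
}

type SceneChanges = Omit<UpdateSceneRequest, "sceneId">;

// Hypothetical helper. Both checks below are assumptions, not documented rules.
function buildUpdateSceneRequest(sceneId: string, changes: SceneChanges): UpdateSceneRequest {
  const { code, updateInstructions, title } = changes;
  // Assumption: an update with no changes is a caller mistake.
  if (code === undefined && updateInstructions === undefined && title === undefined) {
    throw new Error("nothing to update");
  }
  // Assumption: direct code and AI instructions are alternatives.
  if (code !== undefined && updateInstructions !== undefined) {
    throw new Error("pass either code or updateInstructions, not both");
  }
  return { sceneId, ...changes };
}

// Matches the request example above.
const updateBody = buildUpdateSceneRequest("scene_abc123", {
  updateInstructions: "Make text larger",
});
```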

Update Scene Timing

Agent
updateSceneTiming

Update a scene's duration or start time with automatic adjustment of subsequent scenes

Parameters

sceneId (string, required)

ID of the scene

duration (number, optional)

New duration in seconds

start (number, optional)

New start time in seconds

adjustSubsequentScenes (boolean, optional)

Auto-adjust subsequent scenes

Request

{
  "sceneId": "scene_abc123",
  "duration": 8
}

Response

{
  "success": true
}
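
The reference does not spell out how subsequent scenes are adjusted. One plausible reading is that scenes starting at or after the old end of the edited scene shift by the change in duration. The sketch below models that reading locally; the `TimedScene` type, the `applyDurationChange` helper, and the second scene ID are hypothetical, and the actual server behavior may differ.

```typescript
// Hypothetical local model of a scene on the timeline (times in seconds).
interface TimedScene {
  id: string;
  start: number;
  duration: number;
}

// Assumed semantics: later scenes move by the same amount the edited scene
// grew or shrank. This is an illustration, not the documented algorithm.
function applyDurationChange(
  scenes: TimedScene[],
  sceneId: string,
  newDuration: number,
): TimedScene[] {
  const target = scenes.find((s) => s.id === sceneId);
  if (target === undefined) throw new Error(`unknown scene: ${sceneId}`);
  const oldEnd = target.start + target.duration;
  const delta = newDuration - target.duration;
  return scenes.map((s) => {
    if (s.id === sceneId) return { ...s, duration: newDuration };
    if (s.start >= oldEnd) return { ...s, start: s.start + delta };
    return s; // scenes that start before the old end are left alone
  });
}

const timeline: TimedScene[] = [
  { id: "scene_abc123", start: 0, duration: 5 },
  { id: "scene_def456", start: 5, duration: 10 },
];

// Lengthening the first scene from 5 to 8 seconds moves the second scene's start from 5 to 8.
const adjusted = applyDurationChange(timeline, "scene_abc123", 8);
```

The function returns new objects instead of mutating the input, so the original timeline can still be compared against the adjusted one.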

query

Read and inspect the current state of your composition

Assess Composition

CLI, Agent
assessComposition

Get a lightweight overview of the composition showing all scenes with their metadata and status (token-efficient)

No parameters required.

Request

{}

Response

{
  "title": "Demo",
  "duration": 45,
  "sceneCount": 6
}

Read Composition

CLI, Agent
readComposition

Read the current composition to see its metadata and planned scenes

No parameters required.

Request

{}

Response

{
  "id": "comp_xyz",
  "title": "Demo",
  "scenes": []
}

generation

Generate content using AI: speech, visuals, and transcription

Generate Speech

CLI, Agent
generateSpeech

Generate speech audio from text and add it as an audio scene. Requires a model ID; call listSpeechModels first to get the available options.

Parameters

text (string, required)

The text to convert to speech

model (string, required)

TTS model ID

sceneId (string, optional)

Existing audio scene ID to update

start (number, optional)

Start time in seconds

duration (number, optional)

Duration override

voice (string, optional)

Voice ID

Request

{
  "text": "Welcome to our demo.",
  "model": "eleven_labs/rachel"
}

Response

{
  "success": true,
  "sceneId": "audio_123",
  "duration": 3.5
}
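
Since a model ID is required, a client would typically take it from a listSpeechModels response instead of hard-coding it. The sketch below assumes the response has the `models` array of `id` and `name` pairs shown in the List Speech Models example. The type names and the `buildSpeechRequest` helper are hypothetical, and falling back to the first listed model is an assumption about what a reasonable default might be.

```typescript
// Shapes assumed from the listSpeechModels response example in this reference.
interface SpeechModel {
  id: string;
  name: string;
}
interface ListSpeechModelsResponse {
  models: SpeechModel[];
}

// Hypothetical request shape with only the fields this sketch uses.
interface GenerateSpeechRequest {
  text: string;
  model: string;
}

// Hypothetical helper: uses the preferred model if it is listed,
// otherwise falls back to the first available model (an assumption).
function buildSpeechRequest(
  available: ListSpeechModelsResponse,
  text: string,
  preferredModelId?: string,
): GenerateSpeechRequest {
  const model =
    available.models.find((m) => m.id === preferredModelId) ?? available.models[0];
  if (model === undefined) throw new Error("no speech models available");
  return { text, model: model.id };
}

const listed: ListSpeechModelsResponse = {
  models: [{ id: "eleven_labs/rachel", name: "Rachel" }],
};
const speechRequest = buildSpeechRequest(listed, "Welcome to our demo.");
```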

Generate Scene Content

Agent
generateSceneContent

Generate the content for a scene based on its type and metadata. For a code_scene, this generates React/Remotion code. The scene is automatically marked as 'final' when generation completes.

Parameters

sceneId (string, required)

ID of the scene to generate content for

additionalInstructions (string, optional)

Additional instructions

Request

{
  "sceneId": "scene_abc123"
}

Response

{
  "success": true
}

Generate Video

CLI, Agent
generateVideo

Generate a video from a text prompt and add it as a video scene. Requires a model ID; call listVideoModels first to get the available options.

Parameters

prompt (string, required)

Text prompt describing the video to generate

model (string, required)

Video generation model ID (use listVideoModels to see options)

sceneId (string, optional)

Existing video scene ID to update

start (number, optional)

Start time in seconds

modelInputs (object, optional)

Model-specific options like num_frames, fps, aspect_ratio

Request

{
  "prompt": "Aerial drone shot of mountains at sunset",
  "model": "minimax/video-01"
}

Response

{
  "success": true,
  "sceneId": "video_123",
  "duration": 5
}

Generate Image

CLI, Agent
generateImage

Generate an image from a text prompt and add it as an image scene. Requires a model ID; call listImageModels first to get the available options.

Parameters

prompt (string, required)

Text prompt describing the image to generate

model (string, required)

Image generation model ID (use listImageModels to see options)

sceneId (string, optional)

Existing scene ID to update

start (number, optional)

Start time in seconds

duration (number, optional)

Duration of image scene in seconds (default: 5)

modelInputs (object, optional)

Model-specific options like width, height, num_inference_steps

Request

{
  "prompt": "Minimalist logo on dark background",
  "model": "black-forest-labs/flux-1.1-pro"
}

Response

{
  "success": true,
  "sceneId": "img_123",
  "duration": 5
}
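
The request example above omits `modelInputs`. The sketch below shows how a request with model-specific options might be typed and serialized. The `GenerateImageRequest` type is a hypothetical name, the option keys come from the parameter description, and the values (1024 by 1024 pixels, 28 steps) are purely illustrative. Whether a given model accepts these keys or values is an assumption to verify for that model.

```typescript
// Hypothetical request shape, mirroring the generateImage parameter list.
interface GenerateImageRequest {
  prompt: string;
  model: string;
  sceneId?: string;
  start?: number; // seconds
  duration?: number; // seconds; the reference gives 5 as the default
  modelInputs?: Record<string, unknown>; // model-specific, so left loosely typed
}

const imageRequest: GenerateImageRequest = {
  prompt: "Minimalist logo on dark background",
  model: "black-forest-labs/flux-1.1-pro",
  duration: 5,
  // Illustrative values only; supported keys depend on the chosen model.
  modelInputs: { width: 1024, height: 1024, num_inference_steps: 28 },
};

const imageBody = JSON.stringify(imageRequest, null, 2);
```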

Transcribe Audio

CLI, Agent
transcribeAudio

Transcribe audio to text with word-level timestamps

Parameters

audioUrl (string, required)

URL of the audio file

model (string, optional)

STT model (defaults to openai/whisper)

Request

{
  "audioUrl": "https://example.com/audio.mp3"
}

Response

{
  "text": "Hello world",
  "duration": 2.1
}

planning

Plan scenes and execute workflow templates

Plan Scene

Agent
planScene

Plan a scene scaffold. For audio_scene scenes, this creates a placeholder that generateSpeech will fill in; use the description field for the text to speak. Returns a sceneId that should be passed to generateSpeech.

Parameters

type (enum, required)

code_scene, video_file, audio_only, or transition

start (number, required)

Start time in seconds

duration (number, required)

Duration in seconds

title (string, required)

Scene title

description (string, required)

Detailed brief for the scene

Request

{
  "type": "code_scene",
  "start": 0,
  "duration": 5,
  "title": "Intro",
  "description": "Animated intro"
}

Response

{
  "success": true,
  "sceneId": "scene_abc123"
}
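
The hand-off from planScene to generateSpeech described above can be sketched as a small pure function: the planned scene's `description` becomes the speech `text`, and the returned `sceneId` is passed along. All type and function names here are hypothetical. The sketch uses `audio_only` from the parameter list as the audio type, although the description refers to audio_scene, so that choice is an assumption.

```typescript
// Hypothetical shapes mirroring the planScene parameters and response.
interface PlanSceneRequest {
  type: "code_scene" | "video_file" | "audio_only" | "transition";
  start: number; // seconds
  duration: number; // seconds
  title: string;
  description: string;
}
interface PlanSceneResponse {
  success: boolean;
  sceneId: string;
}

// Subset of the generateSpeech parameters used for the hand-off.
interface SpeechForPlannedScene {
  text: string;
  model: string;
  sceneId: string;
}

// Hypothetical helper: builds the generateSpeech request for a planned audio scene.
function speechRequestForPlan(
  plan: PlanSceneRequest,
  response: PlanSceneResponse,
  model: string,
): SpeechForPlannedScene {
  // Assumption: audio_only is the type that generateSpeech fills in.
  if (plan.type !== "audio_only") throw new Error("only audio scenes take generated speech");
  if (!response.success) throw new Error("planScene did not succeed");
  return { text: plan.description, model, sceneId: response.sceneId };
}

const narrationPlan: PlanSceneRequest = {
  type: "audio_only",
  start: 0,
  duration: 4,
  title: "Narration",
  description: "Welcome to our demo.",
};
const planned: PlanSceneResponse = { success: true, sceneId: "scene_abc123" };
const narrationSpeech = speechRequestForPlan(narrationPlan, planned, "eleven_labs/rachel");
```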

Run Template

Agent
runTemplate

Execute a workflow template. Only available when organization context is provided.

Parameters

templateId (string, required)

ID of the template to execute

inputs (object, optional)

Template workflow inputs

Request

{
  "templateId": "template_123",
  "inputs": {
    "topic": "AI"
  }
}

Response

{
  "success": true,
  "executionId": "exec_xyz"
}

utility

Helper tools for discovery and configuration

List Speech Models

CLI, Agent
listSpeechModels

List available text-to-speech models and their voices. Call this before generating speech to see which models and voices are available.

No parameters required.

Request

{}

Response

{
  "models": [
    {
      "id": "eleven_labs/rachel",
      "name": "Rachel"
    }
  ]
}

List Video Models

CLI, Agent
listVideoModels

List available video generation models. Call this before generating video to see which models are available and their capabilities.

No parameters required.

Request

{}

Response

{
  "models": [
    {
      "id": "minimax/video-01",
      "name": "Video-01"
    }
  ]
}

List Image Models

CLI, Agent
listImageModels

List available image generation models. Call this before generating images to see which models are available and their capabilities.

No parameters required.

Request

{}

Response

{
  "models": [
    {
      "id": "black-forest-labs/flux-1.1-pro",
      "name": "FLUX 1.1 Pro"
    }
  ]
}

Upload File

CLI, Agent
uploadFile

Upload a local file payload to Convex storage for use in composition scenes

No parameters required.