Skills

videoclaw ships a curated library of skills — reusable, agent-invokable workflows that either produce a video (the video category) or orchestrate the work around it (the workflow category). This doc is the comprehensive per-skill reference. For the machine-readable index see skills/catalog.json; for the full how-to of any individual skill, follow the linked SKILL.md for that skill.

Ecosystem map

Figure: Skills ecosystem map showing video-framework and brand-presenter as canonical entry points with specialist children, plus a grid of workflow skills.

How skills relate

The library is not a flat bag of equally-preferred entry points. It uses a small hierarchy:

Role	Examples	When you reach for it
Canonical entry	`video-framework`, `brand-presenter`	Generic or unspecified video request — the entry skill routes into a specialist.
Specialist	`video-storyboard`, `video-clone-ad`, `movie-director`, `video-post`, ...	The mode is clearly known up front (e.g. "clone this ad", "storyboard these 6 scenes").
Compatibility alias	`davendra-presenter`, `nex-presenter`, `bunty`	Personal/brand presets that exist for discoverability — they all delegate into `brand-presenter`.
Workflow	`doctor`, `pipeline`, `worker`, `studio-mode`, ...	Orchestration, debugging, and ops — independent of any one production mode.

Rule of thumb: start at a canonical entry, specialize only when the mode is clearly known, and treat aliases as discovery handles rather than first-choice workflows.
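The rule of thumb above can be sketched as a small resolver. This is an illustration only: `resolve_entry` and the three lookup tables are hypothetical names filled in from the skills listed in this document, not videoclaw's actual routing code.

```python
from typing import Optional

# Canonical entries and aliases as named in this document.
CANONICAL = {"video-framework", "brand-presenter"}
ALIASES = {
    "davendra-presenter": "brand-presenter",
    "nex-presenter": "brand-presenter",
    "bunty": "brand-presenter",
}
# A few specialists from the table above (not an exhaustive list).
SPECIALISTS = {"video-storyboard", "video-clone-ad", "movie-director", "video-post"}

DEFAULT_ENTRY = "video-framework"


def resolve_entry(requested: Optional[str] = None) -> str:
    """Return the skill a request should start at (hypothetical helper)."""
    if requested in ALIASES:
        # Aliases are discovery handles; the workflow lives in the target.
        return ALIASES[requested]
    if requested in CANONICAL or requested in SPECIALISTS:
        # The mode is clearly known up front, so use it directly.
        return requested
    # Unspecified or unknown: start at the canonical front door.
    return DEFAULT_ENTRY
```

Calling `resolve_entry("bunty")` yields `brand-presenter`, while an unspecified request falls through to `video-framework`.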

Skill index

Group	Skill	Status	One-liner
🎯 Canonical	`video-framework`	imported	OMX-native front door that routes across copy/create/narrated/presentation/long-form/film/UGC.
🎯 Canonical	`brand-presenter`	native generic	Generic narrated presenter-video workflow over a branded host profile.
🎬 Video	`video-storyboard`	native clean-room	Brief or clone plan → scene-by-scene storyboard artifact.
🎬 Video	`video-analyze-template`	native clean-room	Reference video → reusable template packet.
🎬 Video	`video-clone-ad`	native clean-room	Saved template → new product/brand using `clone-execute`.
🎬 Video	`video-thumbnail-lab`	native clean-room	Final render → thumbnail + platform variants.
🎬 Video	`movie-director`	imported	Short-film production across 12 genres with interview/auto/hybrid entry modes.
🎬 Video	`video-replicator`	imported (deep-surface)	7-mode professional pipeline: COPY/CREATE/NARRATED/PRESENTATION/LONG-FORM/FILM/UGC.
🎬 Video	`video-post`	imported	Post-render verify, social variants, thumbnails, archival.
🎬 Video	`higgsfield-generate`	external bridge	Higgsfield CLI bridge for Marketing Studio, product photoshoots, Soul IDs, and virality scoring.
🎭 Cast	`character-creator`	imported	Create Go Bananas characters with profile + multi-view reference sheets.
🎭 Cast	`character-library`	imported	Audit, list, patch, and delete entries in the shared Go Bananas library.
🎞️ Prompts	`seedance-prompts`	imported	Browse and apply the clean-room Seedance prompt reference library.
🎞️ Prompts	`multi-shot-prompt`	native clean-room	Reference image or storyboard scene → timed multi-shot cinematic prompt sequence, validated against provider-aware videoclaw presets.
📺 Audio	`youtube-audio`	imported	Download audio (MP3) or video (MP4) from YouTube using `yt-dlp` + FFmpeg.
📣 UGC	`ugc`	imported	Belief-driven UGC campaign generator (E5 method) with multi-video output.
🎤 Aliases	`davendra-presenter` · `nex-presenter` · `bunty`	aliases	All delegate into `brand-presenter` with a personal/brand profile.
⚙️ Workflow	`worker` · `pipeline` · `studio-mode`	imported	Multi-agent orchestration and stage sequencing.
⚙️ Workflow	`doctor` · `build-fix` · `deepsearch` · `deep-interview`	imported	Diagnostics, exploration, and structured deep-dive.
⚙️ Workflow	`review`	imported	Review and cleanup-artifact governance.
⚙️ Workflow	`ai-slop-cleaner` · `configure-notifications` · `skill` · `note` · `help` · `web-clone`	imported	Operational utilities.

Generic orchestration skills that duplicated the operator's global plugin set (autopilot, ralph/ralph-init/ralplan, team, cancel, trace, hud, git-master, code-review, security-review) were culled from the repo, along with omx-setup, which belonged to removed tooling. Use the global versions instead.

🎯 Canonical entries

video-framework

Role: OMX-native front door for any "make a video" request.

What it does: Routes the request across the seven established workflows (COPY, CREATE, NARRATED, PRESENTATION, LONG-FORM, FILM, UGC) by classifying the intent and reusing proven legacy engines behind clean adapter boundaries. Picks the right specialist instead of forcing the user to.

Key features:

Single intake surface for both clone-style and from-scratch video requests
Adapter pattern preserves legacy engine quality without inheriting legacy mess
Hands off to a specialist (storyboard, clone-ad, movie-director, replicator, Higgsfield bridge, ugc, ...) once the mode is decided
Useful as the default first-touch when the user's intent is ambiguous

When to reach for it: Any open-ended video request — "I want to make a video", "can you do a video for X?" — where the production mode hasn't been picked yet.

Full guide: skills/video-framework/SKILL.md

brand-presenter

Role: Canonical (generic) presenter-video workflow.

What it does: Turns a slide deck or structured topic into an intro/slides/outro narrated presentation using a branded host profile (avatar + voice + intro/outro framing). Personal/brand presenter skills (davendra-presenter, nex-presenter, bunty) all delegate here with a different host profile.

Key features:

One generic workflow with swappable brand profiles (no copy-paste forks)
Slide-deck-aware framing (cover slide → body → call-to-action)
Lip-synced intro/outro plus TTS narration over body slides
Works for product explainers, internal updates, social-first brand cuts

When to reach for it: Anything narrated and host-led — explainers, demos, brand intros, presentation videos. Pick a personal alias instead if a specific host identity is required.

Full guide: skills/brand-presenter/SKILL.md

🎬 Video specialists

video-storyboard

Role: Brief or clone plan → explicit scene-by-scene storyboard artifact.

What it does: Generates a storyboard.json artifact with optional character-to-scene bindings. Scenes can come from raw --scene strings or from a registered storyboard template; characters can be bound per-scene with --scene-character <sceneIndex:name>.

Key features:

Mode-aware (storyboard vs director) so the right pipeline manifest applies
Storyboard-template aware — supports parameterised templates (environment, character A/B)
Per-scene character binding flows into character-consistency enforcement
Output is canonical JSON and validates against schemas/video/
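To illustrate the `--scene-character <sceneIndex:name>` binding format, here is a minimal parsing sketch. The `parse_binding` and `build_storyboard` helpers and the dictionary layout are assumptions made for illustration; the real storyboard.json shape is defined by the schemas under schemas/video/, and this document does not say whether scene indices are zero- or one-based (the sketch assumes zero-based).

```python
from typing import Dict, List, Tuple


def parse_binding(value: str) -> Tuple[int, str]:
    """Split a '<sceneIndex:name>' string into (index, name)."""
    index_text, sep, name = value.partition(":")
    index_text, name = index_text.strip(), name.strip()
    if not sep or not index_text.isdecimal() or not name:
        raise ValueError(f"expected <sceneIndex:name>, got {value!r}")
    return int(index_text), name


def build_storyboard(scenes: List[str], bindings: List[str]) -> Dict:
    """Attach character names to scenes by index (illustrative layout only)."""
    board = {
        "scenes": [
            {"index": i, "description": text, "characters": []}
            for i, text in enumerate(scenes)
        ]
    }
    for raw in bindings:
        index, name = parse_binding(raw)
        if index >= len(scenes):
            raise ValueError(f"binding {raw!r} points past the last scene")
        board["scenes"][index]["characters"].append(name)
    return board
```

Rejecting out-of-range indices early keeps a mistyped binding from silently dropping a character before consistency enforcement runs.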

When to reach for it: "storyboard this brief", "turn this plan into scenes", "assign characters to scenes".

Full guide: skills/video-storyboard/SKILL.md

video-analyze-template

Role: Reference video → reusable template packet.

What it does: Analyzes a source video (path or URL) and writes a normalized analyze-output.json that can be saved as a reusable template via template-save. With --auto, drives the analysis through the Gemini key pool to fill pacing, beats, keep/change guidance, and reusable variables automatically.

Key features:

Manual mode (operator-driven beats/keeps/changes) and auto mode (Gemini-backed)
Round-robin Gemini key rotation with per-key cooldown for resilient analysis
Endpoint override via VCLAW_GEMINI_API_ENDPOINT for local Gemini-compatible targets
Output composes directly into template-save → clone-plan → storyboard-from-clone
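The round-robin rotation with per-key cooldown mentioned above can be sketched as follows. This is a generic illustration of the technique, not videoclaw's implementation: the `KeyPool` class, its method names, and the 60-second default cooldown are all assumptions.

```python
import time
from typing import Callable, Dict, List, Optional


class KeyPool:
    """Round-robin key pool with a per-key cooldown (illustrative only)."""

    def __init__(self, keys: List[str], cooldown_s: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        if not keys:
            raise ValueError("at least one key is required")
        self._keys = list(keys)
        self._cooldown_s = cooldown_s
        self._clock = clock  # injectable so the pool can be tested without sleeping
        self._next = 0       # index of the next key to try
        self._blocked_until: Dict[str, float] = {}

    def acquire(self) -> Optional[str]:
        """Return the next key that is not cooling down, or None if all are."""
        now = self._clock()
        for offset in range(len(self._keys)):
            i = (self._next + offset) % len(self._keys)
            key = self._keys[i]
            if self._blocked_until.get(key, float("-inf")) <= now:
                self._next = (i + 1) % len(self._keys)
                return key
        return None

    def report_failure(self, key: str) -> None:
        """Put a key on cooldown, for example after a rate-limit response."""
        self._blocked_until[key] = self._clock() + self._cooldown_s
```

A caller would `acquire()` a key per request and call `report_failure(key)` when the provider rejects it, so healthy keys keep rotating while the failed one rests.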

When to reach for it: "analyze this video style", "break this ad into reusable structure", "turn this reference into a template".

Full guide: skills/video-analyze-template/SKILL.md

video-clone-ad

Role: Saved template → new product/brand via the canonical clone-execute flow.

What it does: Adapts a known template to a new intent while preserving execution structure (scene count, motion, pacing). Drives the clone-plan → storyboard-from-clone → execution-seed → execute chain in one logical workflow.

Key features:

Template-driven so structural quality is reused, not re-derived per project
Mode-aware (storyboard for fast iteration; director for full approval-gated runs)
Execution-profile carrier (aspect-ratio, quality, resolution, audio, outputs) flows into the brief
--dry-run lets you validate the payload shape before any provider submission
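As a rough mental model of the chain and of what a dry run skips, consider the sketch below. Everything in it is hypothetical: the stage bodies are stubs, `run_chain` is not a videoclaw function, and it assumes that only the final `execute` stage submits anything to a provider.

```python
from typing import Callable, Dict, List, Tuple

Stage = Tuple[str, Callable[[Dict], Dict]]


def run_chain(stages: List[Stage], payload: Dict,
              dry_run: bool = False) -> Tuple[Dict, List[str]]:
    """Run stages in order; a dry run stops before the 'execute' stage."""
    ran: List[str] = []
    for name, step in stages:
        if dry_run and name == "execute":
            break  # payload is fully shaped, nothing has been submitted
        payload = step(payload)
        ran.append(name)
    return payload, ran


# Stub stages that only record what each step would contribute.
def clone_plan(p: Dict) -> Dict:
    return {**p, "plan": f"adapt {p['template']} for {p['product']}"}


def storyboard_from_clone(p: Dict) -> Dict:
    return {**p, "scenes": ["scene 1", "scene 2"]}


def execution_seed(p: Dict) -> Dict:
    return {**p, "seed": {"sceneCount": len(p["scenes"])}}


def execute(p: Dict) -> Dict:
    return {**p, "submitted": True}


STAGES: List[Stage] = [
    ("clone-plan", clone_plan),
    ("storyboard-from-clone", storyboard_from_clone),
    ("execution-seed", execution_seed),
    ("execute", execute),
]
```

With `dry_run=True` the returned payload carries the plan, scenes, and seed but no `submitted` flag, which is the shape check the dry run is meant to enable.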

When to reach for it: "clone this ad", "adapt this launch ad to a new product", "reuse this template for a new campaign".

Full guide: skills/video-clone-ad/SKILL.md

video-thumbnail-lab

Role: Final render → click-driving still + platform packaging pass.

What it does: Generates thumbnails and platform-specific variants for a finished render. Drives make-vertical, make-square, make-loop, and thumbnail over a finished --project or stand-alone --file.

Key features:

Project-aware (works against the canonical asset trail) and file-mode (works against any local mp4)
Optional --text <title> for a simple overlay thumbnail
Bundled vertical/square/loop variant helpers for cross-platform delivery
Output naming and locations follow the canonical project layout
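For intuition about what a vertical or square variant involves, the arithmetic of a centered crop is sketched below. This assumes the variants are plain center crops expressed as an FFmpeg `crop=w:h:x:y` filter; the actual make-vertical and make-square helpers may scale, pad, or reframe differently, and `center_crop` and `crop_filter` are hypothetical names.

```python
from typing import Tuple


def center_crop(src_w: int, src_h: int,
                ratio_w: int, ratio_h: int) -> Tuple[int, int, int, int]:
    """Largest centered crop with the given aspect ratio, as (w, h, x, y)."""
    if src_w * ratio_h > src_h * ratio_w:
        # Source is wider than the target: keep full height, trim the sides.
        h = src_h
        w = src_h * ratio_w // ratio_h
    else:
        # Source is taller or equal: keep full width, trim top and bottom.
        w = src_w
        h = src_w * ratio_h // ratio_w
    # Round down to even numbers, which common encoders expect.
    w -= w % 2
    h -= h % 2
    return w, h, (src_w - w) // 2, (src_h - h) // 2


def crop_filter(src_w: int, src_h: int, ratio_w: int, ratio_h: int) -> str:
    """Format the crop as an FFmpeg filter string."""
    w, h, x, y = center_crop(src_w, src_h, ratio_w, ratio_h)
    return f"crop={w}:{h}:{x}:{y}"
```

For a 1920x1080 source, `crop_filter(1920, 1080, 9, 16)` gives `crop=606:1080:657:0` and `crop_filter(1920, 1080, 1, 1)` gives `crop=1080:1080:420:0`.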

When to reach for it: "generate a thumbnail for this render", "make square and vertical promo cuts", "package this final video for YouTube/Shorts/social".

Full guide: skills/video-thumbnail-lab/SKILL.md

movie-director

Role: Short-film production across 12 genres with structured entry modes.

What it does: End-to-end movie production via VideoClaw Director mode. Supports interview-driven, auto-mode, or CLI-hybrid entry. Covers action-thriller, storybook, documentary, UGC-ad, music-video, romance, horror, sci-fi, fantasy, western, short-film, and custom. Bundles cast building, style/color presets, Seedance-safe prompt engineering with content-filter auto-fix, multi-key Gemini rotation, and the storyboard-review gate.

Key features:

10 style presets × 9 color gradings × 12 genres
Cast building via Go Bananas library lookup or auto-creation from a JSON seed
Content-filter auto-fix for Seedance-safe prompts
Bundled scripts: verification, interview, auto-mode, cost estimation, iteration, narrated re-mux

When to reach for it: Cinematic, narrative, or multi-genre film work where the bundled genre material and entry-mode structure pays off.

Full guide: skills/movie-director/SKILL.md

video-replicator

Role: Deep-reference 7-mode professional video production pipeline.

Status: Deprecated for user-facing use; superseded by video-framework (catalog status deprecated-reference). Reach for it only when the canonical entry routes you here.

What it does: The legacy comprehensive pipeline, covering seven modes: COPY (replicate/clone with subject swap), CREATE (original from scratch), COPY NARRATED (replicate with continuous voiceover), PRESENTATION (slides to animated video), LONG-FORM (10+ minute, 20+ scene batches), FILM (full cinematic with screenplay), and UGC CAMPAIGN. Kept as a deep-surface reference behind video-framework.

Key features:

7 distinct production modes covering most real-world video asks
SEALCAM+ video analysis for COPY workflows
Long-form batch generation across 20+ scenes
Image-to-video and text-to-video both supported through the same surface

When to reach for it: When the canonical entry has routed you here, or when an existing legacy workflow needs the deeper reference. Otherwise prefer video-framework first.

NOT for: single image generation, FFmpeg-only scripts without the pipeline, video-player debugging, or static slide deck creation without video output.

Full guide: skills/video-replicator/SKILL.md

video-post

Role: Post-render verification, variants, thumbnails, and archival for clean-room outputs.

What it does: Closes the loop after render. Verifies final outputs (codec/resolution/duration/audio presence/midpoint frame), creates social variants (vertical / square / loop), extracts thumbnails, and archives finished projects into a tarball with optional cleanup.

Key features:

verify-final probes structural correctness of the render
make-vertical / make-square / make-loop for platform variants
thumbnail with optional --text overlay
archive-project packages a finished project as archives/<slug>-<timestamp>.tar.gz
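The kind of structural check verify-final performs can be pictured with a sketch over ffprobe's JSON output (as produced by `ffprobe -v error -print_format json -show_format -show_streams <file>`). Whether verify-final itself uses ffprobe is not stated here; `check_probe` and its parameters are hypothetical, and the midpoint-frame check is left out.

```python
from typing import Dict, List


def check_probe(probe: Dict, codec: str, width: int, height: int,
                min_duration_s: float, require_audio: bool = True) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    issues: List[str] = []
    streams = probe.get("streams", [])
    video = next((s for s in streams if s.get("codec_type") == "video"), None)
    if video is None:
        issues.append("no video stream")
    else:
        if video.get("codec_name") != codec:
            issues.append(f"codec is {video.get('codec_name')}, expected {codec}")
        size = (video.get("width"), video.get("height"))
        if size != (width, height):
            issues.append(f"resolution is {size}, expected {(width, height)}")
    # ffprobe reports the container duration as a string of seconds.
    duration = float(probe.get("format", {}).get("duration", 0.0))
    if duration < min_duration_s:
        issues.append(f"duration {duration:.2f}s is below {min_duration_s:.2f}s")
    if require_audio and not any(s.get("codec_type") == "audio" for s in streams):
        issues.append("no audio stream")
    return issues
```

Returning a list of issues rather than raising on the first one lets a single pass report every structural problem with the render.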

When to reach for it: Anything that happens after render — verification, packaging, distribution prep, archival.

Full guide: skills/video-post/SKILL.md

character-creator

Role: New Go Bananas character creation with reference sheets.

What it does: Creates Go Bananas characters with profile images and multi-view reference sheets so the same character can be regenerated consistently across scenes. Inputs feed the character-consistency subsystem and the per-project characters/characters.json store.

Key features:

Profile image plus multi-view (front / 3/4 / side / back) reference sheet generation
Output binds into project character profiles for downstream scene-character mapping
Companion to character-library (creator owns creation; library owns audit/patch/delete)
Triggers on natural-language asks like "create a character", "new character with reference sheet"

When to reach for it: "create a character", "design a character", "build a character reference", "set up characters", "new character with reference sheet".

Full guide: skills/character-creator/SKILL.md

character-library

Role: Audit and hygiene for the shared Go Bananas character library.

What it does: Browses the library, flags polluted entries, patches base prompts in place, and deletes bad anchors, all without leaving the repo-local skill surface. Companion to character-creator.

Key features:

library find for exact-name discovery from intent text
library clean with dry-run candidate discovery (by ids, name regex, or bloated prompt size)
In-place prompt patching for a single character without recreating it
Drives the vclaw video library CLI surface
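The dry-run candidate discovery described above (by ids, name regex, or bloated prompt size) amounts to a filter that reports matches without deleting anything. The sketch below is an illustration under assumptions: `clean_candidates` is a hypothetical helper, and the entry fields `id`, `name`, and `basePrompt` are guesses at the library's record shape.

```python
import re
from typing import Dict, List, Optional, Set


def clean_candidates(entries: List[Dict],
                     ids: Optional[Set[str]] = None,
                     name_pattern: Optional[str] = None,
                     max_prompt_chars: Optional[int] = None) -> List[Dict]:
    """Return entries matching any supplied criterion; nothing is deleted."""
    regex = re.compile(name_pattern) if name_pattern else None
    hits: List[Dict] = []
    for entry in entries:
        by_id = ids is not None and entry["id"] in ids
        by_name = regex is not None and regex.search(entry["name"]) is not None
        by_size = (max_prompt_chars is not None
                   and len(entry["basePrompt"]) > max_prompt_chars)
        if by_id or by_name or by_size:
            hits.append(entry)
    return hits
```

Reviewing the returned list before any deletion is the point of the dry run: an overly broad regex shows up as an unexpectedly long candidate list.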

When to reach for it: "list my characters", "audit the character library", "patch this drifting character", "delete polluted characters", "fix library hygiene before a director run".

Full guide: skills/character-library/SKILL.md

seedance-prompts

Role: Reference library and prompt-quality assistant for Seedance-targeted scene writing.

What it does: Browses the clean-room Seedance prompt reference library and applies current provider guidance to Seedance prompt writing. Built on the actual prompt-lib-list / prompt-lib-show surface.

Key features:

Searchable Seedance formulas, examples, and prompt-structure guidance
Backed by prompt-library references that actually exist in this repo (no hallucinated examples)
Triggers on "seedance prompt", "expand prompt for seedance", "prompt quality"
Output composes into storyboard and execute flows

When to reach for it: When you need Seedance-specific prompt help, formulas, or examples.

Full guide: skills/seedance-prompts/SKILL.md

multi-shot-prompt

Role: Reference image → timed multi-shot cinematic prompt sequence.

What it does: Generates structured multi-shot video prompts from a reference image, output as a timed shot sequence validated against the videoclaw cinematic-15s preset. Drives the real CLI in three steps: run vclaw video multi-shot --plan to scaffold the shot structure, author cinematic prose for each shot, then run vclaw video multi-shot --validate to enforce the hard rules. An --auto --image <path> path runs the full sequence without manual authoring steps. Existing projects can use --from-storyboard --project <slug> --scene <sceneIndex> to hydrate action, characters, location defaults, and source metadata from the storyboard artifact.

Key features:

Timed shot sequences anchored to the cinematic-15s preset with hard-rule validation
Image-grounded prompting — reference image drives visual continuity across shots
Manual (scaffold → author → validate) and automated (--auto --image) entry paths
Machine-readable preset discovery, provider-shaped defaults, issue explanations, and parsed shots[] artifacts for downstream agents
Prompt-library reference via vclaw video prompt-lib-show --name multi-shot-framework
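To make "hard-rule validation" of a timed sequence concrete, here is a sketch of the sort of checks a validator might run. The specific rules are assumptions, since this document does not list them: shots start at zero, are contiguous, have positive duration and non-empty prose, and fill exactly 15 seconds (inferred from the preset name). `validate_shots` and the `start`, `end`, and `prompt` fields are hypothetical.

```python
from typing import Dict, List


def validate_shots(shots: List[Dict], total_s: float = 15.0) -> List[str]:
    """Return rule violations for a timed shot list; empty means valid."""
    issues: List[str] = []
    cursor = 0.0  # where the next shot is expected to start
    for n, shot in enumerate(shots, start=1):
        if abs(shot["start"] - cursor) > 1e-6:
            issues.append(f"shot {n}: starts at {shot['start']}s, expected {cursor}s")
        if shot["end"] <= shot["start"]:
            issues.append(f"shot {n}: duration must be positive")
        if not shot.get("prompt", "").strip():
            issues.append(f"shot {n}: prompt text is empty")
        cursor = shot["end"]
    if abs(cursor - total_s) > 1e-6:
        issues.append(f"sequence ends at {cursor}s, expected {total_s}s")
    return issues
```

A sequence of 0-4s, 4-9s, and 9-15s passes, while leaving a one-second gap between two shots produces a single reported issue.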

When to reach for it: "multi-shot prompt", "shot sequence", "cinematic prompt", "video prompt from this image", "shot breakdown".

Full guide: skills/multi-shot-prompt/SKILL.md

higgsfield-generate

Role: External-provider bridge for Higgsfield AI generation.

What it does: Reuses the public MIT-licensed higgsfield-ai/skills command intelligence as a thin videoclaw skill. It routes agents to the official higgsfield CLI for Marketing Studio ad videos, product photoshoots, marketplace cards, Soul Character identity training, generic image/video generation, and finished-video virality scoring.

Key features:

Keeps higgsfield optional rather than making it a videoclaw dependency
Uses Higgsfield's dedicated product-photoshoot and Marketing Studio commands instead of generic prompt guessing
Gives agents a clean hand-off rule: standalone Higgsfield URLs directly, project-tracked outputs through existing videoclaw artifacts
Captures the upstream source commit and MIT reuse boundary in the skill itself

When to reach for it: "use Higgsfield", "make a UGC ad in Marketing Studio", "product photoshoot", "train Soul ID", "score this video's virality".

Full guide: skills/higgsfield-generate/SKILL.md

youtube-audio

Role: YouTube → MP3 audio or MP4 video using yt-dlp + FFmpeg.

What it does: Downloads audio or video from YouTube videos and playlists. Supports trimming, resolution selection, and audio quality settings.

Key features:

Single videos, playlists, or batch URLs
Audio-only, video-only, or both in one pass
Trim to clip ranges
Requires yt-dlp and ffmpeg
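For orientation, the underlying yt-dlp invocations for these features might look like the argument lists below. The flags are standard yt-dlp options, but how the skill itself calls yt-dlp is not documented here, so `audio_args` and `video_args` are hypothetical helpers and the URL in the usage note is a placeholder.

```python
from typing import List, Optional


def audio_args(url: str, section: Optional[str] = None,
               quality: str = "0") -> List[str]:
    """Build a yt-dlp command that extracts MP3 audio."""
    args = ["yt-dlp", "--extract-audio", "--audio-format", "mp3",
            "--audio-quality", quality]  # "0" is best VBR quality
    if section:
        # Time range such as "00:30-01:00"; the leading "*" marks a timestamp range.
        args += ["--download-sections", f"*{section}"]
    return args + [url]


def video_args(url: str, max_height: Optional[int] = None) -> List[str]:
    """Build a yt-dlp command that downloads video merged into MP4."""
    if max_height is None:
        selector = "bv*+ba/b"  # best video plus best audio, else best combined
    else:
        limit = f"[height<={max_height}]"
        selector = f"bv*{limit}+ba/b{limit}"
    return ["yt-dlp", "--format", selector, "--merge-output-format", "mp4", url]
```

Either list can be passed to `subprocess.run(args, check=True)`, for example `audio_args("https://www.youtube.com/watch?v=VIDEO_ID", section="00:30-01:00")`. Both extraction to MP3 and section trimming rely on ffmpeg being installed.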

When to reach for it: "download audio from this YouTube video", "grab the MP4", "extract music from a playlist".

Full guide: skills/youtube-audio/SKILL.md

ugc

Role: Belief-driven UGC campaign generator using the E5 method.

What it does: Generates a multi-video belief-targeted UGC campaign (30–60s each) from a product URL and intent. Produces N videos with subtitles plus a campaign report.

Key features:

Belief-journey decomposition (E5 method: Examine, Educate, Emote, Evidence, Empower)
Multi-video campaign output rather than single-clip
Per-video subtitles and aggregate campaign report
Triggers on "UGC campaign", "belief-driven ads", "E5 method"

When to reach for it: When the goal is a marketing campaign rather than a single creative video.

Full guide: skills/ugc/SKILL.md

🎤 Video aliases

These exist for discoverability and personal/brand handoff — they all delegate into brand-presenter with a different host profile. Treat them as compatibility surfaces; the canonical workflow lives in brand-presenter.

Alias	Profile	Trigger
`davendra-presenter`	Davendra (asset/voice)	"davendra video", "davendra presenter"
`nex-presenter`	Nex (asset/voice)	"nex video", "nex presenter"
`bunty`	Bunty — cricket commentator (orange blazer)	"bunty thing", "match day analysis", "cricket scorecard video"

⚙️ Workflow skills

Workflow skills are independent of any one production mode. They orchestrate, debug, review, or operate on top of the production layer. As with the video skills, the full guide for each entry below is its skills/<name>/SKILL.md file.

Multi-agent orchestration

worker

Team-worker protocol (ACK, mailbox, task lifecycle) for tmux-based teams. Pairs with team, which is no longer shipped in this repo; use the global version. Full guide

pipeline

Configurable pipeline orchestrator for sequencing stages — useful when you need explicit per-stage control rather than a fully autonomous run. Full guide

studio-mode

Agent-driven video production with interview → consensus plan → user approval → credit spend. Slower-but-safer alternative to the fast one-shot vclaw video create. Triggered by "$studio" requests. Full guide

Diagnostics & exploration

doctor

Diagnose and fix oh-my-codex installation issues. Full guide

build-fix

Fix build and TypeScript errors with minimal changes. Conservative — does not refactor. Full guide

deepsearch

Thorough codebase search. Use when grep/glob isn't enough and you need an agent to actually understand the codebase as it searches. Full guide

deep-interview

Socratic deep interview with mathematical ambiguity gating before execution. Forces requirements clarity before any work begins. Full guide

Review & governance

review

Reviewer-only pass for /plan --review and cleanup artifact review. Full guide

Operational utilities

ai-slop-cleaner

Run an anti-slop cleanup/refactor/deslop workflow. Removes the kinds of half-baked artifacts that unsupervised agents leave behind. Full guide

configure-notifications

Configure OMX notifications — unified entry point for all notification platforms. Full guide

skill

Manage local skills — list, add, remove, search, edit, setup wizard. Full guide

note

Save notes to notepad.md for compaction resilience across long agent runs. Full guide

help

Guide on using VideoClaw — the in-product help surface. Full guide

web-clone

URL-driven website cloning with visual + functional verification. Full guide

Adding or modifying a skill

The skill set is curated, not auto-generated. To add or modify a skill:

Add or edit skills/<name>/SKILL.md — the canonical guide for the skill
Update skills/catalog.json with the skill's id, category, status, and any specializes / aliasOf / specializations relationships
Add a section here in docs/SKILLS.md and link it from the index table
Run npm run check:skill-frontdoor to verify the repo-local skill front door stays consistent
Run npm run check:cleanroom-docs to verify clean-room-facing docs and skills don't reference stale paths
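Before running the two npm checks, a quick local sanity pass over catalog entries can catch dangling relationships. The sketch below is not what `check:skill-frontdoor` does; `check_catalog` is a hypothetical helper, it assumes the catalog can be read as a list of objects using the field names listed above, and it assumes `aliasOf` and `specializes` each hold a single skill id.

```python
from typing import Dict, List


def check_catalog(entries: List[Dict]) -> List[str]:
    """Return consistency problems found in catalog entries."""
    issues: List[str] = []
    ids = [e.get("id") for e in entries]
    known = set(ids)
    for entry in entries:
        sid = entry.get("id") or "<missing id>"
        for field in ("id", "category", "status"):
            if not entry.get(field):
                issues.append(f"{sid}: missing {field}")
        for relation in ("aliasOf", "specializes"):
            target = entry.get(relation)
            if target is not None and target not in known:
                issues.append(f"{sid}: {relation} points at unknown skill {target}")
    for dup in sorted({i for i in ids if i is not None and ids.count(i) > 1}):
        issues.append(f"{dup}: duplicate id")
    return issues
```

Loading the real file would be a matter of `json.load` plus whatever unwrapping its top-level structure requires, which this document does not specify.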
