Skills
videoclaw ships a curated library of skills — reusable, agent-invokable workflows that either produce a video (the video category) or orchestrate the work around it (the workflow category). This doc is the comprehensive per-skill reference. For the machine-readable index see skills/catalog.json; for the full how-to of any individual skill, follow the linked SKILL.md for that skill.
Ecosystem map

How skills relate
The library is not a flat bag of equally-preferred entry points. It uses a small hierarchy:
| Role | Examples | When you reach for it |
|---|---|---|
| Canonical entry | video-framework, brand-presenter | Generic or unspecified video request — the entry skill routes into a specialist. |
| Specialist | video-storyboard, video-clone-ad, movie-director, video-post, ... | The mode is clearly known up front (e.g. "clone this ad", "storyboard these 6 scenes"). |
| Compatibility alias | davendra-presenter, nex-presenter, bunty | Personal/brand presets that exist for discoverability — they all delegate into brand-presenter. |
| Workflow | doctor, pipeline, worker, studio-mode, ... | Orchestration, debugging, and ops — independent of any one production mode. |
Rule of thumb: start at a canonical entry, specialize only when the mode is clearly known, and treat aliases as discovery handles rather than first-choice workflows.
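The rule of thumb above can be sketched as a small lookup. This is only an illustration: `REGISTRY`, `RegistryEntry`, and `startingSkill` are hypothetical names invented here and do not reflect the actual schema of skills/catalog.json. Only the skill names and the alias-to-brand-presenter delegation come from the table.

```typescript
// Hypothetical in-memory registry; not the real catalog format.
type SkillRole = "canonical" | "specialist" | "alias" | "workflow";

interface RegistryEntry {
  role: SkillRole;
  aliasOf?: string; // present only on compatibility aliases
}

const REGISTRY = new Map<string, RegistryEntry>([
  ["video-framework", { role: "canonical" }],
  ["brand-presenter", { role: "canonical" }],
  ["video-storyboard", { role: "specialist" }],
  ["video-clone-ad", { role: "specialist" }],
  ["davendra-presenter", { role: "alias", aliasOf: "brand-presenter" }],
  ["nex-presenter", { role: "alias", aliasOf: "brand-presenter" }],
  ["bunty", { role: "alias", aliasOf: "brand-presenter" }],
  ["doctor", { role: "workflow" }],
]);

// Start at the canonical front door unless a known skill was named up front.
// Aliases are treated as discovery handles and resolved to their target.
function startingSkill(requested?: string): string {
  if (requested === undefined) return "video-framework";
  const entry = REGISTRY.get(requested);
  if (entry === undefined) return "video-framework";
  return entry.aliasOf ?? requested;
}
```

With this sketch, an unspecified request lands on video-framework, a named specialist is used directly, and `startingSkill("bunty")` resolves to brand-presenter.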
Skill index
| Group | Skill | Status | One-liner |
|---|---|---|---|
| 🎯 Canonical | video-framework | imported | OMX-native front door that routes across copy/create/narrated/presentation/long-form/film/UGC. |
| 🎯 Canonical | brand-presenter | native generic | Generic narrated presenter-video workflow over a branded host profile. |
| 🎬 Video | video-storyboard | native clean-room | Brief or clone plan → scene-by-scene storyboard artifact. |
| 🎬 Video | video-analyze-template | native clean-room | Reference video → reusable template packet. |
| 🎬 Video | video-clone-ad | native clean-room | Saved template → new product/brand using clone-execute. |
| 🎬 Video | video-thumbnail-lab | native clean-room | Final render → thumbnail + platform variants. |
| 🎬 Video | movie-director | imported | Short-film production across 12 genres with interview/auto/hybrid entry modes. |
| 🎬 Video | video-replicator | imported (deep-surface) | 7-mode professional pipeline: COPY/CREATE/NARRATED/PRESENTATION/LONG-FORM/FILM/UGC. |
| 🎬 Video | video-post | imported | Post-render verify, social variants, thumbnails, archival. |
| 🎬 Video | higgsfield-generate | external bridge | Higgsfield CLI bridge for Marketing Studio, product photoshoots, Soul IDs, and virality scoring. |
| 🎭 Cast | character-creator | imported | Create Go Bananas characters with profile + multi-view reference sheets. |
| 🎭 Cast | character-library | imported | Audit, list, patch, and delete entries in the shared Go Bananas library. |
| 🎞️ Prompts | seedance-prompts | imported | Browse and apply the clean-room Seedance prompt reference library. |
| 🎞️ Prompts | multi-shot-prompt | native clean-room | Reference image or storyboard scene → timed multi-shot cinematic prompt sequence, validated against provider-aware videoclaw presets. |
| 📺 Audio | youtube-audio | imported | Download audio (MP3) or video (MP4) from YouTube using yt-dlp + FFmpeg. |
| 📣 UGC | ugc | imported | Belief-driven UGC campaign generator (E5 method) with multi-video output. |
| 🎤 Aliases | davendra-presenter · nex-presenter · bunty | aliases | All delegate into brand-presenter with a personal/brand profile. |
| ⚙️ Workflow | worker · pipeline · studio-mode | imported | Multi-agent orchestration and stage sequencing. |
| ⚙️ Workflow | doctor · build-fix · deepsearch · deep-interview | imported | Diagnostics, exploration, and structured deep-dive. |
| ⚙️ Workflow | review | imported | Review and cleanup-artifact governance. |
| ⚙️ Workflow | ai-slop-cleaner · configure-notifications · skill · note · help · web-clone | imported | Operational utilities. |
Generic orchestration skills that duplicated the operator's global plugin set (`autopilot`, `ralph` / `ralph-init` / `ralplan`, `team`, `cancel`, `trace`, `hud`, `git-master`, `code-review`, `security-review`) and the removed-tooling `omx-setup` were culled from the repo; use the global versions.
🎯 Canonical entries
video-framework
Role: OMX-native front door for any "make a video" request.
What it does: Routes the request across the seven established workflows (COPY, CREATE, NARRATED, PRESENTATION, LONG-FORM, FILM, UGC) by classifying the intent and reusing proven legacy engines behind clean adapter boundaries. Picks the right specialist instead of forcing the user to.
Key features:
- Single intake surface for both clone-style and from-scratch video requests
- Adapter pattern preserves legacy engine quality without inheriting legacy mess
- Hands off to a specialist (storyboard, clone-ad, movie-director, replicator, Higgsfield bridge, ugc, ...) once the mode is decided
- Useful as the default first-touch when the user's intent is ambiguous
When to reach for it: Any open-ended video request — "I want to make a video", "can you do a video for X?" — where the production mode hasn't been picked yet.
Full guide: skills/video-framework/SKILL.md
brand-presenter
Role: Canonical (generic) presenter-video workflow.
What it does: Turns a slide deck or structured topic into an intro/slides/outro narrated presentation using a branded host profile (avatar + voice + intro/outro framing). Personal/brand presenter skills (davendra-presenter, nex-presenter, bunty) all delegate here with a different host profile.
Key features:
- One generic workflow with swappable brand profiles (no copy-paste forks)
- Slide-deck-aware framing (cover slide → body → call-to-action)
- Lip-synced intro/outro plus TTS narration over body slides
- Works for product explainers, internal updates, social-first brand cuts
When to reach for it: Anything narrated and host-led — explainers, demos, brand intros, presentation videos. Pick a personal alias instead if a specific host identity is required.
Full guide: skills/brand-presenter/SKILL.md
🎬 Video specialists
video-storyboard
Role: Brief or clone plan → explicit scene-by-scene storyboard artifact.
What it does: Generates a `storyboard.json` artifact with optional character-to-scene bindings. Scenes can come from raw `--scene` strings or from a registered storyboard template; characters can be bound per scene with `--scene-character <sceneIndex:name>`.
Key features:
- Mode-aware (`storyboard` vs `director`) so the right pipeline manifest applies
- Storyboard-template aware: supports parameterised templates (environment, character A/B)
- Per-scene character binding flows into character-consistency enforcement
- Output is canonical JSON and validates against `schemas/video/`
When to reach for it: "storyboard this brief", "turn this plan into scenes", "assign characters to scenes".
Full guide: skills/video-storyboard/SKILL.md
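As an illustration of the `<sceneIndex:name>` binding format, the sketch below parses such values and groups character names by scene. The function and type names are hypothetical, and the real CLI may validate differently (for example, this sketch does not assume whether scene indices are 0-based or 1-based).

```typescript
// Hypothetical parser for values passed to --scene-character.
interface SceneCharacterBinding {
  sceneIndex: number;
  name: string;
}

function parseSceneCharacter(value: string): SceneCharacterBinding {
  const sep = value.indexOf(":");
  const indexText = sep === -1 ? "" : value.slice(0, sep).trim();
  const name = sep === -1 ? "" : value.slice(sep + 1).trim();
  // Require a non-negative integer index and a non-empty name.
  if (!/^\d+$/.test(indexText) || name.length === 0) {
    throw new Error(`expected <sceneIndex:name>, got "${value}"`);
  }
  return { sceneIndex: Number(indexText), name };
}

// Group repeated bindings so each scene maps to its list of characters.
function bindingsByScene(values: string[]): Map<number, string[]> {
  const byScene = new Map<number, string[]>();
  for (const value of values) {
    const { sceneIndex, name } = parseSceneCharacter(value);
    const names = byScene.get(sceneIndex) ?? [];
    names.push(name);
    byScene.set(sceneIndex, names);
  }
  return byScene;
}
```

Rejecting malformed values early keeps a bad binding from silently reaching character-consistency enforcement.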
video-analyze-template
Role: Reference video → reusable template packet.
What it does: Analyzes a source video (path or URL) and writes a normalized `analyze-output.json` that can be saved as a reusable template via `template-save`. With `--auto`, drives the analysis through the Gemini key pool to fill pacing, beats, keep/change guidance, and reusable variables automatically.
Key features:
- Manual mode (operator-driven beats/keeps/changes) and auto mode (Gemini-backed)
- Round-robin Gemini key rotation with per-key cooldown for resilient analysis
- Endpoint override via `VCLAW_GEMINI_API_ENDPOINT` for local Gemini-compatible targets
- Output composes directly into `template-save` → `clone-plan` → `storyboard-from-clone`
When to reach for it: "analyze this video style", "break this ad into reusable structure", "turn this reference into a template".
Full guide: skills/video-analyze-template/SKILL.md
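Round-robin rotation with a per-key cooldown can be sketched as follows. This is a generic illustration of the technique, not videoclaw's implementation: the `KeyPool` class, its method names, and the explicit `nowMs` clock parameter are all assumptions made for the example.

```typescript
// Hypothetical key pool: rotates through keys and skips any that are cooling down.
class KeyPool {
  private readonly keys: string[];
  private readonly cooldownMs: number;
  private readonly coolUntil = new Map<string, number>();
  private cursor = 0;

  constructor(keys: string[], cooldownMs: number) {
    if (keys.length === 0) throw new Error("key pool is empty");
    this.keys = [...keys];
    this.cooldownMs = cooldownMs;
  }

  // Return the next usable key in round-robin order, or undefined if all are cooling.
  next(nowMs: number): string | undefined {
    for (let step = 0; step < this.keys.length; step++) {
      const index = (this.cursor + step) % this.keys.length;
      const key = this.keys[index];
      if (key !== undefined && (this.coolUntil.get(key) ?? 0) <= nowMs) {
        this.cursor = (index + 1) % this.keys.length;
        return key;
      }
    }
    return undefined;
  }

  // Put a key on cooldown, for example after a rate-limit response.
  cool(key: string, nowMs: number): void {
    this.coolUntil.set(key, nowMs + this.cooldownMs);
  }
}
```

Passing the clock in explicitly (rather than calling `Date.now()` inside) keeps the rotation deterministic and easy to test.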
video-clone-ad
Role: Saved template → new product/brand via the canonical clone-execute flow.
What it does: Adapts a known template to a new intent while preserving execution structure (scene count, motion, pacing). Drives the `clone-plan` → `storyboard-from-clone` → `execution-seed` → `execute` chain in one logical workflow.
Key features:
- Template-driven so structural quality is reused, not re-derived per project
- Mode-aware (`storyboard` for fast iteration; `director` for full approval-gated runs)
- Execution-profile carrier (aspect-ratio, quality, resolution, audio, outputs) flows into the brief
- `--dry-run` lets you validate the payload shape before any provider submission
When to reach for it: "clone this ad", "adapt this launch ad to a new product", "reuse this template for a new campaign".
Full guide: skills/video-clone-ad/SKILL.md
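The chain and the execution profile can be pictured with the sketch below. It is heavily assumption-laden: the `ExecutionProfile` field types, the validation rules, and the idea that a dry run simply stops before the final `execute` stage are illustrative guesses, not documented behaviour. Only the stage names and the profile field names come from the description above.

```typescript
// Hypothetical shape for the execution profile; field types are assumptions.
interface ExecutionProfile {
  aspectRatio: string;
  quality: string;
  resolution: string;
  audio: boolean;
  outputs: string[];
}

// Stage names as listed in the skill description.
const CLONE_CHAIN: readonly string[] = [
  "clone-plan",
  "storyboard-from-clone",
  "execution-seed",
  "execute",
];

// Assumption: a dry run performs every stage except the provider-facing one.
function stagesFor(dryRun: boolean): string[] {
  return dryRun
    ? CLONE_CHAIN.filter((stage) => stage !== "execute")
    : [...CLONE_CHAIN];
}

// Minimal payload-shape check; the rules here are illustrative only.
function profileProblems(profile: ExecutionProfile): string[] {
  const problems: string[] = [];
  if (!/^\d+:\d+$/.test(profile.aspectRatio)) problems.push("aspectRatio must look like 16:9");
  if (profile.quality.trim() === "") problems.push("quality is empty");
  if (profile.resolution.trim() === "") problems.push("resolution is empty");
  if (profile.outputs.length === 0) problems.push("outputs is empty");
  return problems;
}
```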
video-thumbnail-lab
Role: Final render → click-driving still + platform packaging pass.
What it does: Generates thumbnails and platform-specific variants for a finished render. Drives `make-vertical`, `make-square`, `make-loop`, and `thumbnail` over a finished `--project` or stand-alone `--file`.
Key features:
- Project-aware (works against the canonical asset trail) and file-mode (works against any local mp4)
- Optional `--text <title>` for a simple overlay thumbnail
- Bundled vertical/square/loop variant helpers for cross-platform delivery
- Output naming and locations follow the canonical project layout
When to reach for it: "generate a thumbnail for this render", "make square and vertical promo cuts", "package this final video for YouTube/Shorts/social".
Full guide: skills/video-thumbnail-lab/SKILL.md
movie-director
Role: Short-film production across 12 genres with structured entry modes.
What it does: End-to-end movie production via VideoClaw Director mode. Supports interview-driven, auto-mode, or CLI-hybrid entry. Covers action-thriller, storybook, documentary, UGC-ad, music-video, romance, horror, sci-fi, fantasy, western, short-film, and custom. Bundles cast building, style/color presets, Seedance-safe prompt engineering with content-filter auto-fix, multi-key Gemini rotation, and the storyboard-review gate.
Key features:
- 10 style presets × 9 color gradings × 12 genres
- Cast building via Go Bananas library lookup or auto-creation from a JSON seed
- Content-filter auto-fix for Seedance-safe prompts
- Bundled scripts: verification, interview, auto-mode, cost estimation, iteration, narrated re-mux
When to reach for it: Cinematic, narrative, or multi-genre film work where the bundled genre material and entry-mode structure pays off.
Full guide: skills/movie-director/SKILL.md
video-replicator
Role: Deep-reference 7-mode professional video production pipeline.
Status: Deprecated for user-facing use; superseded by video-framework (catalog status `deprecated-reference`). Reach for it only when the canonical entry routes you here.
What it does: The legacy comprehensive pipeline: COPY (replicate/clone with subject swap), CREATE (original from scratch), COPY NARRATED (replicate with continuous voiceover), PRESENTATION (slides to animated video), LONG-FORM (10+ minute, 20+ scene batches), FILM (full cinematic with screenplay), and UGC CAMPAIGN. Kept as a deep-surface reference behind video-framework.
Key features:
- 7 distinct production modes covering most real-world video asks
- SEALCAM+ video analysis for COPY workflows
- Long-form batch generation across 20+ scenes
- Image-to-video and text-to-video both supported through the same surface
When to reach for it: When the canonical entry has routed you here, or when an existing legacy workflow needs the deeper reference. Otherwise prefer video-framework first.
NOT for: single image generation, FFmpeg-only scripts without the pipeline, video-player debugging, or static slide deck creation without video output.
Full guide: skills/video-replicator/SKILL.md
video-post
Role: Post-render verification, variants, thumbnails, and archival for clean-room outputs.
What it does: Closes the loop after render. Verifies final outputs (codec/resolution/duration/audio presence/midpoint frame), creates social variants (vertical / square / loop), extracts thumbnails, and archives finished projects into a tarball with optional cleanup.
Key features:
- `verify-final` probes structural correctness of the render
- `make-vertical` / `make-square` / `make-loop` for platform variants
- `thumbnail` with optional `--text` overlay
- `archive-project` packages a finished project as `archives/<slug>-<timestamp>.tar.gz`
When to reach for it: Anything that happens after render — verification, packaging, distribution prep, archival.
Full guide: skills/video-post/SKILL.md
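The kind of structural check `verify-final` performs can be illustrated with a small comparison over already-probed metadata. The `RenderProbe` and `RenderExpectation` shapes, the tolerance parameter, and the function names are hypothetical; the actual command's inputs and thresholds are not documented here.

```typescript
// Hypothetical probe result, e.g. values read from a media probe of the final render.
interface RenderProbe {
  videoCodec: string;
  width: number;
  height: number;
  durationSec: number;
  hasAudio: boolean;
}

// Hypothetical expectations for a given project.
interface RenderExpectation {
  videoCodec: string;
  width: number;
  height: number;
  durationSec: number;
  toleranceSec: number;
  requireAudio: boolean;
}

// Return a list of human-readable failures; an empty list means the render passes.
function verifyRender(probe: RenderProbe, expected: RenderExpectation): string[] {
  const failures: string[] = [];
  if (probe.videoCodec !== expected.videoCodec) {
    failures.push(`codec ${probe.videoCodec}, expected ${expected.videoCodec}`);
  }
  if (probe.width !== expected.width || probe.height !== expected.height) {
    failures.push(`resolution ${probe.width}x${probe.height}, expected ${expected.width}x${expected.height}`);
  }
  if (Math.abs(probe.durationSec - expected.durationSec) > expected.toleranceSec) {
    failures.push(`duration ${probe.durationSec}s, expected about ${expected.durationSec}s`);
  }
  if (expected.requireAudio && !probe.hasAudio) {
    failures.push("audio track missing");
  }
  return failures;
}

// Timestamp at which a midpoint frame would be sampled.
function midpointSec(durationSec: number): number {
  return durationSec / 2;
}
```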
character-creator
Role: New Go Bananas character creation with reference sheets.
What it does: Creates Go Bananas characters with profile images and multi-view reference sheets so the same character can be regenerated consistently across scenes. Inputs feed the character-consistency subsystem and the per-project `characters/characters.json` store.
Key features:
- Profile image plus multi-view (front / 3/4 / side / back) reference sheet generation
- Output binds into project character profiles for downstream scene-character mapping
- Companion to `character-library` (creator owns creation; library owns audit/patch/delete)
- Triggers on natural-language asks like "create a character", "new character with reference sheet"
When to reach for it: "create a character", "design a character", "build a character reference", "set up characters", "new character with reference sheet".
Full guide: skills/character-creator/SKILL.md
character-library
Role: Audit and hygiene for the shared Go Bananas character library.
What it does: Browses the library, flags polluted entries, patches base prompts in place, and deletes bad anchors, all without leaving the repo-local skill surface. Companion to character-creator.
Key features:
- `library find` for exact-name discovery from intent text
- `library clean` with dry-run candidate discovery (by ids, name regex, or bloated prompt size)
- In-place prompt patching for a single character without recreating it
- Drives the `vclaw video library` CLI surface
When to reach for it: "list my characters", "audit the character library", "patch this drifting character", "delete polluted characters", "fix library hygiene before a director run".
Full guide: skills/character-library/SKILL.md
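Dry-run candidate discovery by ids, name regex, or prompt size can be sketched as a pure filter. The `LibraryCharacter` and `CleanCriteria` shapes and the "match any criterion" semantics are assumptions for illustration; the real `library clean` command may combine criteria differently.

```typescript
// Hypothetical library entry; the real record has more fields.
interface LibraryCharacter {
  id: string;
  name: string;
  basePrompt: string;
}

// Hypothetical selection criteria mirroring the three discovery modes.
interface CleanCriteria {
  ids?: string[];
  namePattern?: RegExp; // avoid the "g" flag: RegExp.test is stateful with it
  maxPromptChars?: number;
}

// Dry run: report which entries WOULD be cleaned, without deleting anything.
function cleanCandidates(
  library: LibraryCharacter[],
  criteria: CleanCriteria,
): LibraryCharacter[] {
  return library.filter((character) => {
    const byId = criteria.ids?.includes(character.id) ?? false;
    const byName = criteria.namePattern?.test(character.name) ?? false;
    const byPromptSize =
      criteria.maxPromptChars !== undefined &&
      character.basePrompt.length > criteria.maxPromptChars;
    return byId || byName || byPromptSize;
  });
}
```

Because the function only returns candidates, an operator can review the list before any destructive step runs.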
seedance-prompts
Role: Reference library and prompt-quality assistant for Seedance-targeted scene writing.
What it does: Browses the clean-room Seedance prompt reference library and applies current provider guidance to Seedance prompt writing. Built on the actual `prompt-lib-list` / `prompt-lib-show` surface.
Key features:
- Searchable Seedance formulas, examples, and prompt-structure guidance
- Backed by prompt-library references that actually exist in this repo (no hallucinated examples)
- Triggers on "seedance prompt", "expand prompt for seedance", "prompt quality"
- Output composes into `storyboard` and `execute` flows
When to reach for it: When you need Seedance-specific prompt help, formulas, or examples.
Full guide: skills/seedance-prompts/SKILL.md
multi-shot-prompt
Role: Reference image → timed multi-shot cinematic prompt sequence.
What it does: Generates structured multi-shot video prompts from a reference image, output as a timed shot sequence validated against the videoclaw `cinematic-15s` preset. Drives the real CLI in three steps: `vclaw video multi-shot --plan` scaffolds the shot structure, you author cinematic prose per shot, then `vclaw video multi-shot --validate` enforces the hard rules. An `--auto --image <path>` path runs the full sequence without manual authoring steps. Existing projects can use `--from-storyboard --project <slug> --scene <sceneIndex>` to hydrate action, characters, location defaults, and source metadata from the storyboard artifact.
Key features:
- Timed shot sequences anchored to the `cinematic-15s` preset with hard-rule validation
- Image-grounded prompting: the reference image drives visual continuity across shots
- Manual (scaffold → author → validate) and automated (`--auto --image`) entry paths
- Machine-readable preset discovery, provider-shaped defaults, issue explanations, and parsed `shots[]` artifacts for downstream agents
- Prompt-library reference via `vclaw video prompt-lib-show --name multi-shot-framework`
When to reach for it: "multi-shot prompt", "shot sequence", "cinematic prompt", "video prompt from this image", "shot breakdown".
Full guide: skills/multi-shot-prompt/SKILL.md
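To give a feel for what validating a timed shot sequence involves, the sketch below checks a few plausible rules against a 15-second total. These rules (start at 0, no gaps or overlaps, non-empty prompts, exact total length) are assumptions chosen for illustration; the authoritative hard rules are whatever `vclaw video multi-shot --validate` enforces. The `Shot` shape and `shotIssues` name are likewise hypothetical.

```typescript
// Hypothetical shot record; the real shots[] artifact may carry more fields.
interface Shot {
  startSec: number;
  endSec: number;
  prompt: string;
}

// Return a list of issues; an empty list means the sequence passes these sketch rules.
function shotIssues(shots: Shot[], totalSec = 15): string[] {
  if (shots.length === 0) return ["no shots"];
  const issues: string[] = [];
  let cursor = 0; // where the next shot is expected to start
  shots.forEach((shot, index) => {
    const label = `shot ${index + 1}`;
    if (shot.startSec !== cursor) {
      issues.push(`${label} starts at ${shot.startSec}s, expected ${cursor}s`);
    }
    if (shot.endSec <= shot.startSec) {
      issues.push(`${label} has no duration`);
    }
    if (shot.prompt.trim() === "") {
      issues.push(`${label} has an empty prompt`);
    }
    cursor = shot.endSec;
  });
  if (cursor !== totalSec) {
    issues.push(`sequence ends at ${cursor}s, expected ${totalSec}s`);
  }
  return issues;
}
```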
higgsfield-generate
Role: External-provider bridge for Higgsfield AI generation.
What it does: Reuses the public MIT-licensed higgsfield-ai/skills command intelligence as a thin videoclaw skill. It routes agents to the official `higgsfield` CLI for Marketing Studio ad videos, product photoshoots, marketplace cards, Soul Character identity training, generic image/video generation, and finished-video virality scoring.
Key features:
- Keeps `higgsfield` optional rather than making it a videoclaw dependency
- Uses Higgsfield's dedicated product-photoshoot and Marketing Studio commands instead of generic prompt guessing
- Gives agents a clean hand-off rule: standalone Higgsfield URLs directly, project-tracked outputs through existing videoclaw artifacts
- Captures the upstream source commit and MIT reuse boundary in the skill itself
When to reach for it: "use Higgsfield", "make a UGC ad in Marketing Studio", "product photoshoot", "train Soul ID", "score this video's virality".
Full guide: skills/higgsfield-generate/SKILL.md
youtube-audio
Role: YouTube → MP3 audio or MP4 video using yt-dlp + FFmpeg.
What it does: Downloads audio or video from YouTube videos and playlists. Supports trimming, resolution selection, and audio quality settings.
Key features:
- Single videos, playlists, or batch URLs
- Audio-only, video-only, or both in one pass
- Trim to clip ranges
- Requires `yt-dlp` and `ffmpeg`
When to reach for it: "download audio from this YouTube video", "grab the MP4", "extract music from a playlist".
Full guide: skills/youtube-audio/SKILL.md
ugc
Role: Belief-driven UGC campaign generator using the E5 method.
What it does: Generates a multi-video belief-targeted UGC campaign (30–60s each) from a product URL and intent. Produces N videos with subtitles plus a campaign report.
Key features:
- Belief-journey decomposition (E5 method: Examine, Educate, Emote, Evidence, Empower)
- Multi-video campaign output rather than single-clip
- Per-video subtitles and aggregate campaign report
- Triggers on "UGC campaign", "belief-driven ads", "E5 method"
When to reach for it: When the goal is a marketing campaign rather than a single creative video.
Full guide: skills/ugc/SKILL.md
🎤 Video aliases
These exist for discoverability and personal/brand handoff — they all delegate into brand-presenter with a different host profile. Treat them as compatibility surfaces; the canonical workflow lives in brand-presenter.
| Alias | Profile | Trigger |
|---|---|---|
| davendra-presenter | Davendra (asset/voice) | "davendra video", "davendra presenter" |
| nex-presenter | Nex (asset/voice) | "nex video", "nex presenter" |
| bunty | Bunty, cricket commentator (orange blazer) | "bunty thing", "match day analysis", "cricket scorecard video" |
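The delegation in the table can be pictured as a trigger lookup that always hands off to brand-presenter with a profile. This is a sketch under stated assumptions: case-insensitive substring matching and the `Delegation` shape are invented for the example, and the real trigger mechanism may work differently. The alias names and trigger phrases are taken from the table.

```typescript
// Alias names and trigger phrases from the table; the matching logic is hypothetical.
interface AliasPreset {
  alias: string;
  triggers: string[];
}

const ALIAS_PRESETS: AliasPreset[] = [
  { alias: "davendra-presenter", triggers: ["davendra video", "davendra presenter"] },
  { alias: "nex-presenter", triggers: ["nex video", "nex presenter"] },
  { alias: "bunty", triggers: ["bunty thing", "match day analysis", "cricket scorecard video"] },
];

// Every alias delegates to the same workflow; only the host profile differs.
interface Delegation {
  workflow: "brand-presenter";
  profile: string;
}

function delegateFromRequest(request: string): Delegation | undefined {
  const text = request.toLowerCase();
  const match = ALIAS_PRESETS.find((preset) =>
    preset.triggers.some((trigger) => text.includes(trigger)),
  );
  return match === undefined
    ? undefined
    : { workflow: "brand-presenter", profile: match.alias };
}
```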
⚙️ Workflow skills
Workflow skills are independent of any one production mode. They orchestrate, debug, review, or operate on top of the production layer.
Multi-agent orchestration
worker
Team-worker protocol (ACK, mailbox, task lifecycle) for tmux-based teams. Pairs with team. Full guide
pipeline
Configurable pipeline orchestrator for sequencing stages — useful when you need explicit per-stage control rather than a fully autonomous run. Full guide
studio-mode
Agent-driven video production with interview → consensus plan → user approval → credit spend. Slower-but-safer alternative to the fast one-shot vclaw video create. Triggered by "$studio" requests. Full guide
Diagnostics & exploration
doctor
Diagnose and fix oh-my-codex installation issues. Full guide
build-fix
Fix build and TypeScript errors with minimal changes. Conservative — does not refactor. Full guide
deepsearch
Thorough codebase search. Use when grep/glob isn't enough and you need an agent to actually understand the codebase as it searches. Full guide
deep-interview
Socratic deep interview with mathematical ambiguity gating before execution. Forces requirements clarity before any work begins. Full guide
Review & governance
review
Reviewer-only pass for /plan --review and cleanup artifact review. Full guide
Operational utilities
ai-slop-cleaner
Run an anti-slop cleanup/refactor/deslop workflow. Removes the kinds of half-baked artifacts that unsupervised agents leave behind. Full guide
configure-notifications
Configure OMX notifications — unified entry point for all notification platforms. Full guide
skill
Manage local skills — list, add, remove, search, edit, setup wizard. Full guide
note
Save notes to notepad.md for compaction resilience across long agent runs. Full guide
help
Guide on using VideoClaw — the in-product help surface. Full guide
web-clone
URL-driven website cloning with visual + functional verification. Full guide
Adding or modifying a skill
The skill set is curated, not auto-generated. To add or modify a skill:
- Add or edit `skills/<name>/SKILL.md`, the canonical guide for the skill
- Update `skills/catalog.json` with the skill's id, category, status, and any `specializes` / `aliasOf` / `specializations` relationships
- Add a section here in `docs/SKILLS.md` and link it from the index table
- Run `npm run check:skill-frontdoor` to verify the repo-local skill front door stays consistent
- Run `npm run check:cleanroom-docs` to verify clean-room-facing docs and skills don't reference stale paths
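Before running the repo checks, it can help to reason about what a consistent catalog entry looks like. The sketch below uses the field names listed above, but the field types (for example `specializes` as a single id) and the referential checks are assumptions, not the actual schema or the logic behind `check:skill-frontdoor`.

```typescript
// Field names come from the steps above; their types here are assumed.
interface CatalogEntry {
  id: string;
  category: string;
  status: string;
  specializes?: string;
  aliasOf?: string;
  specializations?: string[];
}

// Report duplicate ids and relationships that point at skills missing from the catalog.
function catalogProblems(entries: CatalogEntry[]): string[] {
  const problems: string[] = [];
  const ids = new Set<string>();
  for (const entry of entries) {
    if (ids.has(entry.id)) problems.push(`duplicate id: ${entry.id}`);
    ids.add(entry.id);
  }
  for (const entry of entries) {
    const targets = [entry.specializes, entry.aliasOf, ...(entry.specializations ?? [])];
    for (const target of targets) {
      if (target !== undefined && !ids.has(target)) {
        problems.push(`${entry.id} references unknown skill: ${target}`);
      }
    }
  }
  return problems;
}
```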
See also
- `README.md`: repo front door
- `docs/CLI_REFERENCE.md`: full CLI surface
- `docs/OBSIDIAN.md`: Obsidian operator workspace deep guide
- `docs/ARCHITECTURE.md`: layer map
- `skills/catalog.json`: machine-readable skill index
