Brand DNA & Brand Extraction

Point videoclaw at any company's website and it reads that site like a brand strategist, pulling the real colors, fonts, voice, and messaging into a single brand-dna.json file that can then seed a whole video brief automatically. You go from a URL to a brand-aware video plan in two commands.

What it does

Reads a live website. Fetches the page (plain fetch, no headless browser, no extra dependencies) and scrapes the title, meta description, body text, logo images, color values, and font families.
Extracts the real color palette deterministically. It harvests every #hex and rgb() color on the page, throws out white/black/grey neutrals, ranks the rest by frequency, and keeps the top 3 as primary and the next 5 as secondary. Same site in, same palette out — every time.
Understands the brand's voice. One strict-JSON pass through Google Gemini turns the page text into structured fields: brand name, industry, tagline, value proposition, tone of voice, brand personality, target audience, key messages, plus an imageryStyle and layoutStyle chosen from a fixed list.
Captures the logo. Picks the best logo candidate (a logo-tagged image, falling back to the og:image, then the favicon).
Writes one machine-readable artifact — brand-dna.json — that downstream stages can consume.
Seeds your brief. video brief --from-brand-dna reads that artifact and auto-fills the brief's title and intent (from the brand name + value proposition) and parks the full brand profile under metadata.brandDna.
Stays safe. It records logo URL, colors, and text only. Photoreal faces (founder headshots, people in og:image) are never registered or passed downstream — they trip the real-person content filter and don't help anyway.
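To make the deterministic palette step concrete, here is a minimal Python sketch of the harvest, filter, rank, and slice sequence described above. It is not videoclaw's actual code: the function names, the regular expressions, the neutral test (all three channels within a small tolerance of each other), and the tie-break on the hex string are assumptions made for illustration.

```python
import re
from collections import Counter

# Assumed patterns: 3- or 6-digit hex values and plain rgb(r, g, b) triples.
HEX_RE = re.compile(r"#([0-9a-fA-F]{6}|[0-9a-fA-F]{3})\b")
RGB_RE = re.compile(r"rgb\(\s*(\d{1,3})\s*,\s*(\d{1,3})\s*,\s*(\d{1,3})\s*\)")


def normalize_hex(digits: str) -> str:
    """Lowercase the digits and expand shorthand (abc -> #aabbcc)."""
    digits = digits.lower()
    if len(digits) == 3:
        digits = "".join(ch * 2 for ch in digits)
    return "#" + digits


def is_neutral(hex_color: str, tolerance: int = 8) -> bool:
    """Treat white, black, and greys as neutral: channels nearly equal (assumed rule)."""
    r, g, b = (int(hex_color[i:i + 2], 16) for i in (1, 3, 5))
    return max(r, g, b) - min(r, g, b) <= tolerance


def extract_palette(source: str) -> tuple[list[str], list[str]]:
    """Return (primary, secondary): the top 3 colors, then the next 5."""
    colors = [normalize_hex(m) for m in HEX_RE.findall(source)]
    colors += [
        "#%02x%02x%02x" % tuple(min(int(c), 255) for c in m)
        for m in RGB_RE.findall(source)
    ]
    counts = Counter(c for c in colors if not is_neutral(c))
    # Sort by frequency (descending), then by hex string so ties are stable.
    ranked = sorted(counts, key=lambda c: (-counts[c], c))
    return ranked[:3], ranked[3:8]
```

Because the ranking depends only on the page source and a fixed sort order, the same input always yields the same palette, which is the property the real step relies on.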

Why it matters: this is the only AI step in the whole brief → storyboard → story-bible chain. It is quarantined in its own stage with its own artifact, so everything after it stays deterministic and reproducible.

How to use it

All commands use node dist/cli/vclaw.js video .... If you installed videoclaw globally, you can type vclaw video ... instead.

bash

node dist/cli/vclaw.js video init acme-launch

Creates the project workspace at projects/acme-launch/.

bash

export GEMINI_API_KEYS="key1,key2"   # or GOOGLE_API_KEY=...
node dist/cli/vclaw.js video brand-extract --project acme-launch --url https://www.acme.com

Scrapes the site, ranks the palette, runs the Gemini voice pass, and writes projects/acme-launch/artifacts/brand-dna.json. Prints the full artifact as JSON on stdout.

bash

node dist/cli/vclaw.js video brand-extract --project acme-launch --url https://www.acme.com --gemini-endpoint https://my-proxy/v1beta/models/gemini-2.0-flash:generateContent

Same, but routes the AI call through a custom endpoint (also overridable via VCLAW_GEMINI_API_ENDPOINT).

bash

node dist/cli/vclaw.js video brief --project acme-launch --from-brand-dna

Builds the brief straight from the extracted brand DNA — no --title or --intent needed. The brand name becomes the title and the value proposition becomes the intent.

bash

node dist/cli/vclaw.js video brief --project acme-launch --from-brand-dna --title "Acme Spring Launch"

Same, but you override one field by hand. Explicit flags always win; brand DNA only supplies the fallbacks you leave out.
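The "explicit flags win, brand DNA fills the gaps" rule can be sketched as a small resolver. This is a hedged illustration in Python, not the CLI's implementation: resolve_brief_fields is a hypothetical name, and the returned dictionary only models the three things this page mentions (title, intent, metadata.brandDna), not the full brief.

```python
from typing import Optional


def resolve_brief_fields(
    brand_dna: dict,
    title: Optional[str] = None,
    intent: Optional[str] = None,
) -> dict:
    """Prefer explicit --title / --intent values; fall back to brand DNA."""
    return {
        # Brand name is the fallback title.
        "title": title if title is not None else brand_dna["brandName"],
        # Value proposition is the fallback intent.
        "intent": intent if intent is not None else brand_dna["valueProposition"],
        # The full profile rides along for later stages.
        "metadata": {"brandDna": brand_dna},
    }
```

Calling it with only brand_dna mirrors the bare --from-brand-dna run; passing title="Acme Spring Launch" mirrors the override example, where only the title changes and the intent still comes from the value proposition.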

How it flows

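Flow: website URL → video brand-extract (scrape, palette ranking, Gemini voice pass) → brand-dna.json → video brief --from-brand-dna → brief with metadata.brandDna → later stages (story bible, storyboard).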

Artifacts & outputs

video brand-extract writes one artifact:

projects/<slug>/artifacts/brand-dna.json — schema: schemas/video/artifacts/brand-dna.schema.json.

Shape (schemaVersion: 1, kind: "brand-dna"):

brandName, industry, tagline, valueProposition, targetAudience — strings
toneOfVoice, brandPersonality, keyMessages — string arrays (3-5 entries each)
primaryColors (top 3), secondaryColors (next 5), fonts — string arrays
logoUrl — string or null
imageryStyle — one of professional | casual | illustrated | cinematic | minimalist | editorial
layoutStyle — one of modern | classic | minimalist | bold | editorial
sourceUrl, createdAt, projectSlug — provenance
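If you consume brand-dna.json from your own tooling, a quick sanity check against the shape listed above can catch a malformed artifact early. The following Python sketch is an assumption-laden illustration: check_brand_dna is a hypothetical helper, the SAMPLE values are invented placeholders, the createdAt format is a guess, and the authoritative definition remains schemas/video/artifacts/brand-dna.schema.json.

```python
IMAGERY_STYLES = {"professional", "casual", "illustrated", "cinematic", "minimalist", "editorial"}
LAYOUT_STYLES = {"modern", "classic", "minimalist", "bold", "editorial"}
STRING_FIELDS = ("brandName", "industry", "tagline", "valueProposition", "targetAudience")
VOICE_LISTS = ("toneOfVoice", "brandPersonality", "keyMessages")  # 3-5 entries each

# Placeholder artifact for illustration only; every value here is made up.
SAMPLE = {
    "schemaVersion": 1,
    "kind": "brand-dna",
    "brandName": "Acme",
    "industry": "Industrial supplies",
    "tagline": "Built to last",
    "valueProposition": "Reliable tools, delivered fast",
    "targetAudience": "Small workshops",
    "toneOfVoice": ["confident", "plain-spoken", "warm"],
    "brandPersonality": ["dependable", "practical", "friendly"],
    "keyMessages": ["Ships fast", "Lasts long", "Fair prices"],
    "primaryColors": ["#ff6600", "#0055aa"],
    "secondaryColors": [],
    "fonts": ["Inter"],
    "logoUrl": None,
    "imageryStyle": "professional",
    "layoutStyle": "modern",
    "sourceUrl": "https://www.acme.com",
    "createdAt": "2025-01-01T00:00:00Z",  # assumed ISO 8601 format
    "projectSlug": "acme-launch",
}


def check_brand_dna(doc: dict) -> list:
    """Return a list of human-readable problems; an empty list means the shape looks right."""
    problems = []
    if doc.get("schemaVersion") != 1:
        problems.append("schemaVersion must be 1")
    if doc.get("kind") != "brand-dna":
        problems.append('kind must be "brand-dna"')
    for field in STRING_FIELDS:
        if not isinstance(doc.get(field), str):
            problems.append(f"{field} must be a string")
    for field in VOICE_LISTS:
        value = doc.get(field)
        if not isinstance(value, list) or not 3 <= len(value) <= 5:
            problems.append(f"{field} must be a list of 3-5 entries")
    if len(doc.get("primaryColors", [])) > 3:
        problems.append("primaryColors holds at most 3 colors")
    if len(doc.get("secondaryColors", [])) > 5:
        problems.append("secondaryColors holds at most 5 colors")
    if not (doc.get("logoUrl") is None or isinstance(doc.get("logoUrl"), str)):
        problems.append("logoUrl must be a string or null")
    if doc.get("imageryStyle") not in IMAGERY_STYLES:
        problems.append("imageryStyle is not one of the allowed values")
    if doc.get("layoutStyle") not in LAYOUT_STYLES:
        problems.append("layoutStyle is not one of the allowed values")
    return problems
```

For strict validation, prefer running a JSON Schema validator against the published schema file; this sketch only mirrors the fields named on this page.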

When you run video brief --from-brand-dna, the full brand profile is copied into the brief's metadata.brandDna so later stages (story bible, storyboard) can read it.

Tips & gotchas

Deterministic palette, AI voice

The colors and fonts are computed by code, not guessed by the model — so the palette is stable across runs. Only the voice/messaging fields come from Gemini (low temperature 0.2), and the deterministic palette is merged over any color the model might have hallucinated.
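The "merged over" behavior amounts to letting the code-computed fields overwrite whatever the model returned for the same keys. A minimal Python sketch, assuming the model output arrives as a dictionary and using a hypothetical merge_palette helper:

```python
def merge_palette(model_fields: dict, primary: list, secondary: list, fonts: list) -> dict:
    """Overlay the deterministic palette and fonts on the model's voice fields."""
    merged = dict(model_fields)  # copy so the model output is left untouched
    merged.update(
        {
            "primaryColors": primary,
            "secondaryColors": secondary,
            "fonts": fonts,
        }
    )
    return merged
```

Any color the model invented under those keys is simply replaced, while voice and messaging fields pass through unchanged.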

Gemini keys are required

brand-extract needs a Gemini API key. Set GEMINI_API_KEYS (comma-separated for round-robin) or GOOGLE_API_KEY. Without it the run fails with "No Gemini API keys configured". See the Gemini key pool.
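As a rough picture of how a comma-separated key list can feed a round-robin pool, consider the Python sketch below. It is an illustration only: load_gemini_keys and key_cycle are hypothetical names, the precedence of GEMINI_API_KEYS over GOOGLE_API_KEY is an assumption, and the real key pool may rotate differently.

```python
import itertools
import os


def load_gemini_keys(env=None) -> list:
    """Read keys from GEMINI_API_KEYS (comma-separated) or GOOGLE_API_KEY."""
    env = os.environ if env is None else env
    # Assumed precedence: the multi-key variable is checked first.
    raw = env.get("GEMINI_API_KEYS") or env.get("GOOGLE_API_KEY") or ""
    keys = [part.strip() for part in raw.split(",") if part.strip()]
    if not keys:
        raise RuntimeError("No Gemini API keys configured")
    return keys


def key_cycle(keys: list):
    """Yield keys in order forever, wrapping around after the last one."""
    return itertools.cycle(keys)
```

With GEMINI_API_KEYS="key1,key2", successive calls to next() on the cycle hand out key1, key2, key1, and so on.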

URL must be http(s) and reachable

The URL is validated up front (only http: and https: are accepted), and the scrape fetch has a hard 20-second timeout so an unattended or overnight run can't hang. A non-200 response fails the command.
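The same guard rails are easy to reproduce if you pre-check URLs before handing them to brand-extract. This Python sketch uses only the standard library; validate_url and fetch_page are hypothetical helpers, and note one difference from the behavior described here: urllib's timeout applies to each blocking socket operation rather than to the request as a whole.

```python
import urllib.request
from urllib.parse import urlparse

SCRAPE_TIMEOUT_SECONDS = 20


def validate_url(url: str) -> str:
    """Accept only absolute http(s) URLs; raise ValueError otherwise."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"URL must be http(s) with a host: {url!r}")
    return url


def fetch_page(url: str, opener=urllib.request.urlopen) -> str:
    """Fetch the page text, failing on a non-200 status."""
    validate_url(url)
    with opener(url, timeout=SCRAPE_TIMEOUT_SECONDS) as response:
        if response.status != 200:
            raise RuntimeError(f"unexpected HTTP status {response.status}")
        return response.read().decode("utf-8", errors="replace")
```

The opener parameter exists so the function can be exercised without network access; in normal use the default urlopen is enough.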

Run extract before brief

video brief --from-brand-dna errors if no brand-dna.json exists yet. Run brand-extract for that project first.

No faces leak downstream

By design, founder headshots and people in the page's og:image are never carried forward as reference images — they break the ARK/Seedance real-person filter. Describe characters by visual descriptor instead (see characters).

Driving it from an agent

An AI agent (e.g. Claude Code) runs node dist/cli/vclaw.js video brand-extract --project <slug> --url <site>, parses the JSON printed on stdout, then chains video brief --project <slug> --from-brand-dna to start the pipeline. Both commands exit non-zero with a structured missing_required_flag error (listing the missing flags) when --project/--url are absent or when the brand-dna artifact is missing for --from-brand-dna, so the agent can branch on exit code and re-run with corrected arguments.
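An agent-side driver for the two-step chain might look like the Python sketch below. Treat it as an outline under assumptions: run_vclaw and extract_then_brief are hypothetical names, the sketch assumes it runs from the repository root so dist/cli/vclaw.js resolves, and because this page does not say whether the structured error lands on stdout or stderr, both are passed back for the caller to inspect.

```python
import json
import subprocess

CLI = ["node", "dist/cli/vclaw.js", "video"]


def run_vclaw(args: list, runner=subprocess.run):
    """Run one video subcommand and return (exit code, stdout, stderr)."""
    result = runner(CLI + args, capture_output=True, text=True)
    return result.returncode, result.stdout, result.stderr


def extract_then_brief(slug: str, url: str, runner=subprocess.run) -> dict:
    """Chain brand-extract and brief, stopping at the first non-zero exit."""
    code, out, err = run_vclaw(["brand-extract", "--project", slug, "--url", url], runner)
    if code != 0:
        return {"ok": False, "stage": "brand-extract", "exitCode": code, "detail": err or out}
    brand_dna = json.loads(out)  # brand-extract prints the artifact as JSON on stdout

    code, out, err = run_vclaw(["brief", "--project", slug, "--from-brand-dna"], runner)
    if code != 0:
        return {"ok": False, "stage": "brief", "exitCode": code, "detail": err or out}
    return {"ok": True, "brandDna": brand_dna}
```

Returning the failing stage and exit code lets the agent decide whether to fix its arguments and retry or to surface the error.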
