ClaudeWave
Skill · 1.1k repo stars · updated today

oma-video

oma-video generates finished MP4 videos across three modes: short-form vertical content (9:16), horizontal explainers from documentation or code (16:9), and demo walkthroughs via screen capture or supervised web-app recording. It composes scripts, narration, visuals, captions, and Remotion rendering into reproducible directories with deterministic asset pipelines. Use when converting topics, READMEs, or data into narrated video, capturing product demos or onboarding flows, or creating shorts and reels with customizable voice, music, and visual styles.

Install in Claude Code
git clone --depth 1 https://github.com/first-fluke/oh-my-agent /tmp/oma-video && cp -r /tmp/oma-video/.agents/skills/oma-video ~/.claude/skills/oma-video
Then start a new Claude Code session; the skill loads automatically.

SKILL.md

# Video Agent - Short-form, Explainer & Demo Router

## Scheduling

### Goal
Generate finished `.mp4` videos through a key-optional, 3-tier (CLI-first / MCP / guided) provider router while preserving deterministic asset buses (script -> timing -> render-spec), reproducible manifests, cost controls, and capture-path safety.

### Intent signature
- User asks for a short-form video, shorts/reels clip, TikTok/YouTube Short, explainer, demo, walkthrough, or screencast.
- User wants a topic, README, code, or data turned into a narrated, captioned video.
- Another skill needs shared video-generation infrastructure (script -> assets -> render).

### When to use

- Generating short-form video (shorts / reels) from a topic or brief (`--mode shorts`, 9:16)
- Generating an explainer from a README, code, or data set (`--mode explainer`, 16:9 / 9:16)
- Producing a demo / walkthrough from a screen capture file (`--mode demo --source file`, 16:9)
- Supervised headed web-app capture of any URL (`--mode demo --source web --url <url>`): a human drives the on-screen flow; the tool only opens a headed browser and records. Typical uses, listed as illustrative examples in no priority order: demo, walkthrough, onboarding clip, bug repro, app-review screencast.
- Re-rendering an existing run deterministically from `render-spec.json`
- Other skills needing video-generation infrastructure (shared invocation via `--format json`)

### When NOT to use

- Generating a single still image -> use `oma-image`
- Generating a slide deck / presentation -> use `oma-slide` (this skill calls it internally for explainer frames)
- Generating speech audio only (no video) -> use `oma-voice`
- Non-linear video editing of an existing finished mp4 -> out of scope (OpenCut-MCP deferred)
- Live streaming -> out of scope (supervised headed web capture via `--source web` is in scope)

### Expected inputs
- A brief (topic / README path / data) plus optional mode, aspect, locale, captions, visual, voice, music, duration, compositor, capture path, seed
- For `demo` `--source file`: a screen-capture file path (`--capture`) or Cap availability
- For `demo` `--source web`: a target `--url` (any URL: local, staging, or prod) and optional `--device`/`--ready-selector`/`--show-cursor`/`--polish`/`--capture-timeout`. Capture size is derived from `--aspect`/`--device` (no hardcoded size). Requires a resolvable Playwright and an interactive TTY; otherwise the run falls back to the guided protocol.
- Authentication/environment state for oma-voice (Voicebox MCP), oma-image vendors, and optional Pexels / Pixelle keys

### Expected outputs
- A run directory under `.agents/results/videos/<timestamp>-<shortid>-<mode>/`
- Deterministic asset bus: `script.json`, `timing.json`, `render-spec.json`
- `audio/`, `visuals/`, `captions.srt` / `captions.vtt`, the rendered `<mode>-<slug>.mp4`
- `manifest.json` with providers, asset hashes, cost breakdown, and exit code
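
The asset hashes recorded in `manifest.json` can be illustrated with a minimal Python sketch. The field names (`assets`, `exit_code`) and the helper functions are hypothetical and chosen only for illustration; the skill's real manifest schema is not shown in this document.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(run_dir: Path, asset_names: list) -> dict:
    """Hash every listed asset that exists in run_dir.

    The keys "assets" and "exit_code" are assumptions, not the real schema.
    """
    hashes = {
        name: sha256_of(run_dir / name)
        for name in asset_names
        if (run_dir / name).is_file()
    }
    return {"assets": hashes, "exit_code": 0}
```

Hashing the asset-bus files this way lets a later re-render check that `script.json`, `timing.json`, and `render-spec.json` are byte-identical to the ones the original run produced.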

### Dependencies
- `oma video generate` CLI + central error module (exit codes aligned with `oma search fetch`)
- oma-voice (Voicebox MCP), oma-image, oma-slide as key-free fallback providers
- Vendored Remotion project at `resources/remotion/` (compositor)
- `resources/vendor-matrix.md`, `resources/execution-protocol.md`, `resources/prompt-tips.md`, `config/video-config.yaml`

### Control-flow features
- Branches by mode (shorts / explainer / demo), aspect, visual strategy, provider availability, cost threshold, capture requirement, and path safety
- Runs a per-capability fallback chain (real key/resource -> key-free fallback) per backend rule 11
- Reads briefs/captures and writes assets, render-spec, and manifests
- Calls external resources: Voicebox MCP, oma-image vendors, Remotion toolchain, optional Pexels / Pixelle

## Structural Flow

### Entry
1. Validate that the brief carries enough mode/topic signal (or infer the mode from keywords).
2. For `demo`, confirm a capture path exists (or Cap is available); otherwise enter the guided protocol.
3. Resolve defaults from `config/video-config.yaml` -> env vars -> CLI flags; check output path safety and limits.
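
Step 3 can be read as a layered lookup. The sketch below assumes that each later source overrides the earlier one (config file, then env vars, then CLI flags) and that unset values are represented as `None`; both are assumptions, and `resolve_options` is a hypothetical helper rather than part of the `oma` CLI.

```python
def resolve_options(config: dict, env: dict, cli: dict) -> dict:
    """Merge option layers; later layers win, None means "not set"."""
    resolved = dict(config)
    for layer in (env, cli):
        resolved.update(
            {key: value for key, value in layer.items() if value is not None}
        )
    return resolved


options = resolve_options(
    config={"mode": "shorts", "aspect": "9:16"},  # as if read from the YAML file
    env={"aspect": "16:9"},                       # as if read from env vars
    cli={"mode": "explainer", "aspect": None},    # flag given / flag omitted
)
```

Here `options` ends up with `mode` taken from the CLI layer and `aspect` from the env layer, because the omitted CLI flag does not overwrite it.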

### Scenes
1. **PREPARE**: Resolve mode/aspect/locale, clarify or amplify the brief, choose the visual + compositor strategy.
2. **ACQUIRE**: Probe provider availability (voice / visual / caption / compositor), validate capture path, check cost.
3. **ACT**: Run the mode pipeline — script -> (voice ∥ visuals ∥ captions) -> render-spec -> compositor render.
4. **VERIFY**: Validate every asset-bus schema, manifest hashes, exit code, and the output mp4.
5. **FINALIZE**: Return the run-dir path, the mp4 path, and any provider/coverage warnings.
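
The ACT scene can be sketched as a fan-out/fan-in step. This is a minimal illustration that reads `∥` as "run concurrently", which is an assumption; the stage callables and the shape of the returned spec are hypothetical stand-ins, not the skill's actual pipeline code.

```python
from concurrent.futures import ThreadPoolExecutor


def run_act(script: dict, stages: dict) -> dict:
    """Run independent asset stages concurrently, then collect their results.

    `stages` maps a stage name to a callable that takes the script.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(stages))) as pool:
        futures = {name: pool.submit(stage, script) for name, stage in stages.items()}
        # result() re-raises any exception a stage raised.
        assets = {name: future.result() for name, future in futures.items()}
    return {"script": script, "assets": assets}


spec = run_act(
    {"slug": "intro"},
    {
        "voice": lambda s: f"audio/{s['slug']}.wav",
        "visuals": lambda s: f"visuals/{s['slug']}.png",
        "captions": lambda s: "captions.srt",
    },
)
```

Because each stage depends only on the script, the order in which stages finish does not affect the collected result.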

### Transitions
- If the brief lacks a clear mode, infer from keywords (shorts/reels -> shorts; README/code -> explainer; capture -> demo) and show the user the inferred plan before generating.
- If the selected visual provider key is absent (Pexels / Pixelle), fall through the chain to the key-free oma-image stills + Ken Burns fallback and annotate coverage.
- If `demo` `--source web` has a `--url`, dispatch the headed web-capture path (human-driven flow, ENTER to stop); if Playwright is unresolvable OR there is no interactive TTY, fall back to the guided protocol (no hang).
- If `demo` `--source file` has no capture and Cap is unavailable, emit the guided capture protocol and stop (exit code maps to capture-required).
- If estimated cost (Pixelle / RunningHub credits) exceeds the guardrail, require confirmation unless bypassed.
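
The keyword inference in the first transition can be sketched as a small lookup. The keyword table mirrors the mapping listed above; the whole-word matching, the rule order, and the `infer_mode` name are assumptions made for illustration.

```python
import re
from typing import Optional

# Checked in order; the first rule with a matching keyword wins.
MODE_KEYWORDS = [
    ("shorts", {"shorts", "reels"}),
    ("explainer", {"readme", "code"}),
    ("demo", {"capture"}),
]


def infer_mode(brief: str) -> Optional[str]:
    """Return the inferred mode, or None when no keyword matches."""
    words = set(re.findall(r"[a-z]+", brief.lower()))
    for mode, keywords in MODE_KEYWORDS:
        if words & keywords:
            return mode
    return None
```

A `None` result corresponds to the case where the brief carries no mode signal and the user has to be asked; in every other case the inferred plan is shown to the user before generating.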

### Failure and recovery
- If a provider is unavailable, try the next provider in the capability's `order`; only chain exhaustion is a stage failure.
- If the Remotion toolchain is not bootstrapped, point the user to `oma video doctor`; fall back to the MPT compositor where applicable.
- If Voicebox MCP is down, fall back through voicebox-stt -> whisper.cpp -> estimated timing (still produces captions).
- If the brief locale is non-source, translate via oma-translator (key-free); if oma-translator is absent, warn and keep source text.
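
The recovery rule "try the next provider in `order`; only chain exhaustion is a stage failure" can be sketched as follows. The exception classes and `run_chain` are hypothetical names used for illustration; the provider callables stand in for real backends.

```python
from typing import Callable, Sequence, Tuple


class ProviderUnavailable(RuntimeError):
    """Raised by a provider that cannot serve the request."""


class ChainExhausted(RuntimeError):
    """Raised only when every provider in the chain was unavailable."""


def run_chain(order: Sequence[Tuple[str, Callable[[], str]]]) -> Tuple[str, str]:
    """Return (provider_name, result) from the first available provider."""
    skipped = []
    for name, provider in order:
        try:
            return name, provider()
        except ProviderUnavailable as exc:
            skipped.append(f"{name}: {exc}")  # remember why it was skipped
    raise ChainExhausted("; ".join(skipped))


def down() -> str:
    raise ProviderUnavailable("not reachable")


used, result = run_chain(
    [
        ("voicebox-stt", down),
        ("whisper.cpp", down),
        ("estimated-timing", lambda: "timing.json"),
    ]
)
```

In this example the first two providers are skipped and the estimated-timing fallback supplies the result, so the stage still succeeds.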
