video-podcast-maker
This skill automates video podcast creation by converting a user-provided topic into a complete 4K video through sequential stages: research and script generation, text-to-speech narration, Remotion video composition, and MP4 export with background music. It also supports extracting and applying visual design patterns from reference videos or images to new compositions through optional style profile management.
git clone --depth 1 https://github.com/dtsola/xiaoyaosearch /tmp/video-podcast-maker && cp -r /tmp/video-podcast-maker/.claude/skills/video-podcast-maker ~/.claude/skills/video-podcast-makerSKILL.md
> **REQUIRED: Load Remotion Best Practices First**
>
> This skill depends on `remotion-best-practices`. **You MUST invoke it before proceeding:**
> ```
> Invoke the skill/tool named: remotion-best-practices
> ```
# Video Podcast Maker
## Quick Start
Open your coding agent and say: **"Make a video podcast about $ARGUMENTS"**
Or invoke directly: `/video-podcast-maker AI Agent tutorial`
---
## Design Learning (Optional)
Extract visual design patterns from reference videos or images and apply them to new video compositions. **Skip this section unless** the user provides a reference video/image or asks to save/list/delete style profiles.
→ See **[references/design-learning.md](references/design-learning.md)** for commands, reference-library management, style-profile management, and integration with Pre-workflow / Step 9.
---
## Auto Update Check
**Agent behavior:** Check for updates at most once per day (throttled by timestamp file).
Before any shell command that reads files from this skill, resolve `SKILL_DIR` to the directory containing `SKILL.md`.
If your agent exposes a built-in skill directory variable such as `${CLAUDE_SKILL_DIR}`, you may map it to `SKILL_DIR`.
```bash
SKILL_DIR="${SKILL_DIR:-${CLAUDE_SKILL_DIR}}"
STAMP="${SKILL_DIR}/.last_update_check"
NOW=$(date +%s)
LAST=$(cat "$STAMP" 2>/dev/null || echo 0)
if [ ! -d "${SKILL_DIR}/.git" ]; then
echo "MANUAL_INSTALL"
elif [ $((NOW - LAST)) -gt 86400 ]; then
timeout 5 git -C "${SKILL_DIR}" fetch --quiet 2>/dev/null || true
LOCAL=$(git -C "${SKILL_DIR}" rev-parse HEAD 2>/dev/null)
REMOTE=$(git -C "${SKILL_DIR}" rev-parse origin/main 2>/dev/null)
echo "$NOW" > "$STAMP"
if [ -n "$LOCAL" ] && [ -n "$REMOTE" ] && [ "$LOCAL" != "$REMOTE" ]; then
echo "UPDATE_AVAILABLE"
else
echo "UP_TO_DATE"
fi
else
echo "SKIPPED_RECENT_CHECK"
fi
```
- **Update available**: Ask the user whether to pull updates. Yes → `git -C "${SKILL_DIR}" pull`. No → continue.
- **Up to date / Skipped**: Continue silently.
- **Manual install** (no `.git` directory — skill was installed via tarball/zip/cp): Continue silently. Auto-update is disabled; the user must reinstall manually to update.
---
## Prerequisites Check
!`python3 "${SKILL_DIR}/scripts/check_prereqs.py"`
**If MISSING reported above**, see README.md for full setup instructions (install commands, API key setup, Remotion project init). The check is backend-aware: backend is resolved as `TTS_BACKEND` env var → `user_prefs.json` (`global.tts.backend`) → `edge` default, then only env vars required by that backend are validated.
---
## Overview
Automated pipeline for professional **Bilibili horizontal knowledge videos** from a topic.
> **Target: Bilibili horizontal video (16:9)**
> - Resolution: 3840×2160 (4K) or 1920×1080 (1080p)
> - Style: Clean white (default)
**Tech stack:** Coding agent + TTS backend + Remotion + FFmpeg
### Output Specs
| Parameter | Horizontal (16:9) | Vertical (9:16) |
|-----------|-------------------|-----------------|
| **Resolution** | 3840×2160 (4K) | 2160×3840 (4K) |
| **Frame rate** | 30 fps | 30 fps |
| **Encoding** | H.264, 16Mbps | H.264, 16Mbps |
| **Audio** | AAC, 192kbps | AAC, 192kbps |
| **Duration** | 1-15 min | 60-90s (highlight) |
---
## Execution Modes
**Agent behavior:** Detect user intent at workflow start:
- "Make a video about..." / no special instructions → **Auto Mode**
- "I want to control each step" / mentions interactive → **Interactive Mode**
- Default: **Auto Mode**
### Auto Mode (Default)
Full pipeline with sensible defaults. **Mandatory stop at Step 9:**
1. **Step 9**: Launch Remotion Studio — user reviews in real-time, requests changes until satisfied
2. **Step 10**: Only triggered when user explicitly says "render 4K" / "render final version"
| Step | Decision | Auto Default |
|------|----------|-------------|
| 3 | Title position | top-center |
| 5 | Media assets | Skip (text-only animations) |
| 7 | Thumbnail method | Remotion-generated (16:9 + 4:3) |
| 9 | Outro animation | Pre-made MP4 (white/black by theme) |
| 9 | Preview method | Remotion Studio (mandatory) |
| 12 | Subtitles | Skip |
| 14 | Cleanup | Auto-clean temp files |
Users can override any default in their initial request:
- "make a video about AI, burn subtitles" → auto + subtitles on
- "use dark theme, AI thumbnails" → auto + dark + imagen
- "need screenshots" → auto + media collection enabled
### Interactive Mode
Prompts at each decision point. Activated by:
- "interactive mode" / "I want to choose each option"
- User explicitly requests control
---
## Technical Rules
Hard constraints for video production. Visual design remains the agent's creative freedom within these rules:
| Rule | Requirement |
|------|-------------|
| **Single Project** | All videos under `videos/{name}/` in user's Remotion project. NEVER create a new project per video. |
| **4K Output** | 3840×2160, use `scale(2)` wrapper over 1920×1080 design space |
| **Content Width** | ≥85% of screen width |
| **Bottom Safe Zone** | Bottom 100px reserved for subtitles |
| **Audio Sync** | All animations driven by `timing.json` timestamps |
| **Thumbnail** | MUST generate 16:9 (1920×1080) AND 4:3 (1200×900). Centered layout, title ≥120px, icons ≥120px, fill most of canvas. See design-guide.md. |
| **Font** | PingFang SC / Noto Sans SC for Chinese text |
| **Studio Before Render** | MUST launch `remotion studio` for user review. NEVER render 4K until user explicitly confirms ("render 4K", "render final"). |
---
## Additional Resources
Load these files on demand — **do NOT load all at once**:
- **[references/workflow-steps.md](references/workflow-steps.md)**: Index of the 15-step workflow split across three phase files. Load at workflow start to locate which phase file to pull:
- **[workflow-script.md](references/workflow-script.md)** — Pre-workflow + Startup + Steps 1-4 (scripting)
- **[workflow-production.md](references/workflow-production.md)** — Steps 5-11Simplifies and refines code for clarity, consistency, and maintainability while preserving all functionality. Focuses on recently modified code unless instructed otherwise.
Use when working with icons in any project. Provides CLI for searching 200+ icon libraries (Iconify) and retrieving SVGs. Commands: `better-icons search <query>` to find icons, `better-icons get <id>` to get SVG. Also available as MCP server for AI agents.
Expert code review of current git changes with a senior engineer lens. Detects SOLID violations, security risks, and proposes actionable improvements.
Use when user requests diagrams, flowcharts, architecture charts, or visualizations. Also use proactively when explaining systems with 3+ components, complex data flows, or relationships that benefit from visual representation. Generates .excalidraw files and exports to PNG/SVG via Kroki API or locally using excalidraw-brute-export-cli.
Create distinctive, production-grade frontend interfaces with high design quality. Use this skill when the user asks to build web components, pages, or applications. Generates creative, polished code that avoids generic AI aesthetics.
Best practices for Remotion - Video creation in React
小遥搜索 MCP 工具 - 本地文件智能搜索(语义/全文/图像/语音/混合搜索)