Skill1k repo starsupdated 2mo ago

video-podcast-maker

This skill automates video podcast creation by converting a user-provided topic into a complete 4K video through sequential stages: research and script generation, text-to-speech narration, Remotion video composition, and MP4 export with background music. It also supports extracting and applying visual design patterns from reference videos or images to new compositions through optional style profile management.

View source Repository: xiaoyaosearch

Install in Claude Code

Copy

git clone --depth 1 https://github.com/dtsola/xiaoyaosearch /tmp/video-podcast-maker && cp -r /tmp/video-podcast-maker/.claude/skills/video-podcast-maker ~/.claude/skills/video-podcast-maker

Then start a new Claude Code session; the skill loads automatically.

Definition

SKILL.md

> **REQUIRED: Load Remotion Best Practices First**
>
> This skill depends on `remotion-best-practices`. **You MUST invoke it before proceeding:**
> ```
> Invoke the skill/tool named: remotion-best-practices
> ```

# Video Podcast Maker

## Quick Start

Open your coding agent and say: **"Make a video podcast about $ARGUMENTS"**

Or invoke directly: `/video-podcast-maker AI Agent tutorial`

---

## Design Learning (Optional)

Extract visual design patterns from reference videos or images and apply them to new video compositions. **Skip this section unless** the user provides a reference video/image or asks to save/list/delete style profiles.

→ See **[references/design-learning.md](references/design-learning.md)** for commands, reference-library management, style-profile management, and integration with Pre-workflow / Step 9.

---

## Auto Update Check

**Agent behavior:** Check for updates at most once per day (throttled by timestamp file).
Before any shell command that reads files from this skill, resolve `SKILL_DIR` to the directory containing `SKILL.md`.
If your agent exposes a built-in skill directory variable such as `${CLAUDE_SKILL_DIR}`, you may map it to `SKILL_DIR`.

```bash
SKILL_DIR="${SKILL_DIR:-${CLAUDE_SKILL_DIR}}"
STAMP="${SKILL_DIR}/.last_update_check"
NOW=$(date +%s)
LAST=$(cat "$STAMP" 2>/dev/null || echo 0)
if [ ! -d "${SKILL_DIR}/.git" ]; then
  echo "MANUAL_INSTALL"
elif [ $((NOW - LAST)) -gt 86400 ]; then
  timeout 5 git -C "${SKILL_DIR}" fetch --quiet 2>/dev/null || true
  LOCAL=$(git -C "${SKILL_DIR}" rev-parse HEAD 2>/dev/null)
  REMOTE=$(git -C "${SKILL_DIR}" rev-parse origin/main 2>/dev/null)
  echo "$NOW" > "$STAMP"
  if [ -n "$LOCAL" ] && [ -n "$REMOTE" ] && [ "$LOCAL" != "$REMOTE" ]; then
    echo "UPDATE_AVAILABLE"
  else
    echo "UP_TO_DATE"
  fi
else
  echo "SKIPPED_RECENT_CHECK"
fi
```

- **Update available**: Ask the user whether to pull updates. Yes → `git -C "${SKILL_DIR}" pull`. No → continue.
- **Up to date / Skipped**: Continue silently.
- **Manual install** (no `.git` directory — skill was installed via tarball/zip/cp): Continue silently. Auto-update is disabled; the user must reinstall manually to update.

---

## Prerequisites Check

!`python3 "${SKILL_DIR}/scripts/check_prereqs.py"`

**If MISSING reported above**, see README.md for full setup instructions (install commands, API key setup, Remotion project init). The check is backend-aware: backend is resolved as `TTS_BACKEND` env var → `user_prefs.json` (`global.tts.backend`) → `edge` default, then only env vars required by that backend are validated.

---

## Overview

Automated pipeline for professional **Bilibili horizontal knowledge videos** from a topic.

> **Target: Bilibili horizontal video (16:9)**
> - Resolution: 3840×2160 (4K) or 1920×1080 (1080p)
> - Style: Clean white (default)

**Tech stack:** Coding agent + TTS backend + Remotion + FFmpeg

### Output Specs

| Parameter | Horizontal (16:9) | Vertical (9:16) |
|-----------|-------------------|-----------------|
| **Resolution** | 3840×2160 (4K) | 2160×3840 (4K) |
| **Frame rate** | 30 fps | 30 fps |
| **Encoding** | H.264, 16Mbps | H.264, 16Mbps |
| **Audio** | AAC, 192kbps | AAC, 192kbps |
| **Duration** | 1-15 min | 60-90s (highlight) |

---

## Execution Modes

**Agent behavior:** Detect user intent at workflow start:

- "Make a video about..." / no special instructions → **Auto Mode**
- "I want to control each step" / mentions interactive → **Interactive Mode**
- Default: **Auto Mode**

### Auto Mode (Default)

Full pipeline with sensible defaults. **Mandatory stop at Step 9:**

1. **Step 9**: Launch Remotion Studio — user reviews in real-time, requests changes until satisfied
2. **Step 10**: Only triggered when user explicitly says "render 4K" / "render final version"

| Step | Decision | Auto Default |
|------|----------|-------------|
| 3 | Title position | top-center |
| 5 | Media assets | Skip (text-only animations) |
| 7 | Thumbnail method | Remotion-generated (16:9 + 4:3) |
| 9 | Outro animation | Pre-made MP4 (white/black by theme) |
| 9 | Preview method | Remotion Studio (mandatory) |
| 12 | Subtitles | Skip |
| 14 | Cleanup | Auto-clean temp files |

Users can override any default in their initial request:
- "make a video about AI, burn subtitles" → auto + subtitles on
- "use dark theme, AI thumbnails" → auto + dark + imagen
- "need screenshots" → auto + media collection enabled

### Interactive Mode

Prompts at each decision point. Activated by:
- "interactive mode" / "I want to choose each option"
- User explicitly requests control

---

## Technical Rules

Hard constraints for video production. Visual design remains the agent's creative freedom within these rules:

| Rule | Requirement |
|------|-------------|
| **Single Project** | All videos under `videos/{name}/` in user's Remotion project. NEVER create a new project per video. |
| **4K Output** | 3840×2160, use `scale(2)` wrapper over 1920×1080 design space |
| **Content Width** | ≥85% of screen width |
| **Bottom Safe Zone** | Bottom 100px reserved for subtitles |
| **Audio Sync** | All animations driven by `timing.json` timestamps |
| **Thumbnail** | MUST generate 16:9 (1920×1080) AND 4:3 (1200×900). Centered layout, title ≥120px, icons ≥120px, fill most of canvas. See design-guide.md. |
| **Font** | PingFang SC / Noto Sans SC for Chinese text |
| **Studio Before Render** | MUST launch `remotion studio` for user review. NEVER render 4K until user explicitly confirms ("render 4K", "render final"). |

---

## Additional Resources

Load these files on demand — **do NOT load all at once**:

- **[references/workflow-steps.md](references/workflow-steps.md)**: Index of the 15-step workflow split across three phase files. Load at workflow start to locate which phase file to pull:
  - **[workflow-script.md](references/workflow-script.md)** — Pre-workflow + Startup + Steps 1-4 (scripting)
  - **[workflow-production.md](references/workflow-production.md)** — Steps 5-11