
minutes-video-review

The minutes-video-review skill processes recorded product walkthroughs, bug reports, and screencasts from Loom or ScreenPal into actionable artifact bundles containing transcripts, sampled key frames, and visual analysis. Use this skill when a user provides a recorded demo or bug reproduction video that needs to be converted into durable documentation with extracted audio, transcription, frame sampling, and a structured brief for engineering or product team follow-up work.

Install in Claude Code
git clone --depth 1 https://github.com/silverstein/minutes /tmp/minutes-video-review && cp -r /tmp/minutes-video-review/.opencode/skills/minutes-video-review ~/.claude/skills/minutes-video-review
Then open a new Claude Code session; the skill loads automatically.

SKILL.md

## Skill Path

Before running helper scripts or opening bundled references, set:

```bash
export MINUTES_SKILLS_ROOT="$(git rev-parse --show-toplevel)/.opencode/skills"
export MINUTES_SKILL_ROOT="$MINUTES_SKILLS_ROOT/minutes-video-review"
```

# /minutes-video-review

Analyze a product walkthrough, bug report video, Loom, ScreenPal, or local recording into a durable artifact bundle that agents can keep working from.

This skill is for **meeting-adjacent product artifacts**, not for generic "understand any video" requests. Use it when the user wants a recorded demo, bug repro, or walkthrough turned into something actionable for engineering, product, support, or follow-up agent work.

## What this skill does

The bundled script handles the deterministic pipeline:

- resolve a local file or hosted video URL
- download hosted video when needed
- extract audio with `ffmpeg`
- transcribe, preferring hosted captions when the source exposes them and otherwise Minutes with the user's existing Minutes transcription setup
- sample key frames with adaptive caps so long videos do not blow up context
- write a durable artifact bundle under `~/.minutes/video-reviews/`

Then **you** review the resulting artifacts and return the actual user-facing brief.
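As a rough illustration of the first pipeline steps, the sketch below classifies a source as local file or hosted URL and builds an `ffmpeg` command for audio extraction. The function names `classify_source` and `ffmpeg_audio_command` are hypothetical, and the mono 16 kHz WAV target is an assumption (a common input format for speech models); the bundled script may use different flags.

```python
from pathlib import Path
from urllib.parse import urlparse


def classify_source(source: str) -> str:
    """Return "url" for http(s) links and "file" for everything else."""
    scheme = urlparse(source).scheme.lower()
    return "url" if scheme in ("http", "https") else "file"


def ffmpeg_audio_command(video: Path, audio_out: Path) -> list[str]:
    """Build an ffmpeg argv that drops video and writes mono 16 kHz audio.

    The channel count and sample rate are assumptions, not values taken
    from the bundled script.
    """
    return [
        "ffmpeg",
        "-y",               # overwrite the output file if it exists
        "-i", str(video),   # input video
        "-vn",              # discard the video stream
        "-ac", "1",         # downmix to mono
        "-ar", "16000",     # resample to 16 kHz
        str(audio_out),
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)`; building the argv as a list avoids shell quoting problems with paths that contain spaces.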

## Primary command

Local file:

```bash
python3 "$MINUTES_SKILL_ROOT/scripts/video_review.py" \
  "/absolute/path/to/video.mp4"
```

Hosted video:

```bash
python3 "$MINUTES_SKILL_ROOT/scripts/video_review.py" \
  "https://go.screenpal.com/watch/..."
```

Useful options:

```bash
python3 "$MINUTES_SKILL_ROOT/scripts/video_review.py" \
  "https://www.loom.com/share/..." \
  --focus "customer signup bug repro" \
  --cookies-from-browser chrome \
  --env-file /absolute/path/to/.env \
  --frame-step 15 \
  --max-frames 36 \
  --keep-temp
```

## How to use it

### Phase 1: Run the pipeline

Run the script on the provided local file or hosted video URL.

The script prints JSON with the output artifact paths. Important outputs include:

- `analysis_md`
- `analysis_json`
- `transcript_md`
- `metadata_json`
- `frames_dir`
- `contact_sheet_artifact`
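A minimal sketch of consuming that JSON from Python follows. It assumes the script prints a single JSON object on stdout whose values for the keys above are path strings; that shape is not confirmed here, and `parse_pipeline_output` is a hypothetical helper name.

```python
import json

# Output keys listed in this document.
EXPECTED_KEYS = (
    "analysis_md",
    "analysis_json",
    "transcript_md",
    "metadata_json",
    "frames_dir",
    "contact_sheet_artifact",
)


def parse_pipeline_output(stdout: str) -> dict[str, str]:
    """Parse the script's stdout and return the artifact paths it reports.

    Assumes a single JSON object; keys that are absent or empty are skipped.
    """
    payload = json.loads(stdout)
    if not isinstance(payload, dict):
        raise ValueError("expected a JSON object on stdout")
    return {key: str(payload[key]) for key in EXPECTED_KEYS if payload.get(key)}
```

Skipping absent keys rather than failing keeps the helper usable if some artifacts, such as the contact sheet, are not produced on every run.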

### Phase 2: Inspect the artifacts

Read the generated `analysis.md` and `analysis.json` first.

Then inspect:

- `transcript.md` for the actual spoken content
- selected images from `frames/` when visual state matters
- `contact-sheet.jpg` for a quick visual sweep across sampled frames
- `metadata.json` for transcript method, duration, source kind, and frame sampling details

### Phase 3: Produce the real brief

Return a concise, useful brief to the user that includes:

- what the video is trying to show
- likely bug / proposal / walkthrough intent
- key moments or timestamps
- likely impacted area or flow
- the clearest next actions

Do not just echo the generated markdown blindly. Use the artifacts as evidence and produce a thoughtful agent answer.

## Minutes-first transcription rules

This skill should prefer transcript backends in this order:

1. hosted captions / VTT when the source exposes them
2. `minutes process` with an isolated temporary config
3. local `whisper` CLI if available
4. OpenAI audio transcription only as a last resort when configured

Important:

- the Minutes path should use the user's current Minutes transcription setup
- if Minutes is configured for Whisper, use Whisper
- if Minutes is configured for Parakeet, use Parakeet
- do not silently fork a separate transcription stack unless the Minutes path is unavailable

When reporting the artifacts back to the user, preserve the transcript method exactly. Prefer labels like:

- `vtt_captions`
- `minutes-whisper`
- `minutes-parakeet`
- `minutes-whisper-fallback`
- `local_whisper_cli`
- `openai_audio_transcription`
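The preference order above can be sketched as a small selection function. The function name `pick_transcript_method` and the mapping from each backend to a label are assumptions for illustration. The sketch does not model `minutes-whisper-fallback`, because this document does not say when that label applies.

```python
from typing import Optional


def pick_transcript_method(
    has_captions: bool,
    minutes_engine: Optional[str],
    has_whisper_cli: bool,
    has_openai_key: bool,
) -> Optional[str]:
    """Return the label of the first usable backend, or None if there is none.

    minutes_engine is "whisper" or "parakeet" when Minutes is available and
    configured for that engine, and None when the Minutes path is unavailable.
    """
    if has_captions:
        return "vtt_captions"
    if minutes_engine in ("whisper", "parakeet"):
        return f"minutes-{minutes_engine}"
    if has_whisper_cli:
        return "local_whisper_cli"
    if has_openai_key:
        return "openai_audio_transcription"
    return None
```

For example, `pick_transcript_method(False, "parakeet", True, True)` returns `"minutes-parakeet"`: the Minutes path wins over the local Whisper CLI even when both are available.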

## Context discipline

This skill must stay disciplined about context size.

- Do not send the full video itself to the reasoning layer.
- Do not dump a long transcript and dozens of frames into the final answer.
- Treat the transcript as the backbone and frames as supporting evidence.
- Prefer inspecting a curated subset of frames instead of every sampled image.

The bundled script already caps frames adaptively, but you should still exercise judgment when deciding what to read or mention.
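This document does not describe how the script caps frames, so the following is only one plausible approach: sample at a fixed step, and widen the step when that would exceed the cap. The parameter names mirror `--frame-step` and `--max-frames`; treating the step as seconds is an assumption, and `sample_timestamps` is a hypothetical name.

```python
import math


def sample_timestamps(
    duration_s: float, frame_step_s: float, max_frames: int
) -> list[float]:
    """Return timestamps in seconds to sample, never more than max_frames.

    Uses a fixed step first; if that would exceed the cap, the step is
    widened so the samples still span the whole video.
    """
    if duration_s <= 0 or frame_step_s <= 0 or max_frames <= 0:
        return []
    count = math.ceil(duration_s / frame_step_s)
    if count > max_frames:
        count = max_frames
        frame_step_s = duration_s / max_frames
    return [round(i * frame_step_s, 3) for i in range(count)]
```

With a 15-second step and a cap of 36, a one-minute clip yields four samples, while a one-hour recording is spread over 36 samples about 100 seconds apart, so the context cost stays bounded regardless of length.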

## Output contract

The script writes a durable bundle under:

```bash
~/.minutes/video-reviews/<timestamp>-<slug>/
```

Expected files:

- `analysis.md`
- `analysis.json`
- `transcript.md`
- `metadata.json`
- `frames/`

These artifacts are **not** part of the normal `~/meetings/` corpus by default.
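Before reviewing a bundle, it can help to confirm that the expected entries are present. The sketch below checks only the files and directory listed above; `missing_artifacts` is a hypothetical helper, not part of the bundled script.

```python
from pathlib import Path

# Entries listed under "Expected files" in this document.
EXPECTED_FILES = ("analysis.md", "analysis.json", "transcript.md", "metadata.json")
EXPECTED_DIRS = ("frames",)


def missing_artifacts(bundle: Path) -> list[str]:
    """Return the expected entries that are absent from a bundle directory."""
    missing = [name for name in EXPECTED_FILES if not (bundle / name).is_file()]
    missing += [f"{name}/" for name in EXPECTED_DIRS if not (bundle / name).is_dir()]
    return missing
```

An empty list means the bundle is complete with respect to this contract; anything else is worth reporting to the user before producing the brief.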

## Dependencies

See:

- `$MINUTES_SKILL_ROOT/references/dependencies.md`
- `$MINUTES_SKILL_ROOT/references/output-schema.md`

## Gotchas

- **Hosted URLs need `yt-dlp`.** Local file review still works without it.
- **Frame caps are intentional.** The script samples enough evidence to review the video without turning this into a generic video-intelligence pipeline.
- **Minutes artifacts stay isolated.** The script uses a temp config/output path for the Minutes transcription run so it does not pollute the user's normal archive.
- **Model-powered auto-analysis is optional.** The generated `analysis.md` and `analysis.json` may be heuristic when no multimodal provider key is available. You still need to read the artifacts and produce the final answer.
- **Long videos need synthesis, not brute force.** If the transcript is long, work from the generated artifacts and only open the most relevant frames and transcript sections.
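
The first gotcha suggests a quick preflight check before running the pipeline. The sketch below assumes `ffmpeg` is needed for every run and `yt-dlp` only for hosted URLs; the full dependency list lives in `references/dependencies.md`, and `missing_tools` is a hypothetical helper name.

```python
import shutil
from typing import Callable, Optional


def missing_tools(
    source_is_url: bool,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list[str]:
    """Return required executables that are not found on PATH.

    Assumes ffmpeg is always required and yt-dlp only for hosted URLs.
    The which parameter is injectable so the check can be tested.
    """
    required = ["ffmpeg"]
    if source_is_url:
        required.append("yt-dlp")
    return [tool for tool in required if which(tool) is None]
```

Calling `missing_tools(source_is_url=True)` before invoking the script gives a clearer error than a failed download partway through the pipeline.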
minutes-brief (Skill)

Fast non-interactive briefing before any meeting — auto-detects your next calendar event, pulls relationship history, surfaces open commitments, and produces a one-page brief in under 30 seconds. Use this whenever the user says "brief me", "give me a quick brief", "what's coming up", "background on my next call", "who am I meeting next", "brief me on Sarah", "I have a call in 10 min", "quick rundown", or right before walking into a meeting. Different from /minutes-prep — brief is the fast hook-fireable version that doesn't ask questions and doesn't set goals. Use brief when speed matters; use prep when the user wants to think hard about goals first.

minutes-cleanup (Skill)

Manage old recordings — find large files, archive old meetings, delete processed originals. Use when the user says "clean up recordings", "how much space are meetings using", "delete old recordings", "archive meetings", "manage meeting storage", or asks about disk space from minutes.

minutes-debrief (Skill)

Post-meeting debrief — analyzes what happened, compares outcomes to your prep intentions, tracks decision evolution. Use when the user says "debrief", "what just happened in that meeting", "what did we decide", "debrief that call", "post-meeting", "what changed", or right after stopping a recording.

minutes-graph (Skill)

Cross-meeting entity graph — query who/what/when across all your meetings as structured data, with co-occurrence and cross-entity queries that text search can't answer. Use whenever the user says "show me everyone who mentioned X", "all mentions of Y across meetings", "who knows about Z", "graph", "across all meetings", "entity search", "first time we talked about", "trend for X over time", "who's been mentioned alongside", or wants to query meetings as an index rather than full-text search. Builds a JSON entity index on first run (one-time slow), then answers queries instantly. Surface this skill for relationship intelligence, due diligence, or any "across all my history" question that text search alone can't answer.

minutes-ideas (Skill)

Surface recent voice memos and ideas captured from any device. Use when the user asks "what ideas did I have?", "what were my recent memos?", "what did I record while walking?", or wants to recall a captured thought.

minutes-ingest (Skill)

Extract facts from meetings and update your knowledge base — person profiles, chronological log, and index. Use when the user asks "ingest my meetings", "update my knowledge base", "extract facts from meetings", "sync meetings to wiki", "backfill knowledge", or wants their PARA/Obsidian/wiki profiles updated from conversation data.

minutes-lint (Skill)

Health-check your meeting knowledge for contradictions, stale commitments, and decision conflicts. Use when the user asks "any conflicts in my meetings", "check for stale action items", "lint my meetings", "consistency check", "are there contradictions", or wants to audit their decision history.

minutes-list (Skill)

List recent meetings and voice memos. Use when the user asks "what meetings did I have", "show my recent recordings", "any meetings today", "list my voice memos", or wants an overview of their meeting history. Also use when they need to find a specific meeting by browsing rather than searching.