Skill · 654 repo stars · updated today

transcribe

The transcribe skill converts audio and video files to text using the active speech-to-text provider configured in Settings; OpenAI Whisper, Deepgram, and Google Gemini are supported. Use it to extract spoken content from media files. It extracts audio from video automatically, splits large files into chunks, and handles common formats such as mp4, mp3, wav, and m4a.

Install in Claude Code
git clone --depth 1 https://github.com/vellum-ai/vellum-assistant /tmp/transcribe && cp -r /tmp/transcribe/assistant/src/config/bundled-skills/transcribe ~/.claude/skills/transcribe
Then open a new Claude Code session; the skill loads automatically.

SKILL.md

Transcribe audio and video files using the configured speech-to-text (STT) provider. Supported providers include OpenAI Whisper, Deepgram, and Google Gemini; the active provider is selected in Settings under Speech-to-Text (`services.stt`).

## Usage Notes

- The tool accepts a `file_path` (absolute path to a local audio or video file) to transcribe.
- Supported formats: any video (mp4, mov, etc.) or audio (mp3, wav, m4a, etc.) file.
- For video files, audio is automatically extracted via ffmpeg before transcription.
- Large files are automatically split into chunks for processing.
- If no STT provider credentials are configured, the tool will return an error with setup instructions.
- The STT provider (`services.stt`) is shared between transcription and telephony call paths.
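
The extraction and chunking steps in the notes above can be pictured with a minimal Python sketch. It is not the skill's actual implementation: the function names, the extension list, the 16 kHz mono WAV target, and the 600-second chunk length are all assumptions chosen for illustration. Only the ffmpeg flags themselves (`-vn`, `-ac`, `-ar`, `-c:a`) are standard ffmpeg options.

```python
from pathlib import Path

# Hypothetical list of video extensions; the skill's real list is not documented here.
VIDEO_EXTENSIONS = {".mp4", ".mov", ".mkv", ".webm"}


def needs_extraction(file_path: str) -> bool:
    """Return True when the file looks like video and audio must be pulled out first."""
    return Path(file_path).suffix.lower() in VIDEO_EXTENSIONS


def build_extract_command(file_path: str, out_path: str) -> list[str]:
    """Build an ffmpeg argv that drops the video stream and writes 16 kHz mono WAV.

    The output format is an assumption; the skill may use a different codec or rate.
    """
    return [
        "ffmpeg", "-y",        # overwrite the output file if it already exists
        "-i", file_path,       # input media file
        "-vn",                 # discard the video stream
        "-ac", "1",            # downmix to one audio channel
        "-ar", "16000",        # resample to 16 kHz
        "-c:a", "pcm_s16le",   # 16-bit PCM audio
        out_path,
    ]


def plan_chunks(duration_s: float, chunk_s: float = 600.0) -> list[tuple[float, float]]:
    """Split [0, duration_s) into consecutive (start, end) windows of at most chunk_s seconds."""
    if chunk_s <= 0:
        raise ValueError("chunk_s must be positive")
    chunks: list[tuple[float, float]] = []
    start = 0.0
    while start < duration_s:
        end = min(start + chunk_s, duration_s)
        chunks.append((start, end))
        start = end
    return chunks


if __name__ == "__main__":
    print(needs_extraction("/tmp/meeting.mp4"))  # True
    print(plan_chunks(1500.0))  # [(0.0, 600.0), (600.0, 1200.0), (1200.0, 1500.0)]
```

The command is built as an argument list rather than a shell string so that it can be passed to `subprocess.run` without quoting problems for paths that contain spaces. Each planned window could then be cut with ffmpeg's `-ss` and `-t` options before being sent to the STT provider.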

## Maintenance

When adding or modifying an STT provider, follow the onboarding checklist at `assistant/docs/stt-provider-onboarding.md`. That document covers the daemon catalog, config schema, adapter wiring, client catalog parity, and required tests.