Skill · 1.3k repo stars · updated today
whisper
The whisper skill transcribes audio files (MP3, WAV, M4A, FLAC, and other formats) to text using OpenAI's Whisper model. Use it to extract spoken content from audio recordings. It supports over 90 languages, optional timestamp generation, and model sizes from tiny to large, which trade accuracy against speed.
Install in Claude Code
git clone --depth 1 https://github.com/trpc-group/trpc-agent-go /tmp/whisper && cp -r /tmp/whisper/examples/skill/skills/whisper ~/.claude/skills/whisper
Then open a new Claude Code session; the skill loads automatically.
Definition
SKILL.md
# Whisper Audio Transcription Skill

Transcribe audio files to text using OpenAI Whisper.

## Capabilities

- Transcribe audio files (MP3, WAV, M4A, FLAC, OGG, etc.) to text
- Support for 90+ languages with auto-detection
- Optional timestamp generation
- Multiple model sizes (tiny/base/small/medium/large)
- Output in plain text or JSON format

## Usage

### Basic Transcription

```bash
python3 scripts/transcribe.py <audio_file> <output_file>
```

### With Options

```bash
# Specify model size (default: base)
python3 scripts/transcribe.py audio.mp3 transcript.txt --model medium

# Specify language (improves accuracy)
python3 scripts/transcribe.py audio.mp3 transcript.txt --language zh

# Include timestamps
python3 scripts/transcribe.py audio.mp3 transcript.txt --timestamps

# JSON output with metadata
python3 scripts/transcribe.py audio.mp3 output.json --format json
```

## Parameters

- `audio_file` (required): Path to input audio file
- `output_file` (required): Path to output text/JSON file
- `--model`: Whisper model size (tiny/base/small/medium/large, default: base)
- `--language`: Language code (e.g., en, zh, es, fr, or auto for detection)
- `--timestamps`: Include word-level timestamps in output
- `--format`: Output format (text/json, default: text)

## Model Sizes

| Model  | Parameters | Speed (relative to large) | Accuracy  | Memory |
|--------|------------|---------------------------|-----------|--------|
| tiny   | 39M        | ~32x                      | Good      | ~1GB   |
| base   | 74M        | ~16x                      | Better    | ~1GB   |
| small  | 244M       | ~6x                       | Great     | ~2GB   |
| medium | 769M       | ~2x                       | Excellent | ~5GB   |
| large  | 1.5B       | 1x                        | Best      | ~10GB  |

## Supported Audio Formats

MP3, WAV, M4A, FLAC, OGG, AAC, WMA, and more (via FFmpeg)

## Dependencies

- Python 3.8+
- openai-whisper
- ffmpeg

## Installation

```bash
pip install openai-whisper
sudo apt-get install ffmpeg  # Ubuntu/Debian
```
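When the script is driven from another program rather than typed by hand, the documented flags and the Model Sizes table can be encoded once and reused. The following is a minimal sketch, not part of the skill itself: the helper names `pick_model` and `build_command` are hypothetical, and it assumes only the positional arguments, flags, defaults, and memory figures listed above.

```python
from typing import List, Optional

# Approximate memory per model size, copied from the Model Sizes table,
# ordered from smallest to largest model.
MODEL_MEMORY_GB = {"tiny": 1, "base": 1, "small": 2, "medium": 5, "large": 10}


def pick_model(available_gb: float) -> str:
    """Return the largest model whose listed memory fits in available_gb.

    Hypothetical helper: the skill itself does not choose a model for you.
    """
    fitting = [name for name, gb in MODEL_MEMORY_GB.items() if gb <= available_gb]
    if not fitting:
        raise ValueError("no model fits in the given memory budget")
    return fitting[-1]  # dict order is smallest to largest


def build_command(
    audio_file: str,
    output_file: str,
    model: str = "base",
    language: Optional[str] = None,
    timestamps: bool = False,
    fmt: str = "text",
    script: str = "scripts/transcribe.py",
) -> List[str]:
    """Assemble an argument list using only the flags documented above."""
    if model not in MODEL_MEMORY_GB:
        raise ValueError(f"unknown model size: {model}")
    if fmt not in ("text", "json"):
        raise ValueError(f"unknown output format: {fmt}")

    cmd = ["python3", script, audio_file, output_file,
           "--model", model, "--format", fmt]
    if language is not None:
        cmd += ["--language", language]
    if timestamps:
        cmd.append("--timestamps")
    return cmd


if __name__ == "__main__":
    # Example: a machine with roughly 4 GB free for the model.
    cmd = build_command("audio.mp3", "output.json",
                        model=pick_model(4), language="zh", fmt="json")
    print(" ".join(cmd))
```

The returned list can be passed directly to `subprocess.run(cmd, check=True)`; building a list instead of a shell string avoids quoting problems with file names that contain spaces.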
From the same repository
blog (Skill)
en (Skill)
zh (Skill)
artifact_demo (Skill)
Demo skill that writes an output file and persists it as an artifact.
news-query-agent (Subagent)
Answer news queries with a fixed demo response.
weather-query (Skill)
Answer weather queries with a fixed demo response.
contact-lookup-agent (Subagent)
Look up contact phone numbers with fixed demo data.
write-ok (Skill)
Write a deterministic OK file to out/ok.txt.