Skill1.6k estrellas del repoactualizado today

whisper

The whisper skill converts audio files in formats including MP3, WAV, M4A, and FLAC into text transcriptions using OpenAI's Whisper model. Use this skill when you need to extract spoken content from audio recordings, supporting over 90 languages with optional timestamp generation and configurable model sizes ranging from tiny to large for different accuracy and speed tradeoffs.

Ver fuente Repositorio: trpc-agent-go

Instalar en Claude Code

Copiar

git clone --depth 1 https://github.com/trpc-group/trpc-agent-go /tmp/whisper && cp -r /tmp/whisper/examples/skill/skills/whisper ~/.claude/skills/whisper

Después abre una sesión nueva de Claude Code; el skill carga automáticamente.

Definición

SKILL.md

# Whisper Audio Transcription Skill

Transcribe audio files to text using OpenAI Whisper.

## Capabilities

- Transcribe audio files (MP3, WAV, M4A, FLAC, OGG, etc.) to text
- Support for 90+ languages with auto-detection
- Optional timestamp generation
- Multiple model sizes (tiny/base/small/medium/large)
- Output in plain text or JSON format

## Usage

### Basic Transcription

```bash
python3 scripts/transcribe.py <audio_file> <output_file>
```

### With Options

```bash
# Specify model size (default: base)
python3 scripts/transcribe.py audio.mp3 transcript.txt --model medium

# Specify language (improves accuracy)
python3 scripts/transcribe.py audio.mp3 transcript.txt --language zh

# Include timestamps
python3 scripts/transcribe.py audio.mp3 transcript.txt --timestamps

# JSON output with metadata
python3 scripts/transcribe.py audio.mp3 output.json --format json
```

## Parameters

- `audio_file` (required): Path to input audio file
- `output_file` (required): Path to output text/JSON file
- `--model`: Whisper model size (tiny/base/small/medium/large, default: base)
- `--language`: Language code (e.g., en, zh, es, fr, auto for detection)
- `--timestamps`: Include word-level timestamps in output
- `--format`: Output format (text/json, default: text)

## Model Sizes

| Model  | Parameters | Speed | Accuracy | Memory |
|--------|------------|-------|----------|--------|
| tiny   | 39M        | ~32x  | Good     | ~1GB   |
| base   | 74M        | ~16x  | Better   | ~1GB   |
| small  | 244M       | ~6x   | Great    | ~2GB   |
| medium | 769M       | ~2x   | Excellent| ~5GB   |
| large  | 1.5B       | 1x    | Best     | ~10GB  |

## Supported Audio Formats

MP3, WAV, M4A, FLAC, OGG, AAC, WMA, and more (via FFmpeg)

## Dependencies

- Python 3.8+
- openai-whisper
- ffmpeg

## Installation

```bash
pip install openai-whisper
sudo apt-get install ffmpeg  # Ubuntu/Debian
```

Del mismo repositorio

Demo skill that writes an output file and persists it as an artifact.

news-query-agentSubagent

Answer news queries with a fixed demo response.

weather-querySkill

Answer weather queries with a fixed demo response.

contact-lookup-agentSubagent

Look up contact phone numbers with fixed demo data.

write-okSkill

Write a deterministic OK file to out/ok.txt.