openai-whisper-api

This Claude Code skill transcribes audio files using OpenAI's Whisper API via curl commands. Use it when you need to convert speech in audio files (m4a, ogg, and other formats) into text transcripts, with optional language hints or speaker name prompts to improve accuracy.

Repository: openagent

Install in Claude Code

git clone --depth 1 https://github.com/the-open-agent/openagent /tmp/openai-whisper-api && cp -r /tmp/openai-whisper-api/skills/openai-whisper-api ~/.claude/skills/openai-whisper-api

Then start a new Claude Code session; the skill loads automatically.

Definition

SKILL.md

# OpenAI Whisper API (curl)

Transcribe an audio file via OpenAI's `/v1/audio/transcriptions` endpoint. To route requests through an OpenAI-compatible proxy or local gateway, see the Custom base URL section.

## Quick start

```bash
curl -sS "https://api.openai.com/v1/audio/transcriptions" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -F "file=@/path/to/audio.m4a" \
  -F "model=whisper-1" \
  -F "response_format=text" \
  > transcript.txt
```

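Defaults used in this skill's commands (the API itself requires `model` and returns JSON unless `response_format` is set):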

- Model: `whisper-1`
- Output format: `text`
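
One caveat with the quick-start command: `curl -sS` exits successfully even when the API answers with an HTTP error, so an error body can end up in `transcript.txt`. A minimal sketch of a guard, assuming curl's `--fail` flag is acceptable here; the function name `transcribe_to` is hypothetical and not part of the skill:

```bash
# Hypothetical helper, not part of the skill.
# Usage: transcribe_to AUDIO_FILE OUTPUT_FILE
transcribe_to() {
  local audio="$1" out="$2" tmp
  tmp="$(mktemp)" || return 1

  # --fail makes curl exit non-zero on HTTP errors (4xx/5xx) instead of
  # writing the error body as if it were a transcript.
  if curl -sS --fail "https://api.openai.com/v1/audio/transcriptions" \
      -H "Authorization: Bearer $OPENAI_API_KEY" \
      -F "file=@${audio}" \
      -F "model=whisper-1" \
      -F "response_format=text" \
      > "$tmp"; then
    # Only replace the output file once the request has succeeded.
    mv "$tmp" "$out"
  else
    rm -f "$tmp"
    return 1
  fi
}
```

For example, `transcribe_to /path/to/audio.m4a transcript.txt` leaves any existing `transcript.txt` untouched when the request fails.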

## Options

```bash
# With language hint
curl -sS "https://api.openai.com/v1/audio/transcriptions" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -F "file=@audio.ogg" \
  -F "model=whisper-1" \
  -F "response_format=text" \
  -F "language=en" \
  > transcript.txt

# With a prompt listing speaker names (helps spell the names; it does not label who is speaking)
curl -sS "https://api.openai.com/v1/audio/transcriptions" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -F "file=@audio.m4a" \
  -F "model=whisper-1" \
  -F "response_format=text" \
  -F "prompt=Speaker names: Peter, Daniel" \
  > transcript.txt

# JSON output
curl -sS "https://api.openai.com/v1/audio/transcriptions" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -F "file=@audio.m4a" \
  -F "model=whisper-1" \
  -F "response_format=json" \
  > transcript.json
```
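
The language and prompt variants above differ from the quick-start command only in their form fields, so they can be folded into one function. This is a sketch rather than part of the skill: the name `transcribe` and its positional arguments (file, optional language, optional prompt) are assumptions made for illustration.

```bash
# Hypothetical wrapper, not part of the skill.
# Usage: transcribe FILE [LANGUAGE] [PROMPT]
transcribe() {
  local audio="$1" language="${2:-}" prompt="${3:-}"

  # Start with the fields every request needs, stored as positional
  # parameters so values containing spaces stay intact.
  set -- -F "file=@${audio}" -F "model=whisper-1" -F "response_format=text"

  # Append the optional fields only when a value was given.
  if [ -n "$language" ]; then
    set -- "$@" -F "language=${language}"
  fi
  if [ -n "$prompt" ]; then
    set -- "$@" -F "prompt=${prompt}"
  fi

  curl -sS "https://api.openai.com/v1/audio/transcriptions" \
    -H "Authorization: Bearer $OPENAI_API_KEY" \
    "$@"
}
```

A call such as `transcribe audio.m4a en "Speaker names: Peter, Daniel" > transcript.txt` sends both optional fields, while `transcribe audio.m4a` sends neither.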

## Custom base URL

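The commands above hard-code `https://api.openai.com/v1`. To honor `OPENAI_BASE_URL` (for an OpenAI-compatible proxy or local gateway), build the URL from it and fall back to the official endpoint when the variable is unset or empty: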

```bash
API_BASE="${OPENAI_BASE_URL:-https://api.openai.com/v1}"
curl -sS "${API_BASE}/audio/transcriptions" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -F "file=@audio.m4a" \
  -F "model=whisper-1" \
  -F "response_format=text" \
  > transcript.txt
```
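
The `${OPENAI_BASE_URL:-...}` expansion can also be wrapped in a small function so every command resolves the endpoint the same way. The sketch below is illustrative only: `whisper_endpoint` is a hypothetical name, and trimming a trailing slash is an added convenience, not something the skill itself does.

```bash
# Hypothetical helper, not part of the skill.
# Prints the transcription URL derived from OPENAI_BASE_URL.
whisper_endpoint() {
  # ":-" falls back to the official API when the variable is unset or empty.
  local base="${OPENAI_BASE_URL:-https://api.openai.com/v1}"
  # Drop one trailing slash so "http://host/v1/" does not produce "//audio".
  base="${base%/}"
  printf '%s/audio/transcriptions\n' "$base"
}
```

It can then replace the literal URL, for example `curl -sS "$(whisper_endpoint)" ...`.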

## API key

Set the `OPENAI_API_KEY` environment variable before running these commands.
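
If the variable is missing, curl still sends the request with an empty bearer token and the API rejects it. A small pre-flight check can fail earlier with a clearer message. This is a sketch under that assumption; `require_api_key` is a hypothetical name, not part of the skill:

```bash
# Hypothetical helper, not part of the skill.
# Returns non-zero (with a message on stderr) when the key is unset or empty.
require_api_key() {
  if [ -z "${OPENAI_API_KEY:-}" ]; then
    echo "OPENAI_API_KEY is not set" >&2
    return 1
  fi
}
```

Typical use is to chain it in front of a request: `require_api_key && curl -sS ...`.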