Skill122 estrellas del repoactualizado 8d ago

venice-audio-music

The venice-audio-music Claude Code skill manages asynchronous music and long-form audio generation through Venice's API, implementing a four-step lifecycle: quoting price, queueing requests with detailed parameters (prompt, lyrics, duration, voice, language, speed), polling for completion status, and finalizing retrieval. Use this when generating songs, jingles, soundscapes, or extended narration where duration-based pricing requires upfront quotes and processing time exceeds synchronous timeout limits.

Ver fuente Repositorio: skills

Instalar en Claude Code

Copiar

git clone --depth 1 https://github.com/veniceai/skills /tmp/venice-audio-music && cp -r /tmp/venice-audio-music/skills/venice-audio-music ~/.claude/skills/venice-audio-music

Después abre una sesión nueva de Claude Code; el skill carga automáticamente.

Definición

SKILL.md

# Venice Music / Async Audio

Music (and long-form voice) generation is **asynchronous**. The flow is:

```
POST /api/v1/audio/quote      → price in USD
POST /api/v1/audio/queue      → { queue_id }      (funds reserved)
POST /api/v1/audio/retrieve   → status or binary audio
POST /api/v1/audio/complete   → finalize & delete media
```

For short text-to-speech, use the synchronous [`venice-audio-speech`](../venice-audio-speech/SKILL.md) endpoint instead.

## Use when

- You need songs, jingles, score, soundscape, or long narration.
- The selected model uses **duration-based** or **character-based** pricing and must be priced before submission.
- The expected generation time is long enough (> 20 s) that sync call would time out.

## Lifecycle

### 1. `POST /audio/quote` — price it first

```bash
curl https://api.venice.ai/api/v1/audio/quote \
  -H "Authorization: Bearer $VENICE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "elevenlabs-music",
    "duration_seconds": 60
  }'
```

Response: `{"quote": 0.48}` (USD).

| Field | Notes |
|---|---|
| `model` | Required. Music/audio model from `GET /models?type=music`. |
| `duration_seconds` | Integer or numeric string. Only if the model reports duration metadata. |
| `character_count` | Required for models with `pricing.per_thousand_characters` (long narration). |

### 2. `POST /audio/queue` — enqueue

```bash
curl https://api.venice.ai/api/v1/audio/queue \
  -H "Authorization: Bearer $VENICE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "elevenlabs-music",
    "prompt": "Uplifting indie-folk acoustic track, 120 BPM, major key.",
    "lyrics_prompt": "Verse 1: Walking through the city lights...\nChorus: We are the dreamers...",
    "duration_seconds": 60,
    "voice": "Aria",
    "language_code": "en",
    "speed": 1.0,
    "force_instrumental": false,
    "lyrics_optimizer": false
  }'
```

Response: `{ "model": "...", "queue_id": "uuid" }`.

| Field | Notes |
|---|---|
| `model` | Required. |
| `prompt` | Required. Describe genre, mood, tempo, instruments. Length caps in `/models`. |
| `lyrics_prompt` | Lyrics. **Required** when `lyrics_required=true`, **rejected** when `supports_lyrics=false`. |
| `duration_seconds` | Integer or string. Model-dependent. |
| `force_instrumental` | Only when `supports_force_instrumental=true`. |
| `lyrics_optimizer` | Auto-generate lyrics from `prompt`. Requires `supports_lyrics_optimizer=true`. `lyrics_prompt` must be empty. |
| `voice` | For voice-enabled models. See `voices` + `default_voice` in `/models`. |
| `language_code` | ISO 639-1. Requires `supports_language_code=true`. |
| `speed` | Requires `supports_speed=true`. Use model's `min_speed`/`max_speed`. |

### 3. `POST /audio/retrieve` — poll status / download

```bash
curl https://api.venice.ai/api/v1/audio/retrieve \
  -H "Authorization: Bearer $VENICE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"elevenlabs-music","queue_id":"..."}' \
  --output track.mp3
```

- If still processing: JSON `{"status":"PROCESSING","average_execution_time":...,"execution_duration":...}`.
- If done: binary audio body (`audio/mpeg` or similar). Save the bytes.
- Set `delete_media_on_completion: true` to skip step 4.

Poll every 2–5 s; use `average_execution_time` (ms, P80) as a guideline for your first poll delay.

### 4. `POST /audio/complete` — cleanup

```bash
curl https://api.venice.ai/api/v1/audio/complete \
  -H "Authorization: Bearer $VENICE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"elevenlabs-music","queue_id":"..."}'
```

Removes the media from Venice storage after you've downloaded it. Required unless you used `delete_media_on_completion: true` on retrieve.

## Full loop (TypeScript)

```ts
const base = 'https://api.venice.ai/api/v1'
const headers = {
  Authorization: `Bearer ${process.env.VENICE_API_KEY}`,
  'Content-Type': 'application/json',
}

async function generateTrack() {
  // 1. Quote
  const quote = await fetch(`${base}/audio/quote`, {
    method: 'POST', headers,
    body: JSON.stringify({ model: 'elevenlabs-music', duration_seconds: 60 }),
  }).then(r => r.json())
  console.log('price:', quote.quote)

  // 2. Queue
  const { queue_id, model } = await fetch(`${base}/audio/queue`, {
    method: 'POST', headers,
    body: JSON.stringify({
      model: 'elevenlabs-music',
      prompt: 'Uplifting indie-folk acoustic track, 120 BPM.',
      duration_seconds: 60,
      force_instrumental: true,
    }),
  }).then(r => r.json())

  // 3. Poll
  while (true) {
    const res = await fetch(`${base}/audio/retrieve`, {
      method: 'POST', headers,
      body: JSON.stringify({ model, queue_id }),
    })
    const ct = res.headers.get('content-type') ?? ''
    if (ct.startsWith('audio/')) {
      const buf = Buffer.from(await res.arrayBuffer())
      await fs.writeFile('track.mp3', buf)
      break
    }
    const { status } = await res.json()
    if (status !== 'PROCESSING') throw new Error(`unexpected ${status}`)
    await new Promise(r => setTimeout(r, 3000))
  }

  // 4. Complete
  await fetch(`${base}/audio/complete`, {
    method: 'POST', headers,
    body: JSON.stringify({ model, queue_id }),
  })
}
```

## Capability probing

Before calling `/audio/queue`, inspect the model entry returned by `GET /models?type=music` — each row's `model_spec` exposes (among other fields):

- `supports_lyrics`, `lyrics_required`, `supports_lyrics_optimizer`
- `supports_force_instrumental`, `supports_speed`, `supports_language_code`
- `voices[]`, `default_voice`
- `min_prompt_length`, `prompt_character_limit`
- `min_speed`, `max_speed`
- `pricing.generation` (per-job), `pricing.per_second` (per second generated), `pricing.per_thousand_characters` (character-priced narration), or `pricing.durations` (duration-tiered map: `{ "<tier>": { usd, diem, min_seconds, max_seconds } }`) — each model uses one of these shapes

## Errors

| Code | Meaning |
|---|---|
| `400` | Wrong params (ly

Del mismo repositorio

venice-api-keysSkill

Manage Venice API keys. Covers GET/POST/PATCH/DELETE /api_keys, GET /api_keys/{id}, GET /api_keys/rate_limits, GET /api_keys/rate_limits/log, the two-step /api_keys/generate_web3_key wallet flow, INFERENCE vs ADMIN key types, and per-key consumption limits (USD / DIEM).

venice-api-overviewSkill

High-level map of the Venice.ai API - base URL, authentication modes, endpoint categories, response headers, pricing model, error shape, and versioning. Load this first when starting any Venice integration.

venice-audio-speechSkill

Generate speech from text via POST /audio/speech. Covers TTS models (Kokoro, Qwen 3, xAI, Inworld, Chatterbox, Orpheus, ElevenLabs Turbo, MiniMax, Gemini Flash), voices per family, output formats (mp3/opus/aac/flac/wav/pcm), streaming, prompt/emotion styling, temperature/top_p, and language hints.

venice-audio-transcriptionSkill

Transcribe audio files to text via POST /audio/transcriptions. Covers supported models (Parakeet, Whisper, Wizper, Scribe, xAI STT), supported formats (wav/flac/m4a/aac/mp4/mp3/ogg/webm), response formats (json/text), timestamps, and language hints. OpenAI-compatible multipart.

venice-augmentSkill

Venice augmentation endpoints for agent pipelines. Covers POST /augment/text-parser (extract text from PDF/DOCX/XLSX/plain text, multipart, up to 25MB, JSON or plain text response), POST /augment/scrape (fetch a URL and return markdown; blocks X/Reddit), and POST /augment/search (Brave ZDR or anonymized Google; structured title/url/content/date results, up to 20 per query). Privacy (zero data retention), rate limits, and error shapes.

venice-authSkill

Authenticate to the Venice API with a Bearer API key or with an x402 / SIWE wallet. Covers header formats, the SIWE message fields, TTL and nonce rules, the venice-x402-client SDK, and how to choose between the two modes.

venice-billingSkill

Venice billing and usage analytics - GET /billing/balance, GET /billing/usage (paginated per-request ledger, JSON or CSV), and GET /billing/usage-analytics (aggregated by date/model/key). Covers the DIEM/USD/BUNDLED_CREDITS consumption priority and building dashboards. (Beta)

venice-charactersSkill

Discover and use Venice public characters (persona-driven system prompts with a bound model). Covers GET /characters (search/filter/sort), /characters/{slug}, /characters/{slug}/reviews, the Character schema, and how to apply a character via venice_parameters.character_slug in chat completions.