ClaudeWave
Skill · 92 repo stars · updated 1mo ago

venice-api-overview

High-level map of the Venice.ai API - base URL, authentication modes, endpoint categories, response headers, pricing model, error shape, and versioning. Load this first when starting any Venice integration.

Install in Claude Code
git clone --depth 1 https://github.com/veniceai/skills /tmp/venice-api-overview && cp -r /tmp/venice-api-overview/skills/venice-api-overview ~/.claude/skills/venice-api-overview
Then start a new Claude Code session; the skill loads automatically.

SKILL.md

# Venice API Overview

Venice.ai is an OpenAI-compatible inference platform for text, image, audio, video, and embeddings. One API — two ways to pay: a traditional **API key** (Pro account), or a **wallet** (x402, USDC on Base, no account required).

## Use when

- You're writing code against `api.venice.ai` for the first time.
- You need to decide between API-key and x402/wallet authentication.
- You want a quick map of which endpoint to call for which task.
- You need to understand the common response headers (`X-Balance-Remaining`, `PAYMENT-REQUIRED`, etc.).

## Base URL

All endpoints live under:

```
https://api.venice.ai/api/v1
```

The OpenAPI spec is distributed at `outerface/swagger.yaml` (current version `20260420.235001`).
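
As a minimal sketch of how the base URL combines with the endpoint paths listed later in this document, the helper below joins the two. The name `veniceUrl` is hypothetical and not part of any Venice SDK; only the base URL itself comes from this section.

```javascript
// Base URL as stated in this section.
const VENICE_BASE_URL = "https://api.venice.ai/api/v1";

// Hypothetical helper: join the base URL and an endpoint path,
// tolerating a missing or repeated leading slash on the path.
function veniceUrl(path) {
  const trimmed = String(path).replace(/^\/+/, "");
  return new URL(`${VENICE_BASE_URL}/${trimmed}`).toString();
}

console.log(veniceUrl("/models"));          // https://api.venice.ai/api/v1/models
console.log(veniceUrl("chat/completions")); // https://api.venice.ai/api/v1/chat/completions
```

Keeping the base URL in one constant means a versioned path change only has to be made in one place.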

## Authentication (pick one per request)

| Scheme | Header | Best for |
|---|---|---|
| `BearerAuth` | `Authorization: Bearer <VENICE_API_KEY>` | Server-side apps, dashboards, usage analytics, bundled credits |
| `siwx` (x402) | `X-Sign-In-With-X: <base64 SIWE JSON>` | No account, pay-as-you-go with USDC on Base, serverless / agents |

Every inference endpoint accepts **either** — see [`venice-auth`](../venice-auth/SKILL.md).

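```bash
# Bearer
curl https://api.venice.ai/api/v1/models \
  -H "Authorization: Bearer $VENICE_API_KEY"

# x402 / SIWE (one-liner via the SDK). The snippet is JavaScript,
# so it is piped to Node as an ES module rather than run by the shell.
node --input-type=module <<'EOF'
import { VeniceClient } from 'venice-x402-client'
const v = new VeniceClient(process.env.WALLET_KEY)
console.log(await v.models.list())
EOF
```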
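
For callers that use plain `fetch` instead of the SDK, the "pick one per request" rule can be sketched as a small header builder. The function names below are hypothetical, and the `siwx` value is assumed to be an already-encoded base64 SIWE payload; producing it is covered by `venice-auth`, not here. Rejecting the case where both credentials are supplied is a client-side design choice in this sketch, not documented server behaviour.

```javascript
// Hypothetical helper: build the auth header for exactly one scheme.
// `apiKey` is a Venice API key; `siwx` is an already-encoded base64
// SIWE payload (assumption: the caller has produced it elsewhere).
function veniceAuthHeaders({ apiKey, siwx } = {}) {
  if (apiKey && siwx) {
    throw new Error("Pick one auth scheme per request, not both");
  }
  if (apiKey) return { Authorization: `Bearer ${apiKey}` };
  if (siwx) return { "X-Sign-In-With-X": siwx };
  throw new Error("Provide either apiKey or siwx");
}

// Usage sketch with the global fetch (Node 18+ or a browser).
// Defined but not called here, so nothing hits the network.
async function listModels(auth) {
  const res = await fetch("https://api.venice.ai/api/v1/models", {
    headers: veniceAuthHeaders(auth),
  });
  if (!res.ok) throw new Error(`Venice returned HTTP ${res.status}`);
  return res.json();
}
```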

## Endpoint map

### Inference

| Category | Endpoints | Skill |
|---|---|---|
| Chat | `POST /chat/completions` | [`venice-chat`](../venice-chat/SKILL.md) |
| Responses (Alpha) | `POST /responses` | [`venice-responses`](../venice-responses/SKILL.md) |
| Embeddings | `POST /embeddings` | [`venice-embeddings`](../venice-embeddings/SKILL.md) |
| Image gen | `POST /image/generate`, `POST /images/generations`, `GET /image/styles` | [`venice-image-generate`](../venice-image-generate/SKILL.md) |
| Image edit | `POST /image/edit`, `POST /image/multi-edit`, `POST /image/upscale`, `POST /image/background-remove` | [`venice-image-edit`](../venice-image-edit/SKILL.md) |
| TTS | `POST /audio/speech` | [`venice-audio-speech`](../venice-audio-speech/SKILL.md) |
| STT | `POST /audio/transcriptions` | [`venice-audio-transcription`](../venice-audio-transcription/SKILL.md) |
| Music (async) | `POST /audio/quote`, `/audio/queue`, `/audio/retrieve`, `/audio/complete` | [`venice-audio-music`](../venice-audio-music/SKILL.md) |
| Video (async) | `POST /video/quote`, `/video/queue`, `/video/retrieve`, `/video/complete`, `/video/transcriptions` | [`venice-video`](../venice-video/SKILL.md) |

### Catalog

| Category | Endpoints | Skill |
|---|---|---|
| Models | `GET /models`, `/models/traits`, `/models/compatibility_mapping` | [`venice-models`](../venice-models/SKILL.md) |
| Characters | `GET /characters`, `/characters/{slug}`, `/characters/{slug}/reviews` | [`venice-characters`](../venice-characters/SKILL.md) |

### Account, billing, wallet

| Category | Endpoints | Skill |
|---|---|---|
| API keys | `GET/POST/DELETE /api_keys`, `/api_keys/{id}`, `/api_keys/rate_limits`, `/api_keys/rate_limits/log`, `/api_keys/generate_web3_key` | [`venice-api-keys`](../venice-api-keys/SKILL.md) |
| Billing | `GET /billing/balance`, `/billing/usage`, `/billing/usage-analytics` | [`venice-billing`](../venice-billing/SKILL.md) |
| x402 wallet | `GET /x402/balance/{wallet}`, `POST /x402/top-up`, `GET /x402/transactions/{wallet}` | [`venice-x402`](../venice-x402/SKILL.md) |

### Utility

| Category | Endpoints | Skill |
|---|---|---|
| Crypto RPC proxy | `GET /crypto/rpc/networks`, `POST /crypto/rpc/{network}` | [`venice-crypto-rpc`](../venice-crypto-rpc/SKILL.md) |
| Augment | `POST /augment/text-parser`, `/augment/scrape`, `/augment/search` | [`venice-augment`](../venice-augment/SKILL.md) |

## Response headers to watch

| Header | When | Meaning |
|---|---|---|
| `X-Balance-Remaining` | x402 inference success | USDC credits left, e.g. `"4.230000"` |
| `X-RateLimit-Limit-*` / `X-RateLimit-Remaining-*` | all inference | your current per-minute/day caps |
| `PAYMENT-REQUIRED` | `402` on x402 inference | base64 JSON with top-up + SIWX challenge (x402 v2) |
| `Content-Encoding` | `200` when client sent `Accept-Encoding: gzip, br` | compression (embeddings, chat) |
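
A rough sketch of reading two of these headers from a `fetch` response follows. The helper names are hypothetical. The sketch assumes `PAYMENT-REQUIRED` uses standard (not URL-safe) base64 and makes no assumption about the field names inside the decoded JSON.

```javascript
// Hypothetical helper: pull the billing-related headers off a response.
// `headers` is a fetch Headers object, or anything with a get(name) method.
function readVeniceHeaders(headers) {
  const balance = headers.get("x-balance-remaining");
  const payment = headers.get("payment-required");
  return {
    // Strip optional surrounding quotes, then parse; null when absent.
    balanceRemaining:
      balance == null ? null : Number(String(balance).replace(/^"|"$/g, "")),
    // Decoded as-is; inner field names are deliberately not assumed.
    paymentRequired: payment == null ? null : decodeBase64Json(payment),
  };
}

// Decode base64 -> UTF-8 bytes -> JSON using web-standard globals only.
function decodeBase64Json(value) {
  const bytes = Uint8Array.from(atob(value), (ch) => ch.charCodeAt(0));
  return JSON.parse(new TextDecoder().decode(bytes));
}
```

With a real response this would be called as `readVeniceHeaders(res.headers)`. For billing decisions it may be safer to keep the raw balance string as well, since floating-point numbers are a poor fit for currency.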

## Pricing model at a glance

- Pricing is **dynamic per request**, metered in USD.
- Paid inference endpoints in the spec carry an `x-payment-info` block with `min` and `max` bounds in USD (typically `min: 0.001`, `max: 10.00`; higher for bulk video/audio). Read-only discovery routes like `GET /models`, `/models/traits`, and `/models/compatibility_mapping` do not.
- Pro (Bearer) accounts draw from **DIEM** (staked credits), **USD** balance, and **bundled credits** in priority order.
- x402 (wallet) users draw from a prepaid **USDC credit balance** on Base.
- The authoritative per-model price is in `GET /models` → `model_spec.pricing`, when present. Video models omit it; use `/video/quote` for video pricing. See [`venice-models`](../venice-models/SKILL.md).
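
The pricing lookup in the last bullet can be sketched as below. This assumes the `GET /models` body uses the OpenAI-style list envelope `{ data: [{ id, model_spec }] }`; that envelope, the helper name `pricingFor`, and the sample model ids are illustrative assumptions. The inner shape of `pricing` is left opaque.

```javascript
// Hypothetical helper: look up a model's pricing in a parsed GET /models body.
// Assumes an OpenAI-style envelope: { data: [{ id, model_spec: { pricing } }] }.
function pricingFor(modelsBody, modelId) {
  const models = Array.isArray(modelsBody?.data) ? modelsBody.data : [];
  const model = models.find((m) => m.id === modelId);
  if (!model) return { found: false, pricing: null };
  // pricing may be absent, e.g. for video models.
  return { found: true, pricing: model.model_spec?.pricing ?? null };
}

// Made-up sample data, only to show the three possible outcomes.
const sampleBody = {
  data: [
    { id: "example-text-model", model_spec: { pricing: { placeholder: true } } },
    { id: "example-video-model", model_spec: {} },
  ],
};

console.log(pricingFor(sampleBody, "example-text-model"));  // found, with pricing
console.log(pricingFor(sampleBody, "example-video-model")); // found, pricing null
console.log(pricingFor(sampleBody, "missing-model"));       // not found
```

Returning `found` separately from `pricing` lets a caller tell "unknown model" apart from "known model without listed pricing", which is the case where `/video/quote` applies.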

## Standard error shape

Every error body takes one of two shapes:

```json
{ "error": "Human-readable message" }
```

or, for 400 validation errors:

```json
{ "error": "...", "details": { "fieldName": { "_errors": ["Field is required"] } } }
```

`402` on x402 adds structured `topUpInstructions` and `siwxChallenge`. See [`venice-errors`](../venice-errors/SKILL.md) for the full table and retry strategy.
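
Both error shapes above can be handled by one small formatter. The sketch below is illustrative: the name `describeVeniceError` is hypothetical and the `"Invalid request"` message is made up for the example. It relies only on the `error` and `details.<field>._errors` keys shown in this section.

```javascript
// Hypothetical helper: flatten a Venice error body into readable lines.
// Works for both the plain shape and the 400 validation shape.
function describeVeniceError(body) {
  const lines = [String(body?.error ?? "Unknown error")];
  const details = body?.details ?? {};
  for (const [field, info] of Object.entries(details)) {
    // Each field entry is expected to carry an `_errors` array of messages.
    for (const message of info?._errors ?? []) {
      lines.push(`${field}: ${message}`);
    }
  }
  return lines;
}

console.log(describeVeniceError({ error: "Human-readable message" }));
// -> ["Human-readable message"]

console.log(describeVeniceError({
  error: "Invalid request",
  details: { fieldName: { _errors: ["Field is required"] } },
}));
// -> ["Invalid request", "fieldName: Field is required"]
```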

## OpenAI compatibility — what works and what doesn't

- Drop-in: `/chat/completions`, `/embeddings`, `/images/generations`, `/audio/speech`, `/audio/transcriptions`, `/models`.
- Ignored but accepted for compat: `user`, `store`.
- Venice-only extensions:
  - `venice_parameters` is accepted on chat completions.
  - `venice_parameters` is **rejected** on `/responses`; use headers / native fields there instead.
- Model feature suffixes (e.g. `zai-org-glm-5-1:enable_web_search=on`, `kimi-k2-6:strip_thinking_respon
venice-api-keys

Manage Venice API keys. Covers GET/POST/PATCH/DELETE /api_keys, GET /api_keys/{id}, GET /api_keys/rate_limits, GET /api_keys/rate_limits/log, the two-step /api_keys/generate_web3_key wallet flow, INFERENCE vs ADMIN key types, and per-key consumption limits (USD / DIEM).

venice-audio-music

Async music / audio-track generation via Venice. Covers the /audio/quote + /audio/queue + /audio/retrieve + /audio/complete lifecycle, lyrics vs instrumental, voice selection, duration, language, speed, model capability probing, and webhook-free polling.

venice-audio-speech

Generate speech from text via POST /audio/speech. Covers TTS models (Kokoro, Qwen 3, xAI, Inworld, Chatterbox, Orpheus, ElevenLabs Turbo, MiniMax, Gemini Flash), voices per family, output formats (mp3/opus/aac/flac/wav/pcm), streaming, prompt/emotion styling, temperature/top_p, and language hints.

venice-audio-transcription

Transcribe audio files to text via POST /audio/transcriptions. Covers supported models (Parakeet, Whisper, Wizper, Scribe, xAI STT), supported formats (wav/flac/m4a/aac/mp4/mp3/ogg/webm), response formats (json/text), timestamps, and language hints. OpenAI-compatible multipart.

venice-augment

Venice augmentation endpoints for agent pipelines. Covers POST /augment/text-parser (extract text from PDF/DOCX/XLSX/plain text, multipart, up to 25MB, JSON or plain text response), POST /augment/scrape (fetch a URL and return markdown; blocks X/Reddit), and POST /augment/search (Brave ZDR or anonymized Google; structured title/url/content/date results, up to 20 per query). Privacy (zero data retention), rate limits, and error shapes.

venice-auth

Authenticate to the Venice API with a Bearer API key or with an x402 / SIWE wallet. Covers header formats, the SIWE message fields, TTL and nonce rules, the venice-x402-client SDK, and how to choose between the two modes.

venice-billing

Venice billing and usage analytics - GET /billing/balance, GET /billing/usage (paginated per-request ledger, JSON or CSV), and GET /billing/usage-analytics (aggregated by date/model/key). Covers the DIEM/USD/BUNDLED_CREDITS consumption priority and building dashboards. (Beta)

venice-characters

Discover and use Venice public characters (persona-driven system prompts with a bound model). Covers GET /characters (search/filter/sort), /characters/{slug}, /characters/{slug}/reviews, the Character schema, and how to apply a character via venice_parameters.character_slug in chat completions.