Skill122 repo starsupdated 8d ago

venice-models

The venice-models skill provides three read-only API endpoints for discovering Venice AI models and their specifications. Use it when you need to programmatically select models based on capabilities like vision or reasoning, validate requests against model constraints such as token limits or resolution requirements, retrieve current pricing information for cost estimation, or resolve user-friendly model names and legacy model IDs to concrete Venice model identifiers.

View source Repository: skills

Install in Claude Code

Copy

git clone --depth 1 https://github.com/veniceai/skills /tmp/venice-models && cp -r /tmp/venice-models/skills/venice-models ~/.claude/skills/venice-models

Then start a new Claude Code session; the skill loads automatically.

Definition

SKILL.md

# Venice Models

Three read-only endpoints for model discovery — all `GET`:

| Endpoint | Returns |
|---|---|
| `/models` | Full model catalog with `model_spec` (capabilities, constraints, pricing). |
| `/models/traits` | Trait → model ID mapping (e.g. `"default"`, `"fastest"`, `"default_reasoning"`, `"highest_quality"`). |
| `/models/compatibility_mapping` | Legacy / OpenAI / third-party model ID → Venice model ID aliases. |

All three take an optional `?type=` filter: `text`, `image`, `video`, `music`, `tts`, `asr`, `embedding`, `upscale`, `inpaint`, `all`, `code`.

All three are authenticated (Bearer API key or x402 SIWE) like every other `/api/v1` route.

## Use when

- You need to pick a model at runtime based on capabilities (vision, reasoning, function calling, E2EE, X search, multi-image, …).
- You need to validate a request against a model's `constraints` (prompt length, aspect ratio, resolution, steps).
- You need the current **price per million tokens / per image / per second / per 1k chars** to build a cost estimate.
- You want to resolve a user-friendly trait name (e.g. `default`, `default_reasoning`, `highest_quality`) or a frontier-style ID (`openai-gpt-54-pro`, `claude-opus-4-7`) to a concrete Venice model ID.

## `GET /models`

```bash
curl "https://api.venice.ai/api/v1/models?type=text"
```

```json
{
  "object": "list",
  "type": "text",
  "data": [
    {
      "id": "zai-org-glm-5-1",
      "created": 1699000000,
      "model_spec": {
        "name": "GLM 5.1",
        "description": "Balanced blend of speed and capability...",
        "availableContextTokens": 200000,
        "maxCompletionTokens": 24000,
        "privacy": "private",
        "beta": false,
        "betaModel": false,
        "modelSource": "https://huggingface.co/zai-org/GLM-5.1",
        "offline": false,
        "capabilities": { ... },
        "constraints": { ... },
        "pricing": { ... },
        "regionRestrictions": ["US"],
        "deprecation": {"date": "2025-03-01T00:00:00.000Z"}
      }
    }
  ]
}
```

### `model_spec.capabilities` — text models

| Flag | Meaning |
|---|---|
| `optimizedForCode` | Tuned for coding tasks. |
| `quantization` | `fp4` / `fp8` / `fp16` / `bf16` / `int8` / `int4` / `not-available`. |
| `supportsFunctionCalling` | Tools are allowed. |
| `supportsReasoning` | Emits `<thinking>...</thinking>` blocks (and/or provider-specific `reasoning_content`). |
| `supportsReasoningEffort` | Honors `reasoning.effort` / `reasoning_effort`. |
| `supportsResponseSchema` | Honors `response_format: json_schema`. |
| `supportsMultipleImages` + `maxImages` | Multi-image vision support. |
| `supportsVision` | Accepts `image_url` parts. |
| `supportsVideoInput` | Accepts `video_url` parts. |
| `supportsWebSearch` | Honors `venice_parameters.enable_web_search`. |
| `supportsLogProbs` | Honors `logprobs` / `top_logprobs`. |
| `supportsTeeAttestation` | Runs inside a TEE with hardware attestation. |
| `supportsE2EE` | End-to-end encrypted inference available (requires TEE). |
| `supportsXSearch` | xAI native web + X/Twitter search. |
| `supportsAudioInput` | Accepts audio-content message parts (set by the runtime capability builder — not part of the OpenAPI strict schema but appears on `/models` responses). |

### `model_spec.constraints` — by model family

- **Text** — `temperature.default`, `top_p.default`, and `{frequency,presence,repetition}_penalty.default`.
- **Image** — `promptCharacterLimit`, `widthHeightDivisor`, `steps.{default,max}`, optional `aspectRatios[]` + `defaultAspectRatio`, optional `resolutions[]` + `defaultResolution` (the last two appear only on models that use ratio/resolution-based sizing).
- **Video** — `aspect_ratios[]`, `resolutions[]`, `durations[]`, `model_type` (`text-to-video`/`image-to-video`/`video`), `audio`, `audio_configurable`, `prompt_character_limit`.
- **Inpaint / edit** — `aspectRatios[]`, `promptCharacterLimit`, `combineImages`.
- **TTS / Music** (fields surface at the top level of `model_spec`, not inside `constraints`) — `voices[]`, `default_voice`, `supports_lyrics`, `lyrics_required`, `supports_lyrics_optimizer`, `supports_force_instrumental`, `supports_speed`, `supports_language_code`, `min_speed`, `max_speed`, `min_prompt_length`, `prompt_character_limit`. Internal TTS per-model toggles like `supportsPromptParam` / `supportsTemperatureParam` / `supportsTopPParam` exist on the model definitions but are **not** merged into `/models` output today — treat the speech request schema as the support matrix.
- **Embedding** (top-level, not inside `constraints`) — `embeddingDimensions`, `maxInputTokens`, `supportsCustomDimensions`.

### `model_spec.pricing` — by model family

- **LLM** — `input.{usd,diem}`, `output.{usd,diem}` per 1 000 000 tokens, plus optional `cache_input` (reads), `cache_write` (writes, e.g. Anthropic 1.25×), and `extended.*` tier triggered by `context_token_threshold`.
- **Image** — either `generation.{usd,diem}` per image (flat) or `resolutions.<tier>.{usd,diem}` (per `1K`/`2K`/`4K`). Every image row also carries a global `upscale.{2x,4x}.{usd,diem}` block (derived from shared upscale SKUs) — treat it as account-wide upscale pricing, not a signal that this specific model can upscale. Combine with the inpaint/upscale model's own capability check to decide what's actually callable.
- **Inpaint / edit** — `inpaint.{usd,diem}` per edit.
- **Video** — **not currently returned on `/models`.** `calculatePricing()` has no video branch, so video entries have no `model_spec.pricing`. Use `POST /video/quote` for the authoritative per-request price.
- **Music / long audio** — `generation.{usd,diem}` (per job), `per_second.{usd,diem}` (per second generated), `per_thousand_characters.{usd,diem}` (character-priced narration), or `durations.<tier>.{usd,diem,min_seconds,max_seconds}` (duration-bucketed).
- **TTS** — `input.{usd,diem}` per **1 000 000 input characters**.
- **ASR** — `per_audio_second.{usd,diem}`.
- **Embeddings** — `input.{