omni-compression
OmniRoute's omni-compression skill configures three token-reduction modes for API requests: RTK for structured output (60–90% savings), Caveman for prose and chat (46% savings), and stacked compression combining both (78–95% savings). Use this skill to transparently compress payloads before forwarding to language model providers, reducing token costs without modifying application code.
git clone --depth 1 https://github.com/diegosouzapw/OmniRoute /tmp/omni-compression && cp -r /tmp/omni-compression/skills/omni-compression ~/.claude/skills/omni-compression
<!-- generated by src/lib/agentSkills/generator.ts; manual edits will be overwritten -->
## Overview
Configure RTK (command output), Caveman (prose), and stacked compression modes. Manage language packs and custom rules, and preview prompt compression that reduces tokens by 60–90%.
## Authentication
All requests require a valid Bearer token or session cookie. Obtain a token via `POST /api/auth/login`, or set `REQUIRE_API_KEY=false` for local development.
## Endpoints
### POST /api/compression/preview
Preview compression for a message payload
```bash
curl -X POST https://localhost:20128/api/compression/preview \
-H "Authorization: Bearer $OMNIROUTE_TOKEN" \
-H "Content-Type: application/json" \
-d '{}'
```
### GET /api/compression/language-packs
List Caveman compression language packs
```bash
curl https://localhost:20128/api/compression/language-packs \
-H "Authorization: Bearer $OMNIROUTE_TOKEN"
```
### GET /api/compression/rules
List Caveman compression rule metadata
```bash
curl https://localhost:20128/api/compression/rules \
-H "Authorization: Bearer $OMNIROUTE_TOKEN"
```
## Payloads
See the full OpenAPI specification at `GET /api/openapi/spec` or `docs/reference/openapi.yaml` for detailed request/response schemas.
<!-- skill:custom-start -->
<!-- Migrated from skills/omniroute-compression/SKILL.md (preserved curated content) -->
# OmniRoute — Compression
Requires `OMNIROUTE_URL` and `OMNIROUTE_KEY`. See [entry-point SKILL](https://raw.githubusercontent.com/diegosouzapw/OmniRoute/main/skills/omniroute/SKILL.md) for setup.
## Overview
OmniRoute compresses token payloads before forwarding them to providers. No code changes are required: set it once and it applies to all requests transparently.
| Engine | Best for | Typical savings |
| ------------------------- | ------------------------------------ | --------------- |
| RTK | Terminal / build / test / git output | 60–90% |
| Caveman | Human prose, chat history | 46% input |
| Stacked (`rtk → caveman`) | Mixed coding sessions | 78–95% |
| MCP accessibility filter | Browser/accessibility tool results | 60–80% |
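The stacked range is consistent with Caveman compressing whatever RTK leaves behind. As a rough check, and assuming the two engines' savings compose multiplicatively (an assumption, not something the table states), the arithmetic can be sketched as below. `stacked_savings` is a hypothetical helper name used only for this illustration.

```bash
# Hypothetical helper: combined savings when a second engine compresses
# the output of the first. Assumes savings compose multiplicatively.
# Arguments are whole percentages, e.g. 60 and 46.
stacked_savings() {
  local first=$1 second=$2
  # Share of tokens remaining, scaled so that 10000 means 100%
  # (keeps the calculation in integer arithmetic).
  local remaining=$(( (100 - first) * (100 - second) ))
  local saved=$(( 10000 - remaining ))
  printf '%d.%02d\n' $(( saved / 100 )) $(( saved % 100 ))
}

stacked_savings 60 46   # low end of RTK followed by Caveman
stacked_savings 90 46   # high end of RTK followed by Caveman
```

Under that assumption the two calls print `78.40` and `94.60`, which lines up with the 78–95% figure in the table.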
## Get current settings
```bash
curl $OMNIROUTE_URL/api/settings/compression \
-H "Authorization: Bearer $OMNIROUTE_KEY"
```
## Enable RTK (best for coding agents)
```bash
curl -X PUT $OMNIROUTE_URL/api/settings/compression \
-H "Authorization: Bearer $OMNIROUTE_KEY" \
-H "Content-Type: application/json" \
-d '{ "mode": "rtk", "enabled": true }'
```
## Enable stacked mode (maximum savings)
```bash
curl -X PUT $OMNIROUTE_URL/api/settings/compression \
-H "Authorization: Bearer $OMNIROUTE_KEY" \
-H "Content-Type: application/json" \
-d '{
"mode": "stacked",
"enabled": true,
"stackedPipeline": ["rtk", "caveman"]
}'
```
## Enable Caveman (prose / chat)
```bash
curl -X PUT $OMNIROUTE_URL/api/settings/compression \
-H "Authorization: Bearer $OMNIROUTE_KEY" \
-H "Content-Type: application/json" \
-d '{ "mode": "standard", "enabled": true }'
```
Caveman intensities: `lite` (safe), `standard` (balanced), `aggressive` (long sessions), `ultra` (context recovery).
## Preview compression before enabling
```bash
curl -X POST $OMNIROUTE_URL/api/compression/preview \
-H "Authorization: Bearer $OMNIROUTE_KEY" \
-H "Content-Type: application/json" \
-d '{
"mode": "rtk",
"text": "$ npm test\n> jest\n\nPASS src/a.test.ts (2.1s)\nPASS src/b.test.ts (1.8s)\n..."
}'
```
Response includes `compressed`, `original_length`, `compressed_length`, `savings_pct`.
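To sanity-check a preview result by hand, the savings figure can be recomputed from the two length fields. The sketch below assumes `savings_pct` is the percentage reduction from `original_length` to `compressed_length`; the source does not define the formula, and the local helper name is hypothetical.

```bash
# Hypothetical helper: percentage saved given the original and compressed
# lengths. Assumes savings mean (original - compressed) / original * 100,
# truncated to a whole percent.
savings_pct() {
  local original=$1 compressed=$2
  # Guard against division by zero for empty input.
  if [ "$original" -le 0 ]; then
    echo 0
    return
  fi
  echo $(( (original - compressed) * 100 / original ))
}

savings_pct 1200 300   # prints 75
```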
## MCP accessibility-tree filter (browser agent use)
When OmniRoute is used with browser/Playwright MCP tools, it automatically compresses verbose accessibility-tree tool results. Enabled by default; configure thresholds:
```bash
curl -X PUT $OMNIROUTE_URL/api/settings/compression \
-H "Authorization: Bearer $OMNIROUTE_KEY" \
-H "Content-Type: application/json" \
-d '{
"mcpAccessibility": {
"enabled": true,
"collapseThreshold": 30,
"maxTextChars": 50000
}
}'
```
- `collapseThreshold`: collapse sibling lines when ≥ N repeats (default 30).
- `maxTextChars`: hard truncate after N chars with navigation hint (default 50000).
## Language packs (Caveman)
Caveman supports language-aware rules for pt-BR, es, de, fr, ja:
```bash
curl -X PUT $OMNIROUTE_URL/api/settings/compression \
-H "Authorization: Bearer $OMNIROUTE_KEY" \
-H "Content-Type: application/json" \
-d '{
"mode": "standard",
"cavemanConfig": {
"language": "pt-BR",
"autoDetectLanguage": true
}
}'
```
## Via MCP
```
omniroute_compression_status → current settings + savings analytics
omniroute_compression_configure → update mode/threshold/language
omniroute_set_compression_engine → switch engine at runtime
```
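MCP clients invoke tools with a JSON-RPC 2.0 `tools/call` request. As a minimal sketch, the helper below prints such a request for a tool called without arguments. That `omniroute_compression_status` accepts an empty `arguments` object is an assumption, and `mcp_tool_call` is a hypothetical name.

```bash
# Print a JSON-RPC 2.0 "tools/call" request for an MCP tool, passing an
# empty arguments object. How the request reaches the server (stdio, HTTP)
# depends on the MCP client and is not shown here.
mcp_tool_call() {
  local tool=$1 id=${2:-1}
  printf '{"jsonrpc":"2.0","id":%d,"method":"tools/call","params":{"name":"%s","arguments":{}}}\n' "$id" "$tool"
}

mcp_tool_call omniroute_compression_status
```

In practice an MCP-aware client builds this envelope itself, so the sketch is mainly useful for debugging what goes over the wire.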
## Disable compression
```bash
curl -X PUT $OMNIROUTE_URL/api/settings/compression \
-H "Authorization: Bearer $OMNIROUTE_KEY" \
-H "Content-Type: application/json" \
-d '{ "enabled": false }'
```
## Errors
- `400 invalid mode` → use `off`, `lite`, `standard`, `aggressive`, `ultra`, `rtk`, or `stacked`
- `400 invalid stackedPipeline` → array must contain valid engine ids (`rtk`, `caveman`)
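
Both errors can be caught before a request is sent. The following is a client-side sketch that mirrors the accepted values listed above. The function names are hypothetical, and treating an empty pipeline as invalid is an assumption.

```bash
# Hypothetical client-side check for the "mode" field.
is_valid_mode() {
  case "$1" in
    off|lite|standard|aggressive|ultra|rtk|stacked) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical check for "stackedPipeline"; pass engine ids as arguments.
# An empty pipeline is rejected here, which is an assumption.
is_valid_pipeline() {
  [ "$#" -gt 0 ] || return 1
  local engine
  for engine in "$@"; do
    case "$engine" in
      rtk|caveman) ;;
      *) return 1 ;;
    esac
  done
  return 0
}

is_valid_mode stacked && is_valid_pipeline rtk caveman && echo "ok to send"
is_valid_mode turbo || echo "turbo would be rejected"
```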
<!-- skill:custom-end -->