Compress tool outputs, logs, files, and RAG chunks before they reach the LLM. 60-95% fewer tokens, same answers. Library, proxy, MCP server.
Headroom is a Python and TypeScript library that sits between an AI agent and an LLM provider, compressing tool outputs, logs, RAG chunks, files, and conversation history before they reach the model. It ships in three deployment modes: an inline `compress()` function, a drop-in HTTP proxy launched with `headroom proxy --port 8787`, and an MCP server exposing three tools (`headroom_compress`, `headroom_retrieve`, and `headroom_stats`) compatible with Claude Code, Claude Desktop, Cursor, and other MCP clients. The compression pipeline routes content through three specialized algorithms: SmartCrusher for JSON, CodeCompressor for AST-based code, and a locally hosted fine-tuned model called Kompress-base for prose. A CacheAligner stabilizes prompt prefixes to improve provider-side KV cache hit rates. Originals are stored locally under the Compressed Context Retrieval (CCR) system, allowing the LLM to fetch them on demand. The `headroom learn` command mines failed agent sessions and writes corrections directly to `CLAUDE.md` or `AGENTS.md`. Benchmarks on real workloads show token reductions between 47% and 92%, with a code search example shrinking from 17,765 tokens to 1,408.
- ✓ Open-source license (Apache-2.0)
- ✓ Actively maintained (<30d)
- ✓ Healthy fork ratio
- ✓ Clear description
- ✓ Topics declared
- ✓ Documented (README)
claude mcp add headroom -- python -m headroom

{
  "mcpServers": {
    "headroom": {
      "command": "python",
      "args": ["-m", "headroom.evals"],
      "env": {
        "COPILOT_PROVIDER_API_URL": "<copilot_provider_api_url>"
      }
    }
  }
}
<div align="center"><pre>
██╗ ██╗███████╗ █████╗ ██████╗ ██████╗ ██████╗ ██████╗ ███╗ ███╗
██║ ██║██╔════╝██╔══██╗██╔══██╗██╔══██╗██╔═══██╗██╔═══██╗████╗ ████║
███████║█████╗ ███████║██║ ██║██████╔╝██║ ██║██║ ██║██╔████╔██║
██╔══██║██╔══╝ ██╔══██║██║ ██║██╔══██╗██║ ██║██║ ██║██║╚██╔╝██║
██║ ██║███████╗██║ ██║██████╔╝██║ ██║╚██████╔╝╚██████╔╝██║ ╚═╝ ██║
╚═╝ ╚═╝╚══════╝╚═╝ ╚═╝╚═════╝ ╚═╝ ╚═╝ ╚═════╝ ╚═════╝ ╚═╝ ╚═╝
The context compression layer for AI agents
</pre></div>
<p align="center"><strong>60–95% fewer tokens · library · proxy · MCP · 6 algorithms · local-first · reversible</strong></p>
<p align="center">
<a href="https://github.com/chopratejas/headroom/actions/workflows/ci.yml"><img src="https://github.com/chopratejas/headroom/actions/workflows/ci.yml/badge.svg" alt="CI"></a>
<a href="https://app.codecov.io/gh/chopratejas/headroom"><img src="https://codecov.io/gh/chopratejas/headroom/graph/badge.svg" alt="codecov"></a>
<a href="https://pypi.org/project/headroom-ai/"><img src="https://img.shields.io/pypi/v/headroom-ai.svg" alt="PyPI"></a>
<a href="https://www.npmjs.com/package/headroom-ai"><img src="https://img.shields.io/npm/v/headroom-ai.svg" alt="npm"></a>
<a href="https://huggingface.co/chopratejas/kompress-v2-base"><img src="https://img.shields.io/badge/model-Kompress--v2--base-yellow.svg" alt="Model: Kompress-v2-base"></a>
<a href="LICENSE"><img src="https://img.shields.io/badge/license-Apache%202.0-blue.svg" alt="License: Apache 2.0"></a>
<a href="https://headroom-docs.vercel.app/docs"><img src="https://img.shields.io/badge/docs-online-blue.svg" alt="Docs"></a>
</p>
<p align="center">
<a href="https://headroom-docs.vercel.app/docs">Docs</a> ·
<a href="#get-started-60-seconds">Install</a> ·
<a href="#proof">Proof</a> ·
<a href="#agent-compatibility-matrix">Agents</a> ·
<a href="https://discord.gg/yRmaUNpsPJ">Discord</a> ·
<a href="llms.txt">llms.txt</a> ·
<a href="ENTERPRISE.md">Enterprise</a>
</p>
<p align="center"><sub>
<b>AI agents / LLMs:</b> read <a href="llms.txt"><code>/llms.txt</code></a> here, or fetch <a href="https://headroom-docs.vercel.app/llms.txt">the live index</a> / <a href="https://headroom-docs.vercel.app/llms-full.txt">full docs blob</a>.
</sub></p>
---
<p align="center"><a href="https://trendshift.io/repositories/20881" target="_blank"><img src="https://trendshift.io/api/badge/repositories/20881" alt="chopratejas%2Fheadroom | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a></p>
Headroom compresses everything your AI agent reads — tool outputs, logs, RAG chunks, files, and conversation history — before it reaches the LLM. Same answers, fraction of the tokens.
<p align="center">
<img src="HeadroomDemo-Fast.gif" alt="Headroom in action" width="820">
<br/><sub>Live: 10,144 → 1,260 tokens — same FATAL found.</sub>
</p>
## What it does
- **Library** — `compress(messages)` in Python or TypeScript, inline in any app
- **Proxy** — `headroom proxy --port 8787`, zero code changes, any language
- **Agent wrap** — `headroom wrap claude|codex|cursor|aider|copilot` in one command
- **MCP server** — `headroom_compress`, `headroom_retrieve`, `headroom_stats` for any MCP client
- **Cross-agent memory** — shared store across Claude, Codex, Gemini, auto-dedup
- **`headroom learn`** — mines failed sessions, writes corrections to `CLAUDE.md` / `AGENTS.md`
- **Reversible (CCR)** — originals are cached for retrieval on demand
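
The reversible (CCR) idea in the last bullet can be illustrated with a small in-memory sketch. Nothing here is Headroom's actual API: `OriginalStore`, `put`, `retrieve`, `compress_with_handle`, and the TTL value are hypothetical names chosen for illustration, and the "compression" is a naive truncation standing in for the real pipeline.

```python
import hashlib
import time


class OriginalStore:
    """Hypothetical local cache of originals, keyed by a content hash."""

    def __init__(self, ttl_seconds=3600.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._items = {}  # key -> (expiry time, original text)

    def put(self, original):
        key = hashlib.sha256(original.encode("utf-8")).hexdigest()[:12]
        self._items[key] = (self._clock() + self._ttl, original)
        return key

    def retrieve(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, original = entry
        if self._clock() >= expires_at:
            # Expired entries are dropped, so retrieval only works within the TTL.
            del self._items[key]
            return None
        return original


def compress_with_handle(text, store, keep_chars=80):
    """Naive stand-in for compression: keep a prefix plus a retrieval handle."""
    if len(text) <= keep_chars:
        return text
    key = store.put(text)
    return f"{text[:keep_chars]} [truncated; original id={key}]"


store = OriginalStore(ttl_seconds=60)
log = "\n".join(f"INFO request {i} ok" for i in range(200))
short = compress_with_handle(log, store)
handle = short.rsplit("id=", 1)[1].rstrip("]")
```

The model only sees `short`; if it needs the full log, a retrieval call with `handle` returns the original as long as the entry has not expired.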
## How it works (30 seconds)
```
Your agent / app
(Claude Code, Cursor, Codex, LangChain, Agno, Strands, your own code…)
│ prompts · tool outputs · logs · RAG results · files
▼
┌────────────────────────────────────────────────────┐
│ Headroom (runs locally — your data stays here) │
│ ──────────────────────────────────────────────── │
│ CacheAligner → ContentRouter → CCR │
│ ├─ SmartCrusher (JSON) │
│ ├─ CodeCompressor (AST) │
│ └─ Kompress-base (text, HF) │
│ │
│ Cross-agent memory · headroom learn · MCP │
└────────────────────────────────────────────────────┘
│ compressed prompt + retrieval tool
▼
LLM provider (Anthropic · OpenAI · Bedrock · …)
```
- **ContentRouter** — detects content type, selects the right compressor
- **SmartCrusher / CodeCompressor / Kompress-base** — compress JSON, AST, or prose
- **CacheAligner** — stabilizes prefixes so provider KV caches actually hit
- **CCR** — stores originals locally; LLM calls `headroom_retrieve` if it needs them
→ [Architecture](https://headroom-docs.vercel.app/docs/architecture) · [CCR reversible compression](https://headroom-docs.vercel.app/docs/ccr) · [Kompress-v2-base model card](https://huggingface.co/chopratejas/kompress-v2-base)
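
To make the routing step concrete, here is a minimal sketch of content-type detection using only the Python standard library. It is an illustration under stated assumptions, not Headroom's implementation: `detect_content_type` and `route` are hypothetical names, the heuristics are deliberately crude, and the mapping simply returns the name of the component that would handle each type.

```python
import ast
import json

# Node types that suggest the input is real Python source rather than prose.
CODE_NODES = (
    ast.FunctionDef,
    ast.AsyncFunctionDef,
    ast.ClassDef,
    ast.Import,
    ast.ImportFrom,
)


def detect_content_type(text):
    """Return 'json', 'python', or 'text' using cheap parse attempts."""
    stripped = text.strip()
    if stripped[:1] in ("{", "["):
        try:
            json.loads(stripped)
            return "json"
        except ValueError:
            pass  # Looks like JSON but is not valid; fall through.
    try:
        tree = ast.parse(stripped)
    except (SyntaxError, ValueError):
        return "text"
    # A bare word such as "hello" also parses as Python, so require
    # at least one definition or import before calling it code.
    if any(isinstance(node, CODE_NODES) for node in ast.walk(tree)):
        return "python"
    return "text"


# Hypothetical mapping from detected type to the component named above.
COMPRESSORS = {
    "json": "SmartCrusher",
    "python": "CodeCompressor",
    "text": "Kompress-base",
}


def route(text):
    return COMPRESSORS[detect_content_type(text)]
```

A real router would need to recognize more languages and mixed content; the point is only that detection runs first and the compressor is chosen from its result.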
## Get started (60 seconds)
```bash
# 1 — Install
pip install "headroom-ai[all]" # Python
npm install headroom-ai # Node / TypeScript
# 2 — Pick your mode
headroom wrap claude # wrap a coding agent
headroom proxy --port 8787 # drop-in proxy, zero code changes
# or: from headroom import compress # inline library
# 3 — See the savings
headroom perf
```
Granular extras: `[proxy]`, `[mcp]`, `[ml]`, `[code]`, `[memory]`, `[relevance]`, `[image]`, `[agno]`, `[langchain]`, `[evals]`, `[pytorch-mps]` (Apple-GPU memory-embedder offload — set `HEADROOM_EMBEDDER_RUNTIME=pytorch_mps`). Requires **Python 3.10+**.
## Proof
**Savings on real agent workloads:**
| Workload | Before | After | Savings |
|-------------------------------|-------:|-------:|--------:|
| Code search (100 results) | 17,765 | 1,408 | **92%** |
| SRE incident debugging | 65,694 | 5,118 | **92%** |
| GitHub issue triage | 54,174 | 14,761 | **73%** |
| Codebase exploration | 78,502 | 41,254 | **47%** |
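
The Savings column follows directly from the Before and After token counts. A short check, using the figures from the table above (the function name `savings_percent` is just for illustration):

```python
# (before, after) token counts copied from the table above.
workloads = {
    "Code search (100 results)": (17_765, 1_408),
    "SRE incident debugging": (65_694, 5_118),
    "GitHub issue triage": (54_174, 14_761),
    "Codebase exploration": (78_502, 41_254),
}


def savings_percent(before, after):
    """Share of tokens removed, rounded to a whole percent."""
    return round(100 * (before - after) / before)


for name, (before, after) in workloads.items():
    print(f"{name}: {savings_percent(before, after)}%")
```
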
**Accuracy preserved on standard benchmarks:**
| Benchmark | Category | N | Baseline | Headroom | Delta |
|------------|----------|----:|---------:|---------:|------------|
| GSM8K | Math | 100 | 0.870 | 0.870 | **±0.000** |
| TruthfulQA | Factual | 100 | 0.530 | 0.560 | **+0.030** |
| SQuAD v2 | QA | 100 | — | **97%** | 19% compression |
| BFCL | Tools | 100 | — | **97%** | 32% compression |
Reproduce: `python -m headroom.evals suite --tier 1` · [Full benchmarks & methodology](https://headroom-docs.vercel.app/docs/benchmarks)
<a href="https://www.star-history.com/?repos=chopratejas%2Fheadroom&type=date&legend=top-left">
<picture>
<img alt="Star History Chart" src="https://api.star-history.com/chart?repos=chopratejas/headroom&type=date&legend=top-left" />
</picture>
</a>
## Agent compatibility matrix
| Agent | `headroom wrap` | Notes |
|-------------|:---------------:|----------------------------------|
| Claude Code | ✅ | `--memory` · `--code-graph` |
| Codex | ✅ | shares memory with Claude |
| Cursor | ✅ | prints config — paste once |
| Aider | ✅ | starts proxy + launches |
| Copilot CLI | ✅ | starts proxy + launches |
| OpenClaw | ✅ | installs as ContextEngine plugin |
Any OpenAI-compatible client works via `headroom proxy`. MCP-native: `headroom mcp install`.
### GitHub Copilot CLI subscription mode
Headroom can route GitHub Copilot CLI subscription traffic through the local proxy:
```bash
headroom copilot-auth login
headroom wrap copilot --subscription -- --model gpt-4o
```
This lets Headroom intercept OpenAI-compatible Copilot CLI requests and apply the same proxy compression pipeline before forwarding to GitHub Copilot's hosted API. The wrapper exchanges Headroom's reusable GitHub OAuth token for Copilot's short-lived API token and prints the upstream endpoint as `COPILOT_PROVIDER_API_URL=...` during launch.
`headroom copilot-auth login` stores a Headroom-specific Copilot OAuth token.
This avoids relying on generic GitHub or Copilot CLI tokens that can read
Copilot account metadata but may still be rejected by Copilot's token-exchange
endpoint.
For GitHub Enterprise Server or custom-domain Copilot deployments, set the
deployment domain before launching:
```bash
export GITHUB_COPILOT_ENTERPRISE_DOMAIN=ghe.example.com
```
For GitHub.com Enterprise Cloud URLs such as
`github.com/enterprises/your-enterprise`, do not set an enterprise-domain
override. Headroom uses GitHub's normal token-exchange endpoint and the Copilot
API endpoint advertised for the signed-in account.
Platform support note: macOS auth reuse via Copilot CLI Keychain storage has been smoke-tested. Windows Credential Manager, Linux Secret Service / `secret-tool`, and Docker/CI token-injection paths are implemented or planned as auth-discovery paths, but still need real OS validation before they should be considered fully vetted. For Docker and CI, prefer passing an explicit `GITHUB_COPILOT_TOKEN` or `GITHUB_COPILOT_GITHUB_TOKEN` rather than relying on host keychain access.
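
For Docker and CI, a preflight check along the following lines can confirm that an explicit token is present before launching. This is a sketch only: the two variable names come from the note above, but the lookup order and the helper name `resolve_copilot_token` are assumptions, not documented Headroom behavior.

```python
import os

# Variable names come from the platform note; the precedence order is an assumption.
TOKEN_ENV_VARS = ("GITHUB_COPILOT_TOKEN", "GITHUB_COPILOT_GITHUB_TOKEN")


def resolve_copilot_token(environ=None):
    """Return (variable name, token) for the first non-empty variable, else None."""
    env = os.environ if environ is None else environ
    for name in TOKEN_ENV_VARS:
        value = env.get(name, "").strip()
        if value:
            return name, value
    return None


if __name__ == "__main__":
    found = resolve_copilot_token()
    if found is None:
        raise SystemExit("No explicit Copilot token set; refusing to rely on a host keychain.")
    print(f"Using token from {found[0]}")
```
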
## When to use · When to skip
**Great fit if you…**
- run AI coding agents daily and want savings without changing your code
- work across multiple agents and want shared memory
- need reversible compression — originals are retrievable via CCR within the configured TTL
**Skip it if you…**
- only use a single provider's native compaction and don't need cross-agent memory
- work in a sandboxed environment where local processes can't run
<details>
<summary><b>Integrations — drop Headroom into any stack</b></summary>
| Your setup | Hook in with |
|------------------------|------------------------------------|

</details>

## What people ask about headroom

**What is chopratejas/headroom?**

chopratejas/headroom is an MCP server for the Claude AI ecosystem. It compresses tool outputs, logs, files, and RAG chunks before they reach the LLM: 60-95% fewer tokens, same answers. Library, proxy, MCP server. It has 24.9k stars on GitHub and was last updated today.

**How do I install headroom?**

You can install headroom by cloning the repository (https://github.com/chopratejas/headroom) or by following the instructions in the README on GitHub. ClaudeWave also offers quick-install blocks on this page.

**Is chopratejas/headroom safe to use?**

Our security agent analyzed chopratejas/headroom and assigned it a Trust Score of 100/100 (tier: Verified). See the full breakdown of passed checks and flags on this page.

**Who maintains chopratejas/headroom?**

chopratejas/headroom is maintained by chopratejas. The latest activity recorded on GitHub is from today, with 232 open issues.

**Are there alternatives to headroom?**

Yes. On ClaudeWave you can explore similar MCP servers under /categories/mcp, sorted by popularity or recent activity.