Compress tool outputs, logs, files, and RAG chunks before they reach the LLM. 60-95% fewer tokens, same answers. Library, proxy, MCP server.
Headroom is a Python and TypeScript library that sits between an AI agent and an LLM provider, compressing tool outputs, logs, RAG chunks, files, and conversation history before they reach the model. It ships in three deployment modes: an inline `compress()` function, a drop-in HTTP proxy launched with `headroom proxy --port 8787`, and an MCP server exposing three tools (`headroom_compress`, `headroom_retrieve`, and `headroom_stats`) compatible with Claude Code, Claude Desktop, Cursor, and other MCP clients. The compression pipeline routes content through three specialized algorithms: SmartCrusher for JSON, CodeCompressor for AST-based code, and a locally hosted fine-tuned model called Kompress-base for prose. A CacheAligner stabilizes prompt prefixes to improve provider-side KV cache hit rates. Originals are stored locally under the Compressed Context Retrieval (CCR) system, allowing the LLM to fetch them on demand. The `headroom learn` command mines failed agent sessions and writes corrections directly to `CLAUDE.md` or `AGENTS.md`. Benchmarks on real workloads show token reductions between 47% and 92%, with a code search example shrinking from 17,765 tokens to 1,408.
- ✓Open-source license (Apache-2.0)
- ✓Actively maintained (<30d)
- ✓Healthy fork ratio
- ✓Clear description
- ✓Topics declared
- ✓Documented (README)
claude mcp add headroom -- python -m headroom{
"mcpServers": {
"headroom": {
"command": "python",
"args": ["-m", "headroom.evals"],
"env": {
"COPILOT_PROVIDER_API_URL": "<copilot_provider_api_url>"
}
}
}
}COPILOT_PROVIDER_API_URLMCP Servers overview
<div align="center"><pre>
██╗ ██╗███████╗ █████╗ ██████╗ ██████╗ ██████╗ ██████╗ ███╗ ███╗
██║ ██║██╔════╝██╔══██╗██╔══██╗██╔══██╗██╔═══██╗██╔═══██╗████╗ ████║
███████║█████╗ ███████║██║ ██║██████╔╝██║ ██║██║ ██║██╔████╔██║
██╔══██║██╔══╝ ██╔══██║██║ ██║██╔══██╗██║ ██║██║ ██║██║╚██╔╝██║
██║ ██║███████╗██║ ██║██████╔╝██║ ██║╚██████╔╝╚██████╔╝██║ ╚═╝ ██║
╚═╝ ╚═╝╚══════╝╚═╝ ╚═╝╚═════╝ ╚═╝ ╚═╝ ╚═════╝ ╚═════╝ ╚═╝ ╚═╝
The context compression layer for AI agents
</pre></div>
<p align="center"><strong>60–95% fewer tokens · library · proxy · MCP · 6 algorithms · local-first · reversible</strong></p>
<p align="center">
<a href="https://github.com/chopratejas/headroom/actions/workflows/ci.yml"><img src="https://github.com/chopratejas/headroom/actions/workflows/ci.yml/badge.svg" alt="CI"></a>
<a href="https://app.codecov.io/gh/chopratejas/headroom"><img src="https://codecov.io/gh/chopratejas/headroom/graph/badge.svg" alt="codecov"></a>
<a href="https://pypi.org/project/headroom-ai/"><img src="https://img.shields.io/pypi/v/headroom-ai.svg" alt="PyPI"></a>
<a href="https://www.npmjs.com/package/headroom-ai"><img src="https://img.shields.io/npm/v/headroom-ai.svg" alt="npm"></a>
<a href="https://huggingface.co/chopratejas/kompress-v2-base"><img src="https://img.shields.io/badge/model-Kompress--v2--base-yellow.svg" alt="Model: Kompress-v2-base"></a>
<a href="LICENSE"><img src="https://img.shields.io/badge/license-Apache%202.0-blue.svg" alt="License: Apache 2.0"></a>
<a href="https://headroom-docs.vercel.app/docs"><img src="https://img.shields.io/badge/docs-online-blue.svg" alt="Docs"></a>
</p>
<p align="center">
<a href="https://headroom-docs.vercel.app/docs">Docs</a> ·
<a href="#get-started-60-seconds">Install</a> ·
<a href="#proof">Proof</a> ·
<a href="#agent-compatibility-matrix">Agents</a> ·
<a href="https://discord.gg/yRmaUNpsPJ">Discord</a> ·
<a href="llms.txt">llms.txt</a> ·
<a href="ENTERPRISE.md">Enterprise</a>
</p>
<p align="center"><sub>
<b>AI agents / LLMs:</b> read <a href="llms.txt"><code>/llms.txt</code></a> here, or fetch <a href="https://headroom-docs.vercel.app/llms.txt">the live index</a> / <a href="https://headroom-docs.vercel.app/llms-full.txt">full docs blob</a>.
</sub></p>
---
<p align="center"><a href="https://trendshift.io/repositories/20881" target="_blank"><img src="https://trendshift.io/api/badge/repositories/20881" alt="chopratejas%2Fheadroom | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a></p>
Headroom compresses everything your AI agent reads — tool outputs, logs, RAG chunks, files, and conversation history — before it reaches the LLM. Same answers, fraction of the tokens.
<p align="center">
<img src="HeadroomDemo-Fast.gif" alt="Headroom in action" width="820">
<br/><sub>Live: 10,144 → 1,260 tokens — same FATAL found.</sub>
</p>
## What it does
- **Library** — `compress(messages)` in Python or TypeScript, inline in any app
- **Proxy** — `headroom proxy --port 8787`, zero code changes, any language
- **Agent wrap** — `headroom wrap claude|codex|cursor|aider|copilot` in one command
- **MCP server** — `headroom_compress`, `headroom_retrieve`, `headroom_stats` for any MCP client
- **Cross-agent memory** — shared store across Claude, Codex, Gemini, auto-dedup
- **`headroom learn`** — mines failed sessions, writes corrections to `CLAUDE.md` / `AGENTS.md`
- **Reversible (CCR)** — originals are cached for retrieval on demand
## How it works (30 seconds)
```
Your agent / app
(Claude Code, Cursor, Codex, LangChain, Agno, Strands, your own code…)
│ prompts · tool outputs · logs · RAG results · files
▼
┌────────────────────────────────────────────────────┐
│ Headroom (runs locally — your data stays here) │
│ ──────────────────────────────────────────────── │
│ CacheAligner → ContentRouter → CCR │
│ ├─ SmartCrusher (JSON) │
│ ├─ CodeCompressor (AST) │
│ └─ Kompress-base (text, HF) │
│ │
│ Cross-agent memory · headroom learn · MCP │
└────────────────────────────────────────────────────┘
│ compressed prompt + retrieval tool
▼
LLM provider (Anthropic · OpenAI · Bedrock · …)
```
- **ContentRouter** — detects content type, selects the right compressor
- **SmartCrusher / CodeCompressor / Kompress-base** — compress JSON, AST, or prose
- **CacheAligner** — stabilizes prefixes so provider KV caches actually hit
- **CCR** — stores originals locally; LLM calls `headroom_retrieve` if it needs them
→ [Architecture](https://headroom-docs.vercel.app/docs/architecture) · [CCR reversible compression](https://headroom-docs.vercel.app/docs/ccr) · [Kompress-v2-base model card](https://huggingface.co/chopratejas/kompress-v2-base)
## Get started (60 seconds)
```bash
# 1 — Install
pip install "headroom-ai[all]" # Python
npm install headroom-ai # Node / TypeScript
# 2 — Pick your mode
headroom wrap claude # wrap a coding agent
headroom proxy --port 8787 # drop-in proxy, zero code changes
# or: from headroom import compress # inline library
# 3 — See the savings
headroom perf
```
Granular extras: `[proxy]`, `[mcp]`, `[ml]`, `[code]`, `[memory]`, `[relevance]`, `[image]`, `[agno]`, `[langchain]`, `[evals]`, `[pytorch-mps]` (Apple-GPU memory-embedder offload — set `HEADROOM_EMBEDDER_RUNTIME=pytorch_mps`). Requires **Python 3.10+**.
## Proof
**Savings on real agent workloads:**
| Workload | Before | After | Savings |
|-------------------------------|-------:|-------:|--------:|
| Code search (100 results) | 17,765 | 1,408 | **92%** |
| SRE incident debugging | 65,694 | 5,118 | **92%** |
| GitHub issue triage | 54,174 | 14,761 | **73%** |
| Codebase exploration | 78,502 | 41,254 | **47%** |
**Accuracy preserved on standard benchmarks:**
| Benchmark | Category | N | Baseline | Headroom | Delta |
|------------|----------|----:|---------:|---------:|------------|
| GSM8K | Math | 100 | 0.870 | 0.870 | **±0.000** |
| TruthfulQA | Factual | 100 | 0.530 | 0.560 | **+0.030** |
| SQuAD v2 | QA | 100 | — | **97%** | 19% compression |
| BFCL | Tools | 100 | — | **97%** | 32% compression |
Reproduce: `python -m headroom.evals suite --tier 1` · [Full benchmarks & methodology](https://headroom-docs.vercel.app/docs/benchmarks)
<a href="https://www.star-history.com/?repos=chopratejas%2Fheadroom&type=date&legend=top-left">
<picture>
<img alt="Star History Chart" src="https://api.star-history.com/chart?repos=chopratejas/headroom&type=date&legend=top-left" />
</picture>
</a>
## Agent compatibility matrix
| Agent | `headroom wrap` | Notes |
|-------------|:---------------:|----------------------------------|
| Claude Code | ✅ | `--memory` · `--code-graph` |
| Codex | ✅ | shares memory with Claude |
| Cursor | ✅ | prints config — paste once |
| Aider | ✅ | starts proxy + launches |
| Copilot CLI | ✅ | starts proxy + launches |
| OpenClaw | ✅ | installs as ContextEngine plugin |
Any OpenAI-compatible client works via `headroom proxy`. MCP-native: `headroom mcp install`.
### GitHub Copilot CLI subscription mode
Headroom can route GitHub Copilot CLI subscription traffic through the local proxy:
```bash
headroom copilot-auth login
headroom wrap copilot --subscription -- --model gpt-4o
```
This lets Headroom intercept OpenAI-compatible Copilot CLI requests and apply the same proxy compression pipeline before forwarding to GitHub Copilot's hosted API. The wrapper exchanges Headroom's reusable GitHub OAuth token for Copilot's short-lived API token and prints the upstream endpoint as `COPILOT_PROVIDER_API_URL=...` during launch.
`headroom copilot-auth login` stores a Headroom-specific Copilot OAuth token.
This avoids relying on generic GitHub or Copilot CLI tokens that can read
Copilot account metadata but may still be rejected by Copilot's token-exchange
endpoint.
For GitHub Enterprise Server or custom-domain Copilot deployments, set the
deployment domain before launching:
```bash
export GITHUB_COPILOT_ENTERPRISE_DOMAIN=ghe.example.com
```
For GitHub.com Enterprise Cloud URLs such as
`github.com/enterprises/your-enterprise`, do not set an enterprise-domain
override. Headroom uses GitHub's normal token-exchange endpoint and the Copilot
API endpoint advertised for the signed-in account.
Platform support note: macOS auth reuse via Copilot CLI Keychain storage has been smoke-tested. Windows Credential Manager, Linux Secret Service / `secret-tool`, and Docker/CI token-injection paths are implemented or planned as auth-discovery paths, but still need real OS validation before they should be considered fully vetted. For Docker and CI, prefer passing an explicit `GITHUB_COPILOT_TOKEN` or `GITHUB_COPILOT_GITHUB_TOKEN` rather than relying on host keychain access.
## When to use · When to skip
**Great fit if you…**
- run AI coding agents daily and want savings without changing your code
- work across multiple agents and want shared memory
- need reversible compression — originals are retrievable via CCR within the configured TTL
**Skip it if you…**
- only use a single provider's native compaction and don't need cross-agent memory
- work in a sandboxed environment where local processes can't run
<details>
<summary><b>Integrations — drop Headroom into any stack</b></summary>
| Your setup | Hook in with |
|------------------------|------------------------------------What people ask about headroom
What is chopratejas/headroom?
+
chopratejas/headroom is mcp servers for the Claude AI ecosystem. Compress tool outputs, logs, files, and RAG chunks before they reach the LLM. 60-95% fewer tokens, same answers. Library, proxy, MCP server. It has 24.9k GitHub stars and was last updated today.
How do I install headroom?
+
You can install headroom by cloning the repository (https://github.com/chopratejas/headroom) or following the README instructions on GitHub. ClaudeWave also provides quick install blocks on this page.
Is chopratejas/headroom safe to use?
+
Our security agent has analyzed chopratejas/headroom and assigned a Trust Score of 100/100 (tier: Verified). See the full breakdown of passed checks and flags on this page.
Who maintains chopratejas/headroom?
+
chopratejas/headroom is maintained by chopratejas. The last recorded GitHub activity is from today, with 232 open issues.
Are there alternatives to headroom?
+
Yes. On ClaudeWave you can browse similar mcp servers at /categories/mcp, sorted by popularity or recent activity.
Deploy headroom to your cloud
Ship this repo to production in minutes. Each platform spins up its own environment with editable env vars.
Maintain this repo? Add a badge to your README
Drop the badge into your GitHub README to show it's tracked on ClaudeWave. Each badge links back to this page and reflects the live Trust Score.
[](https://claudewave.com/repo/chopratejas-headroom)<a href="https://claudewave.com/repo/chopratejas-headroom"><img src="https://claudewave.com/api/badge/chopratejas-headroom" alt="Featured on ClaudeWave: chopratejas/headroom" width="320" height="64" /></a>More MCP Servers
Fair-code workflow automation platform with native AI capabilities. Combine visual building with custom code, self-host or cloud, 400+ integrations.
User-friendly AI Interface (Supports Ollama, OpenAI API, ...)
An open-source AI agent that brings the power of Gemini directly into your terminal.
The fastest path to AI-powered full stack observability, even for lean teams.
🕷️ An adaptive Web Scraping framework that handles everything from a single request to a full-scale crawl!
⭐AI-driven public opinion & trend monitor with multi-platform aggregation, RSS, and smart alerts.🎯 告别信息过载,你的 AI 舆情监控助手与热点筛选工具!聚合多平台热点 + RSS 订阅,支持关键词精准筛选。AI 智能筛选新闻 + AI 翻译 + AI 分析简报直推手机,也支持接入 MCP 架构,赋能 AI 自然语言对话分析、情感洞察与趋势预测等。支持 Docker ,数据本地/云端自持。集成微信/飞书/钉钉/Telegram/邮件/ntfy/bark/slack 等渠道智能推送。