flatten-mcp

Name: shayaShav/flatten-mcp
Author: shayaShav

Flatten Claude Code sessions: keep every prompt and event verbatim, resume at a lower token count. An MCP server.

MCP ServersOfficial Registry0 stars0 forks● TypeScriptMITUpdated today

Install in Claude Code / Claude Desktop

Method: NPX · flatten-mcp

Claude Code CLI

claude mcp add flatten-mcp -- npx -y flatten-mcp

claude_desktop_config.json (Claude Desktop)

{
  "mcpServers": {
    "flatten-mcp": {
      "command": "npx",
      "args": ["-y", "flatten-mcp"]
    }
  }
}

1. Run the command above in your terminal (Claude Code), or paste the JSON config into claude_desktop_config.json (Claude Desktop).

2. Replace any <placeholder> values with your API keys or paths.

3. Restart Claude. The MCP server and its tools appear automatically.

Use cases

Dev Tools Creative Data & DB

About

MCP Servers overview

<p align="center">
    <img src="https://raw.githubusercontent.com/shayaShav/flatten-mcp/main/assets/logo.png" alt="flatten-mcp logo" width="160">
</p>

# flatten-mcp

> Resume the **exact same conversation** at a lower token cost — without compacting it into a lossy summary.

<p align="left">
  <a href="https://www.npmjs.com/package/flatten-mcp"><img alt="npm version" src="https://img.shields.io/npm/v/flatten-mcp.svg"></a>
  <a href="https://www.npmjs.com/package/flatten-mcp"><img alt="npm downloads" src="https://img.shields.io/npm/dm/flatten-mcp.svg"></a>
  <a href="https://github.com/shayaShav/flatten-mcp/blob/main/LICENSE"><img alt="License: MIT" src="https://img.shields.io/badge/License-MIT-green.svg"></a>
  <a href="https://nodejs.org"><img alt="Node &gt;= 18" src="https://img.shields.io/badge/node-%3E%3D18-339933?logo=nodedotjs&amp;logoColor=white"></a>
  <a href="https://modelcontextprotocol.io"><img alt="Model Context Protocol" src="https://img.shields.io/badge/MCP-server-6E56CF.svg"></a>
  <a href="https://docs.claude.com/en/docs/claude-code"><img alt="Built for Claude Code" src="https://img.shields.io/badge/built%20for-Claude%20Code-D97757.svg"></a>
  <a href="https://smithery.ai/server/@shaya-shaviv/flatten-mcp"><img alt="Smithery calls" src="https://smithery.ai/badge/@shaya-shaviv/flatten-mcp"></a>
</p>

**flatten-mcp** is a [Model Context Protocol](https://modelcontextprotocol.io) server for [Claude Code](https://docs.claude.com/en/docs/claude-code). It shrinks a session's token footprint by moving bulky tool output (large file reads, command logs, base64 screenshots) out of the conversation and into a sidecar file — leaving a tiny, retrievable reference in its place. Your prompts and the chronological flow of the session are preserved **verbatim** — those lines are never rewritten. You resume the same raw conversation; it just costs less to carry.

See how 317,236 tokens turned into 182,287:

https://github.com/user-attachments/assets/4672b3cd-f78f-4146-97ba-e0077b655381

---

## Why flatten instead of compact?

The standard answer to a full context window is **compaction**: the model reads the whole conversation and rewrites it into a shorter summary. That summary is lossy by construction — an *interpretation* of your history, and interpretations drift, smooth over the awkward parts, and quietly drop the detail you didn't know you'd need. But the history is exactly what's worth keeping verbatim: the words you typed at 2 a.m., the precise order of events, the dead ends and the decisions. A fuzzy, half-formed prompt carries more raw truth about your intent than any tidy paragraph written *about* it after the fact — and preserving it untouched is the foundation of trust in a coding agent.

**Flattening is the opposite move.** It changes *nothing* about what was said. In most sessions the model reads a lot — large files, long logs, multiple sources — and keeps every byte of it in context, even though it has nearly always already **written down the conclusion in plain prose**: the one line that mattered in a 2 MB log, the finding distilled from five files, the running tally of open tasks. The raw source has done its job. Flattening lifts those already-summarized blocks out and swaps each for a lightweight reference ID — so starting cold from a flattened session is usually smooth sailing, and on the rare occasion the raw bytes *are* needed, they're one `retrieve_flattened` call away.

```
What sits in the context window:

   USER         "fix the crash"
   ASSISTANT    reading the logs…
   TOOL_RESULT  ▓▓▓ 2 MB log dump ▓▓▓        ← bulk; already summarized in prose below
   ASSISTANT    "the OOM is at line 88,402 — the fix is …"

After flatten — same words, only the bulk set aside:

   USER         "fix the crash"
   ASSISTANT    reading the logs…
   TOOL_RESULT  [FLATTENED id=… → sidecar]   ← one marker; fetch the full dump on demand
   ASSISTANT    "the OOM is at line 88,402 — the fix is …"
```

## What you'll actually save

Token reduction depends entirely on what the session did:

- **Read-heavy sessions** (lots of large files, logs, or screenshots in context) — expect reductions **up to ~50%**.
- **Prose-heavy sessions** (little external data ingested) — savings are negligible. There's simply not much bulk to move.
- It varies a lot — often a pleasant surprise, and once in a while a touch underwhelming.

**When to reach for it.** A common point is around **200k** tokens. For critical sessions where you want the model at its sharpest and most context-aware, flattening around **250k–300k** is where the most dramatic reductions tend to show up.

**Flatten smartly**, the same way you wouldn't compact mid-way through a large reading task. That said, nothing is ever lost — flattening everything and then cherry-picking the few blocks you still need is a perfectly legitimate strategy.

---

## Quick start

> Requires **Node.js ≥ 18** and **Claude Code**.

One command — installs from [npm](https://www.npmjs.com/package/flatten-mcp) and registers it user-wide:

```bash
claude mcp add flatten -s user -- npx -y flatten-mcp@latest
```

Or register it manually (in `~/.claude.json`, or your project's `.mcp.json`):

```json
{
  "mcpServers": {
    "flatten": {
      "command": "npx",
      "args": ["-y", "flatten-mcp@latest"]
    }
  }
}
```

Recommended — install the `/flatten` slash command:

```bash
curl -fsSL https://raw.githubusercontent.com/shayaShav/flatten-mcp/main/commands/flatten.md -o ~/.claude/commands/flatten.md
```

<details>
<summary><b>From source</b> (for development)</summary>

```bash
git clone https://github.com/shayaShav/flatten-mcp.git
cd flatten-mcp
npm install      # builds automatically via the "prepare" script
cp commands/flatten.md ~/.claude/commands/   # optional: installs the /flatten command
```

Register the local build instead:

```json
{
  "mcpServers": {
    "flatten": {
      "command": "node",
      "args": ["/absolute/path/to/flatten-mcp/dist/index.js"]
    }
  }
}
```

</details>

### Uninstall

```bash
claude mcp remove flatten -s user       # unregister the server
rm -f ~/.claude/commands/flatten.md     # remove the /flatten command, if installed
```

Flatten artifacts (`.flat.jsonl` sidecars, `.bak` backups) live next to your session files and are not deleted by uninstalling. To reclaim the disk, run `prune_flatten_artifacts` (with `include_sidecars: true`) **before** unregistering — or delete them manually from `~/.claude/projects/<encoded-project-dir>/`. Mind that flattened sessions need their sidecar for `retrieve_flattened` / `unflatten_session` — unflatten first if you want the bulk back inline.

### Configuration

By default the server operates on **the project the CLI runs in** (its current working directory). Pass `project_dir` explicitly on any call to target a different project.

| Env var | Required | Purpose |
| --- | --- | --- |
| `ANTHROPIC_API_KEY` | no | If set, token savings are counted **exactly** via Anthropic's free `count_tokens` endpoint instead of estimated locally. |
| `FLATTEN_COUNT_MODEL` | no | Model id used for the exact token count (default: `claude-haiku-4-5-20251001`). |

## Usage

> [!CAUTION]
> **Always exit the session you want to flatten with `Ctrl-C`, then flatten it from a *different* window.** Rewriting a live session's file out from under Claude Code corrupts its in-memory state and bricks the session.

1. **Exit the session you want to flatten** with `Ctrl-C`. This is mandatory — a 10-second live-write guard refuses to touch a recently-modified session unless you force it, but exiting is the safe path.
2. In a **new** Claude Code window, type `/flatten latest` or `/flatten <session-id>` — or ask:
   > "Flatten the latest session."  ·  or  ·  "Flatten session `<session-id>`."

   `/flatten latest` (or bare `/flatten`) flattens the **larger** of the two most recent sessions — the smaller, seconds-old one is almost always the window doing the flattening itself, and the session worth flattening is the big one. It never forces past the live-write guard.

3. **Resume** your original session and send a prompt. When Claude starts outputting text, you'll see the token count drop.

To preview without touching anything, ask for a **dry run** first. To undo, ask to **unflatten** the session — every original block is restored to its exact original value.

> [!TIP]
> Flattening needs no model intelligence — park a second window on a fast, inexpensive model (`/model haiku`) as a dedicated flattening station and just type `/flatten latest`.

### Validate the claims yourself

Every number flatten reports can be checked end to end in a couple of minutes:

1. Pick a meaty session — or make one: have Claude read a few large files, then exit with `Ctrl-C`.
2. In a new window, ask for a **dry run** — *"dry-run flatten the latest session"* — and read the report: `flattenedCount`, `contextTokensSaved` of `contextTokensTotal`, `diskBytesSaved`. Nothing has been written yet.
3. Run `/flatten latest` for real, `claude --resume` the original session, and send any prompt — the context indicator drops by roughly the reported amount (exactly, when `ANTHROPIC_API_KEY` is set).
4. Check reversibility: ask to **unflatten** the session, then diff the restored `.jsonl` against the `.jsonl.bak` backup created at flatten time — identical for Claude Code's canonical JSON.

## Tools

| Tool | What it does |
| --- | --- |
| `flatten_session` | Move bulky tool results into a sidecar, leaving `[FLATTENED …]` markers. Crash-safe and reversible. Supports `dry_run`, `min_size`, `force`, and `include_tool_use_result`. |
| `retrieve_flattened` | Fetch one original block back by its id — returns the original text, or re-renders a flattened screenshot as a real image. |
| `unflatten_session` | Reverse a flatten completely: re-inline every block from the sidecar, restoring each flattened result to its exact original value. |
| `prune_flatten_artifacts` | Reclaim disk by deleting

Topics

anthropicclaudeclaude-codecontext-windowmcpmcp-servermodel-context-protocoltokens

Frequently asked

What people ask about flatten-mcp

What is shayaShav/flatten-mcp?

shayaShav/flatten-mcp is mcp servers for the Claude AI ecosystem. Flatten Claude Code sessions: keep every prompt and event verbatim, resume at a lower token count. An MCP server. It has 0 GitHub stars and was last updated today.

How do I install flatten-mcp?

You can install flatten-mcp by cloning the repository (https://github.com/shayaShav/flatten-mcp) or following the README instructions on GitHub. ClaudeWave also provides quick install blocks on this page.

Is shayaShav/flatten-mcp safe to use?

shayaShav/flatten-mcp has not been audited yet by our security agent. Review the original repository on GitHub before using it in production.

Who maintains shayaShav/flatten-mcp?

shayaShav/flatten-mcp is maintained by shayaShav. The last recorded GitHub activity is from today, with 0 open issues.

Are there alternatives to flatten-mcp?

Yes. On ClaudeWave you can browse similar mcp servers at /categories/mcp, sorted by popularity or recent activity.

1-click deploy

Deploy flatten-mcp to your cloud

Ship this repo to production in minutes. Each platform spins up its own environment with editable env vars.

Vercel Railway Render

Embeddable badge

Maintain this repo? Add a badge to your README

Drop the badge into your GitHub README to show it's tracked on ClaudeWave. Each badge links back to this page and reflects the live Trust Score.

Markdown (README)

[![Featured on ClaudeWave](https://claudewave.com/api/badge/shayashav-flatten-mcp)](https://claudewave.com/repo/shayashav-flatten-mcp)

HTML

<a href="https://claudewave.com/repo/shayashav-flatten-mcp"><img src="https://claudewave.com/api/badge/shayashav-flatten-mcp" alt="Featured on ClaudeWave: shayaShav/flatten-mcp" width="320" height="64" /></a>

More MCP Servers

flatten-mcp alternatives

n8n-io

n8n

today

Fair-code workflow automation platform with native AI capabilities. Combine visual building with custom code, self-host or cloud, 400+ integrations.

192.2k58.5kTypeScript

MCP ServersaiapisInstall

open-webui

today

User-friendly AI Interface (Supports Ollama, OpenAI API, ...)

141.3k20.3kPython

MCP ServersaillmInstall

google-gemini

gemini-cli

today

An open-source AI agent that brings the power of Gemini directly into your terminal.

105.2k14kTypeScript

MCP Serversaiai-agentsInstall

netdata

today

The fastest path to AI-powered full stack observability, even for lean teams.

79.1k6.4kC

MCP ServersaialertingInstall

D4Vinci

Scrapling

5d ago

🕷️ An adaptive Web Scraping framework that handles everything from a single request to a full-scale crawl!

63.3k6.2kPython

MCP Serversaiai-scrapingInstall

sansan0

TrendRadar

today

⭐AI-driven public opinion & trend monitor with multi-platform aggregation, RSS, and smart alerts.🎯 告别信息过载，你的 AI 舆情监控助手与热点筛选工具！聚合多平台热点 + RSS 订阅，支持关键词精准筛选。AI 智能筛选新闻 + AI 翻译 + AI 分析简报直推手机，也支持接入 MCP 架构，赋能 AI 自然语言对话分析、情感洞察与趋势预测等。支持 Docker ，数据本地/云端自持。集成微信/飞书/钉钉/Telegram/邮件/ntfy/bark/slack 等渠道智能推送。

59.4k24.6kPython

MCP ServersaibarkInstall