Flatten Claude Code sessions: keep every prompt and event verbatim, resume at a lower token count. An MCP server.

<p align="center">
    <img src="https://raw.githubusercontent.com/shayaShav/flatten-mcp/main/assets/logo.png" alt="flatten-mcp logo" width="160">
</p>

# flatten-mcp

> Resume the **exact same conversation** at a lower token cost — without compacting it into a lossy summary.

<p align="left">
  <a href="https://www.npmjs.com/package/flatten-mcp"><img alt="npm version" src="https://img.shields.io/npm/v/flatten-mcp.svg"></a>
  <a href="https://www.npmjs.com/package/flatten-mcp"><img alt="npm downloads" src="https://img.shields.io/npm/dm/flatten-mcp.svg"></a>
  <a href="https://github.com/shayaShav/flatten-mcp/blob/main/LICENSE"><img alt="License: MIT" src="https://img.shields.io/badge/License-MIT-green.svg"></a>
  <a href="https://nodejs.org"><img alt="Node &gt;= 18" src="https://img.shields.io/badge/node-%3E%3D18-339933?logo=nodedotjs&amp;logoColor=white"></a>
  <a href="https://modelcontextprotocol.io"><img alt="Model Context Protocol" src="https://img.shields.io/badge/MCP-server-6E56CF.svg"></a>
  <a href="https://docs.claude.com/en/docs/claude-code"><img alt="Built for Claude Code" src="https://img.shields.io/badge/built%20for-Claude%20Code-D97757.svg"></a>
  <a href="https://smithery.ai/server/@shaya-shaviv/flatten-mcp"><img alt="Smithery calls" src="https://smithery.ai/badge/@shaya-shaviv/flatten-mcp"></a>
</p>

**flatten-mcp** is a [Model Context Protocol](https://modelcontextprotocol.io) server for [Claude Code](https://docs.claude.com/en/docs/claude-code). It shrinks a session's token footprint by moving bulky tool output (large file reads, command logs, base64 screenshots) out of the conversation and into a sidecar file — leaving a tiny, retrievable reference in its place. Your prompts and the chronological flow of the session are preserved **verbatim** — those lines are never rewritten. You resume the same raw conversation; it just costs less to carry.

See how 317,236 tokens turned into 182,287:

https://github.com/user-attachments/assets/4672b3cd-f78f-4146-97ba-e0077b655381

---

## Why flatten instead of compact?

The standard answer to a full context window is **compaction**: the model reads the whole conversation and rewrites it into a shorter summary. That summary is lossy by construction — an *interpretation* of your history, and interpretations drift, smooth over the awkward parts, and quietly drop the detail you didn't know you'd need. But the history is exactly what's worth keeping verbatim: the words you typed at 2 a.m., the precise order of events, the dead ends and the decisions. A fuzzy, half-formed prompt carries more raw truth about your intent than any tidy paragraph written *about* it after the fact — and preserving it untouched is the foundation of trust in a coding agent.

**Flattening is the opposite move.** It changes *nothing* about what was said. In most sessions the model reads a lot — large files, long logs, multiple sources — and keeps every byte of it in context, even though it has nearly always already **written down the conclusion in plain prose**: the one line that mattered in a 2 MB log, the finding distilled from five files, the running tally of open tasks. The raw source has done its job. Flattening lifts those already-summarized blocks out and swaps each for a lightweight reference ID — so starting cold from a flattened session is usually smooth sailing, and on the rare occasion the raw bytes *are* needed, they're one `retrieve_flattened` call away.

```
What sits in the context window:

   USER         "fix the crash"
   ASSISTANT    reading the logs…
   TOOL_RESULT  ▓▓▓ 2 MB log dump ▓▓▓        ← bulk; already summarized in prose below
   ASSISTANT    "the OOM is at line 88,402 — the fix is …"

After flatten — same words, only the bulk set aside:

   USER         "fix the crash"
   ASSISTANT    reading the logs…
   TOOL_RESULT  [FLATTENED id=… → sidecar]   ← one marker; fetch the full dump on demand
   ASSISTANT    "the OOM is at line 88,402 — the fix is …"
```
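The swap pictured above can be sketched in a few lines of TypeScript. This is only an illustration of the idea, not flatten-mcp's implementation: the `Entry` shape, the `blk-N` id scheme, the exact marker text, and the function names are all assumptions made for the example.

```typescript
// Hypothetical session shape; the real session file format is not shown in this README.
interface Entry {
  role: "user" | "assistant" | "tool_result";
  text: string;
}

interface FlattenResult {
  entries: Entry[];             // same order and length as the input
  sidecar: Map<string, string>; // id -> original tool output
}

function flattenEntries(entries: Entry[], minSize: number): FlattenResult {
  const sidecar = new Map<string, string>();
  const out = entries.map((entry, index) => {
    // Prompts, assistant prose, and small tool results are left untouched.
    if (entry.role !== "tool_result" || entry.text.length < minSize) {
      return entry;
    }
    const id = `blk-${index}`; // illustrative id scheme
    sidecar.set(id, entry.text);
    return { ...entry, text: `[FLATTENED id=${id}]` };
  });
  return { entries: out, sidecar };
}

function unflattenEntries(result: FlattenResult): Entry[] {
  const marker = /^\[FLATTENED id=([^\]]+)\]$/;
  return result.entries.map((entry) => {
    const match = entry.role === "tool_result" ? marker.exec(entry.text) : null;
    if (!match) return entry;
    const original = result.sidecar.get(match[1] ?? "");
    // Leave the marker in place if the sidecar has no such block.
    return original === undefined ? entry : { ...entry, text: original };
  });
}
```

Because only `tool_result` entries above the size threshold are replaced, the user and assistant lines come out of `flattenEntries` exactly as they went in, and `unflattenEntries` restores the rest from the sidecar.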

## What you'll actually save

Token reduction depends entirely on what the session did:

- **Read-heavy sessions** (lots of large files, logs, or screenshots in context) — expect reductions **up to ~50%**.
- **Prose-heavy sessions** (little external data ingested) — savings are negligible. There's simply not much bulk to move.
- Results vary a lot from session to session: often a pleasant surprise, occasionally underwhelming.
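To put a number on "reduction", here is the arithmetic for the example session quoted earlier (317,236 tokens down to 182,287). The helper name `reductionPercent` is made up for this illustration.

```typescript
// Percentage of the original token count that was removed.
function reductionPercent(before: number, after: number): number {
  if (before <= 0) throw new RangeError("before must be positive");
  return ((before - after) / before) * 100;
}

const before = 317236;
const after = 182287;
const saved = before - after; // 134949 tokens
const percent = reductionPercent(before, after);

console.log(`${saved} tokens saved (${percent.toFixed(1)}%)`);
// prints "134949 tokens saved (42.5%)"
```

That session lands a little under the ~50% ceiling quoted for read-heavy work.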

**When to reach for it.** A common trigger point is around **200k** tokens. The most dramatic reductions tend to show up when flattening at around **250k–300k**, which matters most for critical sessions where you want the model at its sharpest and most context-aware.

**Flatten smartly.** Just as you wouldn't compact midway through a large reading task, avoid flattening while the model still needs the raw material in front of it. That said, nothing is ever lost: flattening everything and then retrieving the few blocks you still need is a perfectly legitimate strategy.

---

## Quick start

> Requires **Node.js ≥ 18** and **Claude Code**.

One command — installs from [npm](https://www.npmjs.com/package/flatten-mcp) and registers it user-wide:

```bash
claude mcp add flatten -s user -- npx -y flatten-mcp@latest
```

Or register it manually (in `~/.claude.json`, or your project's `.mcp.json`):

```json
{
  "mcpServers": {
    "flatten": {
      "command": "npx",
      "args": ["-y", "flatten-mcp@latest"]
    }
  }
}
```

Recommended — install the `/flatten` slash command:

```bash
mkdir -p ~/.claude/commands   # curl -o does not create missing directories
curl -fsSL https://raw.githubusercontent.com/shayaShav/flatten-mcp/main/commands/flatten.md -o ~/.claude/commands/flatten.md
```

<details>
<summary><b>From source</b> (for development)</summary>

```bash
git clone https://github.com/shayaShav/flatten-mcp.git
cd flatten-mcp
npm install      # builds automatically via the "prepare" script
mkdir -p ~/.claude/commands && cp commands/flatten.md ~/.claude/commands/   # optional: installs the /flatten command
```

Register the local build instead:

```json
{
  "mcpServers": {
    "flatten": {
      "command": "node",
      "args": ["/absolute/path/to/flatten-mcp/dist/index.js"]
    }
  }
}
```

</details>

### Uninstall

```bash
claude mcp remove flatten -s user       # unregister the server
rm -f ~/.claude/commands/flatten.md     # remove the /flatten command, if installed
```

Flatten artifacts (`.flat.jsonl` sidecars, `.bak` backups) live next to your session files and are not deleted by uninstalling. To reclaim the disk space, run `prune_flatten_artifacts` (with `include_sidecars: true`) **before** unregistering, or delete them manually from `~/.claude/projects/<encoded-project-dir>/`. Note that a flattened session needs its sidecar for `retrieve_flattened` and `unflatten_session`, so unflatten first if you want the bulk back inline.

### Configuration

By default the server operates on **the project the CLI runs in** (its current working directory). Pass `project_dir` explicitly on any call to target a different project.

| Env var | Required | Purpose |
| --- | --- | --- |
| `ANTHROPIC_API_KEY` | no | If set, token savings are counted **exactly** via Anthropic's free `count_tokens` endpoint instead of estimated locally. |
| `FLATTEN_COUNT_MODEL` | no | Model id used for the exact token count (default: `claude-haiku-4-5-20251001`). |

## Usage

> [!CAUTION]
> **Always exit the session you want to flatten with `Ctrl-C`, then flatten it from a *different* window.** Rewriting a live session's file out from under Claude Code corrupts its in-memory state and bricks the session.

1. **Exit the session you want to flatten** with `Ctrl-C`. This is mandatory: a 10-second live-write guard refuses to touch a recently modified session unless you force it, but exiting is the safe path.
2. In a **new** Claude Code window, type `/flatten latest` or `/flatten <session-id>` — or ask:
   > "Flatten the latest session."  ·  or  ·  "Flatten session `<session-id>`."

   `/flatten latest` (or bare `/flatten`) flattens the **larger** of the two most recent sessions — the smaller, seconds-old one is almost always the window doing the flattening itself, and the session worth flattening is the big one. It never forces past the live-write guard.

3. **Resume** your original session and send a prompt. When Claude starts outputting text, you'll see the token count drop.

To preview without touching anything, ask for a **dry run** first. To undo, ask to **unflatten** the session — every original block is restored to its exact original value.
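The two rules from the steps above, the 10-second live-write guard and the "larger of the two most recent sessions" choice behind `/flatten latest`, can be sketched as follows. The `SessionFile` shape and both function names are hypothetical, and treating exactly 10 seconds of quiet as safe is an assumption.

```typescript
interface SessionFile {
  id: string;
  sizeBytes: number;
  modifiedMs: number; // last-modified time, in ms since the epoch
}

const LIVE_WRITE_GUARD_MS = 10000; // the 10-second window described above

// True when the session has been quiet long enough, or the caller forces it.
function guardAllows(session: SessionFile, nowMs: number, force = false): boolean {
  const quietForMs = nowMs - session.modifiedMs;
  return force || quietForMs >= LIVE_WRITE_GUARD_MS;
}

// Of the two most recently modified sessions, pick the larger one.
function pickLatest(sessions: SessionFile[]): SessionFile | undefined {
  const twoMostRecent = [...sessions]
    .sort((a, b) => b.modifiedMs - a.modifiedMs)
    .slice(0, 2);
  return twoMostRecent.sort((a, b) => b.sizeBytes - a.sizeBytes)[0];
}
```

In this sketch the tiny, seconds-old session belonging to the flattening window loses to the big one on size, and it would be rejected by the guard anyway because `/flatten latest` never passes `force`.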

> [!TIP]
> Flattening needs no model intelligence — park a second window on a fast, inexpensive model (`/model haiku`) as a dedicated flattening station and just type `/flatten latest`.

### Validate the claims yourself

Every number flatten reports can be checked end to end in a couple of minutes:

1. Pick a meaty session — or make one: have Claude read a few large files, then exit with `Ctrl-C`.
2. In a new window, ask for a **dry run** — *"dry-run flatten the latest session"* — and read the report: `flattenedCount`, `contextTokensSaved` of `contextTokensTotal`, `diskBytesSaved`. Nothing has been written yet.
3. Run `/flatten latest` for real, `claude --resume` the original session, and send any prompt — the context indicator drops by roughly the reported amount (exactly, when `ANTHROPIC_API_KEY` is set).
4. Check reversibility: ask to **unflatten** the session, then diff the restored `.jsonl` against the `.jsonl.bak` backup created at flatten time. The two should be identical, provided the file was written in Claude Code's canonical JSON.
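For the reversibility check in the last step, a plain `diff` is enough. If you would rather script it, a line-by-line comparison along these lines works on the two files' contents; `firstDifferingLine` is a hypothetical helper, not part of flatten-mcp.

```typescript
// Returns the 1-based number of the first differing line,
// or null when the two texts match exactly.
function firstDifferingLine(restored: string, backup: string): number | null {
  if (restored === backup) return null;
  const a = restored.split("\n");
  const b = backup.split("\n");
  const shared = Math.min(a.length, b.length);
  for (let i = 0; i < shared; i++) {
    if (a[i] !== b[i]) return i + 1;
  }
  // One text is a prefix of the other: the difference starts on the first extra line.
  return shared + 1;
}
```

Feed it the restored `.jsonl` and the `.jsonl.bak` read as UTF-8 strings. A `null` result means the restore matched the backup exactly, and any number points at the first session line to inspect.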

## Tools

| Tool | What it does |
| --- | --- |
| `flatten_session` | Move bulky tool results into a sidecar, leaving `[FLATTENED …]` markers. Crash-safe and reversible. Supports `dry_run`, `min_size`, `force`, and `include_tool_use_result`. |
| `retrieve_flattened` | Fetch one original block back by its id — returns the original text, or re-renders a flattened screenshot as a real image. |
| `unflatten_session` | Reverse a flatten completely: re-inline every block from the sidecar, restoring each flattened result to its exact original value. |
| `prune_flatten_artifacts` | Reclaim disk by deleting … |
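In Claude Code you normally just ask for these tools in plain language, but on the wire an MCP client invokes them with a JSON-RPC `tools/call` request. The sketch below builds such a request for a dry run. The argument names come from the table above; their types, and the `toolCall` helper itself, are assumptions for illustration.

```typescript
// Option names as listed for flatten_session; the types are assumed.
type FlattenSessionArgs = {
  dry_run?: boolean;
  min_size?: number;
  force?: boolean;
  include_tool_use_result?: boolean;
};

interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Wraps a tool name and its arguments in an MCP tools/call envelope.
function toolCall(id: number, name: string, args: Record<string, unknown>): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

const dryRunArgs: FlattenSessionArgs = { dry_run: true };
const dryRunRequest = toolCall(1, "flatten_session", dryRunArgs);

console.log(JSON.stringify(dryRunRequest, null, 2));
```

Leaving `force` unset keeps the live-write guard in effect, which is the safe default for anything scripted.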

What people ask about flatten-mcp

What is shayaShav/flatten-mcp?

shayaShav/flatten-mcp is an MCP server for the Claude AI ecosystem. It flattens Claude Code sessions: every prompt and event is kept verbatim, and the session resumes at a lower token count. It has 0 GitHub stars and was last updated today.

How do I install flatten-mcp?

You can install flatten-mcp by cloning the repository (https://github.com/shayaShav/flatten-mcp) or by following the README instructions on GitHub. ClaudeWave also provides quick install blocks on this page.

Is shayaShav/flatten-mcp safe to use?

shayaShav/flatten-mcp has not yet been audited by our security agent. Review the original repository on GitHub before using it in production.

Who maintains shayaShav/flatten-mcp?

shayaShav/flatten-mcp is maintained by shayaShav. The last recorded GitHub activity is from today, with 0 open issues.

Are there alternatives to flatten-mcp?

Yes. On ClaudeWave you can browse similar MCP servers at /categories/mcp, sorted by popularity or recent activity.
