The highest-scoring AI memory system ever benchmarked. And it's free.
MCP Servers45.6k stars5.9k forks● PythonMITUpdated today
ClaudeWave Trust Score
77/100
Raw-storage AI memory system using ChromaDB with hierarchical Palace metaphor for retrieval.
Passed
- ✓Open-source license (MIT)
- ✓Actively maintained (<30d)
- ✓Healthy fork ratio
- ✓Clear description
- ✓Topics declared
- ✓Documented (README)
Flags
- !Brand-new repo with thousands of stars (suspicious)
- !README contains suspicious pattern: eval\s*\(
Use with caution
Last scanned: 4/14/2026
Install in Claude Desktop
Method detected: pip / Python · mempalace
{
"mcpServers": {
"mempalace": {
"command": "python",
"args": ["-m", "mempalace.mcp_server"]
}
}
}1. Copy the snippet above.
2. Paste into
~/Library/Application Support/Claude/claude_desktop_config.json (Mac) or %APPDATA%\Claude\claude_desktop_config.json (Windows).3. Replace any
<placeholder> values with your API keys or paths.4. Restart Claude Desktop. The MCP server appears automatically.
💡 Install first: pip install mempalace
Use cases
🧠 AI / ML🛠️ Dev Tools🗄️ Data & DB
About
MCP Servers overview
<div align="center">
<img src="assets/mempalace_logo.png" alt="MemPalace" width="280">
# MemPalace
### The highest-scoring AI memory system ever benchmarked. And it's free.
<br>
Every conversation you have with an AI — every decision, every debugging session, every architecture debate — disappears when the session ends. Six months of work, gone. You start over every time.
Other memory systems try to fix this by letting AI decide what's worth remembering. It extracts "user prefers Postgres" and throws away the conversation where you explained *why*. MemPalace takes a different approach: **store everything, then make it findable.**
**The Palace** — Ancient Greek orators memorized entire speeches by placing ideas in rooms of an imaginary building. Walk through the building, find the idea. MemPalace applies the same principle to AI memory: your conversations are organized into wings (people and projects), halls (types of memory), and rooms (specific ideas). No AI decides what matters — you keep every word, and the structure gives you a navigable map instead of a flat search index.
**Raw verbatim storage** — MemPalace stores your actual exchanges in ChromaDB without summarization or extraction. The 96.6% LongMemEval result comes from this raw mode. We don't burn an LLM to decide what's "worth remembering" — we keep everything and let semantic search find it.
**AAAK (experimental)** — A lossy abbreviation dialect for packing repeated entities into fewer tokens at scale. Readable by any LLM that reads text — Claude, GPT, Gemini, Llama, Mistral — no decoder needed. **AAAK is a separate compression layer, not the storage default**, and on the LongMemEval benchmark it currently regresses vs raw mode (84.2% vs 96.6%). We're iterating. See the [note above](#a-note-from-milla--ben--april-7-2026) for the honest status.
**Local, open, adaptable** — MemPalace runs entirely on your machine, on any data you have locally, without using any external API or services. It has been tested on conversations — but it can be adapted for different types of datastores. This is why we're open-sourcing it.
<br>
[![][version-shield]][release-link]
[![][python-shield]][python-link]
[![][license-shield]][license-link]
[![][discord-shield]][discord-link]
<br>
[Quick Start](#quick-start) · [The Palace](#the-palace) · [AAAK Dialect](#aaak-dialect-experimental) · [Benchmarks](#benchmarks) · [MCP Tools](#mcp-server)
<br>
### Highest LongMemEval score ever published — free or paid.
<table>
<tr>
<td align="center"><strong>96.6%</strong><br><sub>LongMemEval R@5<br><b>raw mode</b>, zero API calls</sub></td>
<td align="center"><strong>500/500</strong><br><sub>questions tested<br>independently reproduced</sub></td>
<td align="center"><strong>$0</strong><br><sub>No subscription<br>No cloud. Local only.</sub></td>
</tr>
</table>
<sub>Reproducible — runners in <a href="benchmarks/">benchmarks/</a>. <a href="benchmarks/BENCHMARKS.md">Full results</a>. The 96.6% is from <b>raw verbatim mode</b>, not AAAK or rooms mode (those score lower — see <a href="#a-note-from-milla--ben--april-7-2026">note above</a>).</sub>
</div>
---
## A Note from Milla & Ben — April 7, 2026
> The community caught real problems in this README within hours of launch and we want to address them directly.
>
> **What we got wrong:**
>
> - **The AAAK token example was incorrect.** We used a rough heuristic (`len(text)//3`) for token counts instead of an actual tokenizer. Real counts via OpenAI's tokenizer: the English example is 66 tokens, the AAAK example is 73. AAAK does not save tokens at small scales — it's designed for *repeated entities at scale*, and the README example was a bad demonstration of that. We're rewriting it.
>
> - **"30x lossless compression" was overstated.** AAAK is a lossy abbreviation system (entity codes, sentence truncation). Independent benchmarks show AAAK mode scores **84.2% R@5 vs raw mode's 96.6%** on LongMemEval — a 12.4 point regression. The honest framing is: AAAK is an experimental compression layer that trades fidelity for token density, and **the 96.6% headline number is from RAW mode, not AAAK**.
>
> - **"+34% palace boost" was misleading.** That number compares unfiltered search to wing+room metadata filtering. Metadata filtering is a standard ChromaDB feature, not a novel retrieval mechanism. Real and useful, but not a moat.
>
> - **"Contradiction detection"** exists as a separate utility (`fact_checker.py`) but is not currently wired into the knowledge graph operations as the README implied.
>
> - **"100% with Haiku rerank"** is real (we have the result files) but the rerank pipeline is not in the public benchmark scripts. We're adding it.
>
> **What's still true and reproducible:**
>
> - **96.6% R@5 on LongMemEval in raw mode**, on 500 questions, zero API calls — independently reproduced on M2 Ultra in under 5 minutes by [@gizmax](https://github.com/MemPalace/mempalace/issues/39).
> - Local, free, no subscription, no cloud, no data leaving your machine.
> - The architecture (wings, rooms, closets, drawers) is real and useful, even if it's not a magical retrieval boost.
>
> **What we're doing:**
>
> 1. Rewriting the AAAK example with real tokenizer counts and a scenario where AAAK actually demonstrates compression
> 2. Adding `mode raw / aaak / rooms` clearly to the benchmark documentation so the trade-offs are visible
> 3. Wiring `fact_checker.py` into the KG ops so the contradiction detection claim becomes true
> 4. Pinning ChromaDB to a tested range (Issue #100), fixing the shell injection in hooks (#110), and addressing the macOS ARM64 segfault (#74)
>
> **Thank you to everyone who poked holes in this.** Brutal honest criticism is exactly what makes open source work, and it's what we asked for. Special thanks to [@panuhorsmalahti](https://github.com/MemPalace/mempalace/issues/43), [@lhl](https://github.com/MemPalace/mempalace/issues/27), [@gizmax](https://github.com/MemPalace/mempalace/issues/39), and everyone who filed an issue or a PR in the first 48 hours. We're listening, we're fixing, and we'd rather be right than impressive.
>
> — *Milla Jovovich & Ben Sigman*
---
## An important follow up note regarding fake MemPalace websites - April 11, 2026
Several Community Members (#267, #326, #506) have pointed out there are fake MemPalace websites popping up, including ones with Malware.
To be super clear, MemPalace *has no website* (at least for now), so anything claiming to be one is false.
Thanks to our Community Members for letting us know about the problem.
Stay safe out there.
---
## Quick Start
```bash
pip install mempalace
# Set up your world — who you work with, what your projects are
mempalace init ~/projects/myapp
# Mine your data
mempalace mine ~/projects/myapp # projects — code, docs, notes
mempalace mine ~/chats/ --mode convos # convos — Claude, ChatGPT, Slack exports
mempalace mine ~/chats/ --mode convos --extract general # general — classifies into decisions, milestones, problems
# Search anything you've ever discussed
mempalace search "why did we switch to GraphQL"
# Your AI remembers
mempalace status
```
Three mining modes: **projects** (code and docs), **convos** (conversation exports), and **general** (auto-classifies into decisions, preferences, milestones, problems, and emotional context). Everything stays on your machine.
---
## How You Actually Use It
After the one-time setup (install → init → mine), you don't run MemPalace commands manually. Your AI uses it for you. There are two ways, depending on which AI you use.
### With Claude Code (recommended)
Native marketplace install:
```bash
claude plugin marketplace add MemPalace/mempalace
claude plugin install --scope user mempalace
```
Restart Claude Code, then type `/skills` to verify "mempalace" appears.
### With Claude, ChatGPT, Cursor, Gemini (MCP-compatible tools)
```bash
# Connect MemPalace once
claude mcp add mempalace -- python -m mempalace.mcp_server
```
Now your AI has 29 tools available through MCP. Ask it anything:
> *"What did we decide about auth last month?"*
Claude calls `mempalace_search` automatically, gets verbatim results, and answers you. You never type `mempalace search` again. The AI handles it.
MemPalace also works natively with **Gemini CLI** (which handles the server and save hooks automatically) — see the [Gemini CLI Integration Guide](examples/gemini_cli_setup.md).
### With local models (Llama, Mistral, or any offline LLM)
Local models generally don't speak MCP yet. Two approaches:
**1. Wake-up command** — load your world into the model's context:
```bash
mempalace wake-up > context.txt
# Paste context.txt into your local model's system prompt
```
This gives your local model ~600-900 tokens of critical facts (in AAAK if you prefer) before you ask a single question.
**2. CLI search** — query on demand, feed results into your prompt:
```bash
mempalace search "auth decisions" > results.txt
# Include results.txt in your prompt
```
Or use the Python API:
```python
from mempalace.searcher import search_memories
results = search_memories("auth decisions", palace_path="~/.mempalace/palace")
# Inject into your local model's context
```
Either way — your entire memory stack runs offline. ChromaDB on your machine, Llama on your machine, AAAK for compression, zero cloud calls.
---
## The Problem
Decisions happen in conversations now. Not in docs. Not in Jira. In conversations with Claude, ChatGPT, Copilot. The reasoning, the tradeoffs, the "we tried X and it failed because Y" — all trapped in chat windows that evaporate when the session ends.
**Six months of daily AI use = 19.5 million tokens.** That's every decision, every debugging session, every architecture debate. Gone.
| Approach | Tokens loaded | Annual cost |
|----------|--------------|-------------|
| Paste everything | 19.5M — doesn't fit any context window | Impossible |
| LLM summaries | ~650K | ~$507/yr Topics
aichromadbllmmcpmemorypython
Related
More MCP Servers
n8n-io
n8n
✓95
Fair-code workflow automation platform with native AI capabilities. Combine visual building with custom code, self-host or cloud, 400+ integrations.
184k56.8kTypeScript· today
MCP Serversaiapis
open-webui
open-webui
✓89
User-friendly AI Interface (Supports Ollama, OpenAI API, ...)
131.8k18.7kPython· today
MCP Serversaillm
google-gemini
gemini-cli
✓98
An open-source AI agent that brings the power of Gemini directly into your terminal.
101.2k13.1kTypeScript· today
MCP Serversaiai-agents
punkpeye
awesome-mcp-servers
✓87
A collection of MCP servers.
84.8k9.1k· today
MCP Serversaimcp
netdata
netdata
✓97
The fastest path to AI-powered full stack observability, even for lean teams.
78.4k6.4kC· today
MCP Serversaialerting
Mintplex-Labs
anything-llm
✓93
The all-in-one AI productivity accelerator. On device and privacy first with no annoying setup or configuration.
58.3k6.3kJavaScript· today
MCP Serversai-agentscustom-ai-agents