Local-first MCP memory for AI agents — the searchable corpus (library). Pairs with mcp-memory-rs for curated state.

# mcp-vl-msa-rs

A searchable long-term memory for AI agents, exposed as an MCP stdio server.
Index documents, notes and past conversations into collections; retrieve the
top-k relevant chunks for a query and inject the original text back to the
model; add or drop agent memories with `msa_remember` / `msa_forget`. Pure
Rust, BM25 over [tantivy](https://github.com/quickwit-oss/tantivy), zero ML
deps in the default build; optional in-process dense rerank.

Any MCP client (Claude Code, Codex, or anything speaking MCP stdio) gets the
same memory: a queryable corpus that survives across sessions and model swaps,
with no cloud account and no embedding service required. Use it to give an
agent durable recall over a knowledge base, a docs tree, or its own chat
history — retrieval that returns the original text, not just embeddings.

It is one half of a two-part memory: this server is the **library** (corpus
recall), and its companion [mcp-memory-rs](https://github.com/DioNanos/mcp-memory-rs)
is the **notebook** (curated state). An agent that swaps models loses neither.

```mermaid
flowchart LR
    A["AI agent<br/>(any MCP client)"]
    A -->|"curated state<br/>read / write / sync"| M["mcp-memory-rs<br/><i>the notebook</i>"]
    A -->|"corpus recall<br/>index / search / fetch"| V["mcp-vl-msa-rs<br/><i>the library</i>"]
    M --- D1[("JSON categories<br/>SQLite FTS5")]
    V --- D2[("tantivy BM25<br/>collections")]
```

**The name**: `msa` is the retrieval pattern borrowed from the Memory Sparse
Attention paper (arXiv:2603.23516). This server is an *extrinsic* approximation
of that pattern, not the neural model, and is distinct from MiniMax's
MSA-architecture LLMs, which are *intrinsic* (in-model) generators. `vl` stands
for Vivling (`codex-vl`), its first adopter; the server itself is fully
AI-agnostic and depends on nothing from it.

**Status**: v0.4 — hybrid sparse+dense optional.

## Why

The original [Memory Sparse Attention](https://arxiv.org/abs/2603.23516) paper (EverMind-AI) describes an end-to-end trainable sparse attention layer over chunk-pooled KV caches. That is a neural artifact and is not portable to a pure-Rust MCP server. What *is* portable, and what this repo aims to deliver, is the MSA macro pattern:

1. **Chunked storage** of long-form text with a small fixed pool size (`P=64` words by default, mirroring the paper).
2. **Top-k sparse routing** over chunks (BM25 surrogate; learned routing is out of scope).
3. **Original text injection** (paper §4.3; the ablation reports -37.1% without it): `msa_search` returns chunks, `msa_fetch_doc` returns the full document.
4. **Memory Interleave** as a *protocol* (shipped in v0.3 as `msa_search_iterative`): the AI client orchestrates multi-hop retrieval through repeated tool calls with a server-side cursor.
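
The chunked-storage step in item 1 can be sketched as below. This is a minimal illustration, assuming chunks are cut on whitespace-separated words; `chunk_words` and its parameters are hypothetical names chosen to mirror the `chunk_size` / `overlap` config keys, not the server's actual chunker.

```rust
/// Split `text` into chunks of at most `chunk_size` whitespace-separated
/// words, sharing `overlap` words between consecutive chunks.
/// Hypothetical helper for illustration only.
fn chunk_words(text: &str, chunk_size: usize, overlap: usize) -> Vec<String> {
    assert!(chunk_size > 0, "chunk_size must be positive");
    assert!(overlap < chunk_size, "overlap must be smaller than chunk_size");

    let words: Vec<&str> = text.split_whitespace().collect();
    let step = chunk_size - overlap; // how far the window advances each time
    let mut chunks = Vec::new();
    let mut start = 0;

    while start < words.len() {
        let end = (start + chunk_size).min(words.len());
        chunks.push(words[start..end].join(" "));
        if end == words.len() {
            break; // the last window reached the end of the text
        }
        start += step;
    }
    chunks
}
```

With the defaults described above (`chunk_size = 64`, `overlap = 0`), a 130-word document yields three chunks of 64, 64 and 2 words.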

Design and rationale are documented in the project notes (negative results, gate methodology); see [`docs/NEGATIVE_RESULTS.md`](docs/NEGATIVE_RESULTS.md).

## Tool surface

| Tool | Since | Description |
|---|---|---|
| `msa_index` | v0.1 | Index a document; existing chunks for `doc_id` are replaced. |
| `msa_search` | v0.1 | Top-k chunks, score normalized 0.0–1.0. |
| `msa_fetch_doc` | v0.1 | Full original text of a document. |
| `msa_delete` | v0.1 | Remove a document and all its chunks. |
| `msa_list_collections` | v0.1 | Collections open in the registry. |
| `msa_stats` | v0.1 | Per-collection statistics (exact `num_documents` / `total_tokens`). |
| `SearchFilter` | v0.2 | Metadata filter (`where_eq`/`where_in`/`created_*`), post-retrieval. |
| `msa_search_iterative` | v0.3 | Memory Interleave with server-side cursor; dedups across rounds. |
| `msa_drop_session` | v0.3 | Force-evict a Memory Interleave session before TTL. |
| `dense_alpha` on `msa_search` | v0.4 | Hybrid BM25 + cosine rerank. Requires `--features embeddings` + `[embeddings]` config. |
| `msa_remember` / `msa_forget` | v0.4 | Agent-memory surface: enrich + low-signal gate + content-hash dedup; standard metadata (`kind` / `source_id` / `created_at`). |
| `msa_sync_path` | v0.4 | Mirror a directory into a collection (filesystem source; blake3 delta sync). |
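
The cross-round deduplication that `msa_search_iterative` is described as doing can be pictured with the sketch below. It is an illustration under assumptions, not the server's code: `InterleaveSession`, `next_round` and the `(chunk_id, score)` tuple shape are hypothetical, and TTL-based eviction is left out.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a server-side Memory Interleave session.
struct InterleaveSession {
    /// Chunk ids already returned to the client in earlier rounds.
    seen: HashSet<String>,
}

impl InterleaveSession {
    fn new() -> Self {
        Self { seen: HashSet::new() }
    }

    /// Takes this round's hits, already ranked best-first, and returns up to
    /// `k` hits that were not returned in any previous round. Returned ids
    /// are recorded so later rounds skip them.
    fn next_round(&mut self, ranked: &[(String, f32)], k: usize) -> Vec<(String, f32)> {
        let mut fresh = Vec::new();
        for (id, score) in ranked {
            if fresh.len() == k {
                break;
            }
            // `insert` returns true only when the id was not seen before.
            if self.seen.insert(id.clone()) {
                fresh.push((id.clone(), *score));
            }
        }
        fresh
    }
}
```

Keeping the seen-set on the server means the client only has to carry a session handle between rounds, rather than resending every chunk id it has already received.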

## Install

**Prebuilt binary** (recommended) — download the archive for your platform from
the [latest release](https://github.com/DioNanos/mcp-vl-msa-rs/releases/latest),
extract, and point your MCP client at the binary:

```bash
tar xzf mcp-vl-msa-rs-x86_64-unknown-linux-gnu.tar.gz
install -m755 mcp-vl-msa-rs-*/mcp-vl-msa-rs ~/.local/bin/
```

Prebuilt targets (Linux + Android): `x86_64-unknown-linux-gnu`,
`x86_64-unknown-linux-musl`, `aarch64-unknown-linux-gnu`,
`aarch64-unknown-linux-musl` (edge / ARM / Termux), `aarch64-linux-android`.

**macOS**: no prebuilt binary is shipped (it would need Apple code-signing).
Install from source instead — `cargo install` below compiles it on your Mac in
one command, no signing needed.

**From source** (Rust toolchain): pass `--locked`, because the workspace
`Cargo.lock` pins a working `time` / `tantivy-common` resolution and a fresh
resolve breaks the build. The package name is `mcp-msa-server`; the binary it
installs is `mcp-vl-msa-rs`.

```bash
cargo install --git https://github.com/DioNanos/mcp-vl-msa-rs \
  --locked --features source-fs mcp-msa-server
```

## Build & test

```bash
cd mcp-vl-msa-rs

# Default: pure BM25, zero network deps
cargo build --release
cargo test

# Hybrid sparse + dense (in-process Candle rerank, no external service)
cargo build --release --features embeddings
cargo test  --features embeddings
```

### Hybrid mode config

Add `[embeddings]` to `MCP_MSA_CONFIG` to activate dense rerank. Without
this section the server stays in BM25-only mode even when the binary was
built with `--features embeddings`.

The production backend is `candle-modernbert`: the encoder runs **in-process**
(Candle), offline-deterministic, from a local model bundle — no daemon, no
network at runtime, no automatic downloads. Prepare the bundle once with
`scripts/prepare-granite-r2-97m.sh`.

```toml
[storage]
storage_dir = "~/.local/state/mcp-vl-msa-rs"

[chunking]
chunk_size = 64
overlap = 0

[embeddings]
backend   = "candle-modernbert"
model_dir = "~/.local/share/mcp-vl-msa-rs/models/granite-r2-97m"
dim       = 768
model_id  = "granite-r2-97m"
```

A transitional `backend = "ollama"` (HTTP to an Ollama-compatible service)
still exists but is **deprecated and scheduled for removal in v0.6** — do not
build new setups on it.

The AI client opts into hybrid scoring per-call by passing `dense_alpha`
to `msa_search` (or any future tool that supports it). `dense_alpha = 1.0`
(default) is BM25-only; `0.0` is dense-only; intermediate values are a
linear blend `α·bm25 + (1-α)·((cos+1)/2)`. Cosine is shifted to `[0,1]`
so it composes linearly with the already max-normalized BM25 score.
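
The blend formula above can be written out as a short sketch. The function names are hypothetical. Two details are assumptions rather than documented behavior: "max-normalized" is read as dividing each raw score by the top score, and `dense_alpha` is clamped to `[0, 1]`. Note that, per the formula, `dense_alpha` is the weight on the BM25 term.

```rust
/// Max-normalize raw BM25 scores into [0, 1] by dividing by the top score.
/// Assumption: this is what "max-normalized" means here.
fn max_normalize(raw: &[f32]) -> Vec<f32> {
    let max = raw.iter().cloned().fold(0.0_f32, f32::max);
    if max <= 0.0 {
        return vec![0.0; raw.len()];
    }
    raw.iter().map(|s| s / max).collect()
}

/// Linear blend: alpha * bm25 + (1 - alpha) * ((cos + 1) / 2).
/// `bm25_norm` is expected in [0, 1], `cosine` in [-1, 1].
fn hybrid_score(bm25_norm: f32, cosine: f32, dense_alpha: f32) -> f32 {
    let alpha = dense_alpha.clamp(0.0, 1.0); // clamping is an assumption
    let dense = (cosine + 1.0) / 2.0; // shift cosine from [-1, 1] to [0, 1]
    alpha * bm25_norm + (1.0 - alpha) * dense
}
```

For example, a chunk with normalized BM25 score 0.5 and cosine 0.0 scores 0.5 at `dense_alpha = 0.5`, since the shifted cosine is also 0.5.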

## Run as MCP stdio

```bash
# Default storage: ~/.local/state/mcp-vl-msa-rs/
./target/release/mcp-vl-msa-rs

# With explicit config
MCP_VL_MSA_CONFIG=~/.config/mcp-vl-msa-rs/config.toml \
MCP_DEVICE=my-node \
./target/release/mcp-vl-msa-rs
```

Example `~/.codex/config.toml` entry:

```toml
[mcp_servers.vl_msa]
command = "/path/to/mcp-vl-msa-rs/target/release/mcp-vl-msa-rs"
env = { MCP_DEVICE = "my-node" }
# let the model call tools without a per-call approval prompt
default_tools_approval_mode = "approve"
```

Equivalent `~/.claude.json` entry for Claude Code:

```json
{
  "mcpServers": {
    "vl_msa": {
      "command": "/path/to/mcp-vl-msa-rs/target/release/mcp-vl-msa-rs",
      "env": { "MCP_DEVICE": "my-node" }
    }
  }
}
```

### AI client compatibility

- Clients with partial MCP support may not surface the server's `instructions`
  text. The tool descriptions and request-field descriptions are self-contained,
  so a model can work from those alone.
- Read-only tools (`msa_search`, `msa_fetch_doc`, `msa_stats`,
  `msa_list_collections`, `msa_manifest`, `msa_search_iterative`,
  `msa_interleave_round`) carry the `readOnlyHint` annotation, which lets a
  gating client auto-approve them.
- If a model reports an "unsupported call" or "user cancelled" on Codex, that is
  the approval gate, not a server fault; set `default_tools_approval_mode`
  (above) so tool calls are not blocked on a prompt.

## Storage layout

```
~/.local/state/mcp-vl-msa-rs/
├── <collection_a>/        ← tantivy index directory
├── <collection_b>/
└── ...
```

Each collection is an independent tantivy index. Collection names are validated
(rejected if they contain path separators, `..`, etc.) so a collection cannot
escape the root.
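
A name check of this kind might look like the sketch below. It is illustrative only: `is_valid_collection_name` is a hypothetical function, and the exact rule set (empty names, path separators, `..`, NUL bytes) is an assumption based on the description above, not the server's actual validator.

```rust
/// Returns true if `name` is safe to use as a single directory name under
/// the storage root. Hypothetical rules for illustration.
fn is_valid_collection_name(name: &str) -> bool {
    // Reject empty names, the current-directory name, and any `..` sequence.
    if name.is_empty() || name == "." || name.contains("..") {
        return false;
    }
    // Reject path separators (Unix and Windows) and NUL bytes.
    !name.chars().any(|c| c == '/' || c == '\\' || c == '\0')
}
```

A stricter alternative would be an allow-list (for instance ASCII letters, digits, `-` and `_` only), which is easier to reason about than a deny-list when the storage root is shared.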

## Roadmap

Shipped:

- **v0.2** — `SearchFilter` (where_eq / where_in / created range), post-retrieval.
- **v0.3** — `msa_search_iterative` Memory Interleave with server-side cursor + TTL'd `MsaSession` registry.
- **v0.4** — hybrid BM25 + dense rerank behind feature flag `embeddings`, Ollama backend, per-call `dense_alpha`; agent-memory surface (`msa_remember` / `msa_forget`); filesystem source metadata (`created_at` / `source` / `ext` / `dir`) at index time; exact `num_documents` / `total_tokens` in `msa_stats`; `msa-bench` reproducible benchmark crate; prebuilt-binary packaging.

Next (not yet built):

- Query-time tantivy filter (today `SearchFilter` runs post-retrieval; fine for
  normal corpora, but a pre-filter would help when selectivity is high on a very
  large index).
- ACL for multi-tenant collections.
- Tool-description tuning.

## Related work

- **MSA paper** ([arXiv:2603.23516](https://arxiv.org/abs/2603.23516)) — the
  architectural inspiration (neural, intrinsic); this repo is an extrinsic,
  pure-Rust approximation of the macro pattern.
- **Vivling** (in `codex-vl`) — the first downstream consumer: this server is
  its long-term memory.
- **[mcp-memory-rs](https://github.com/DioNanos/mcp-memory-rs)** — the companion
  server for *curated* agent state (named JSON categories, per-device ACL,
  fleet sync). This server does corpus recall; together they cover both halves
  of agent memory: the curated notebook and the queryable library.

## License

Apache-2.0. See [LICENSE](./LICENSE).