mcp-vl-msa-rs

Name: DioNanos/mcp-vl-msa-rs
Author: DioNanos

Local-first MCP memory for AI agents — the searchable corpus (library). Pairs with mcp-memory-rs for curated state.




# mcp-vl-msa-rs

A searchable long-term memory for AI agents, exposed as an MCP stdio server.
Index documents, notes and past conversations into collections; retrieve the
top-k relevant chunks for a query and inject the original text back to the
model; add or drop agent memories with `msa_remember` / `msa_forget`. Pure
Rust, BM25 over [tantivy](https://github.com/quickwit-oss/tantivy), zero ML
deps in the default build; optional in-process dense rerank.

Any MCP client (Claude Code, Codex, or anything speaking MCP stdio) gets the
same memory: a queryable corpus that survives across sessions and model swaps,
with no cloud account and no embedding service required. Use it to give an
agent durable recall over a knowledge base, a docs tree, or its own chat
history — retrieval that returns the original text, not just embeddings.

It is one half of a two-part memory: this server is the **library** (corpus
recall), its companion [mcp-memory-rs](https://github.com/DioNanos/mcp-memory-rs)
is the **notebook** (curated state). An agent that swaps models loses neither.

```mermaid
flowchart LR
    A["AI agent<br/>(any MCP client)"]
    A -->|"curated state<br/>read / write / sync"| M["mcp-memory-rs<br/><i>the notebook</i>"]
    A -->|"corpus recall<br/>index / search / fetch"| V["mcp-vl-msa-rs<br/><i>the library</i>"]
    M --- D1[("JSON categories<br/>SQLite FTS5")]
    V --- D2[("tantivy BM25<br/>collections")]
```

**The name**: `msa` is the retrieval pattern it borrows from the Memory Sparse
Attention paper (arXiv:2603.23516) — an *extrinsic* approximation, not the
neural model; distinct from MiniMax's MSA-architecture LLMs, which are
*intrinsic* (in-model) generators. `vl` is for Vivling (`codex-vl`), its first
adopter — but the server is fully AI-agnostic and depends on nothing from it.

**Status**: v0.4 — hybrid sparse+dense optional.

## Why

The original [Memory Sparse Attention](https://arxiv.org/abs/2603.23516) paper (EverMind-AI) describes an end-to-end trainable sparse attention layer over chunk-pooled KV caches. That is a neural artifact and is not portable to a pure-Rust MCP server. What *is* portable, and what this repo aims to deliver, is the MSA macro pattern:

1. **Chunked storage** of long-form text with a small fixed pool size (`P=64` words by default, mirroring the paper).
2. **Top-k sparse routing** over chunks (BM25 surrogate; learned routing is out of scope).
3. **Original text injection** (paper §4.3, ablation -37.1% without): `msa_search` returns chunks, `msa_fetch_doc` returns the full document.
4. **Memory Interleave** as a *protocol* (shipped in v0.3 as `msa_search_iterative`): the AI client orchestrates multi-hop retrieval through repeated tool calls with a server-side cursor.
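
The chunked-storage step in the list above can be sketched as a plain word-window splitter. This is a minimal illustration, assuming whitespace tokenization; `chunk_words` is a hypothetical helper, not the server's actual chunker, and its `chunk_size` / `overlap` parameters only borrow their names from the `[chunking]` config keys.

```rust
/// Split `text` into chunks of at most `chunk_size` whitespace-separated
/// words, sharing `overlap` words between consecutive chunks.
/// Hypothetical helper: the real chunker may tokenize differently.
pub fn chunk_words(text: &str, chunk_size: usize, overlap: usize) -> Vec<String> {
    assert!(chunk_size > 0, "chunk_size must be positive");
    assert!(overlap < chunk_size, "overlap must be smaller than chunk_size");

    let words: Vec<&str> = text.split_whitespace().collect();
    // Each new chunk starts `step` words after the previous one.
    let step = chunk_size - overlap;
    let mut chunks = Vec::new();
    let mut start = 0;
    while start < words.len() {
        let end = usize::min(start + chunk_size, words.len());
        chunks.push(words[start..end].join(" "));
        if end == words.len() {
            break; // the final (possibly short) chunk has been emitted
        }
        start += step;
    }
    chunks
}
```

With the defaults (`chunk_size = 64`, `overlap = 0`), a 130-word document would yield three chunks of 64, 64 and 2 words.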

Design and rationale are documented in the project notes (negative results, gate methodology); see [`docs/NEGATIVE_RESULTS.md`](docs/NEGATIVE_RESULTS.md).

## Tool surface

| Tool | Since | Description |
|---|---|---|
| `msa_index` | v0.1 | Index a document; existing chunks for `doc_id` are replaced. |
| `msa_search` | v0.1 | Top-k chunks, score normalized 0.0–1.0. |
| `msa_fetch_doc` | v0.1 | Full original text of a document. |
| `msa_delete` | v0.1 | Remove a document and all its chunks. |
| `msa_list_collections` | v0.1 | Collections open in the registry. |
| `msa_stats` | v0.1 | Per-collection statistics (exact `num_documents` / `total_tokens`). |
| `SearchFilter` | v0.2 | Metadata filter (`where_eq`/`where_in`/`created_*`), post-retrieval. |
| `msa_search_iterative` | v0.3 | Memory Interleave with server-side cursor; dedups across rounds. |
| `msa_drop_session` | v0.3 | Force-evict a Memory Interleave session before TTL. |
| `dense_alpha` on `msa_search` | v0.4 | Hybrid BM25 + cosine rerank. Requires `--features embeddings` + `[embeddings]` config. |
| `msa_remember` / `msa_forget` | v0.4 | Agent-memory surface: enrich + low-signal gate + content-hash dedup; standard metadata (`kind` / `source_id` / `created_at`). |
| `msa_sync_path` | v0.4 | Mirror a directory into a collection (filesystem source; blake3 delta sync). |
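
The `msa_search_iterative` row describes a server-side cursor that dedups across rounds, and `msa_drop_session` implies sessions otherwise expire on a TTL. The sketch below shows one way such a cursor could work, under stated assumptions: `Session`, `next_round` and `is_expired` are hypothetical names, and chunk ids are assumed to be strings. It is not the server's `MsaSession` implementation.

```rust
use std::collections::HashSet;
use std::time::{Duration, Instant};

/// Hypothetical per-session cursor: remembers which chunk ids were already
/// returned so that later rounds only surface new material.
pub struct Session {
    seen: HashSet<String>,
    last_used: Instant,
    ttl: Duration,
}

impl Session {
    pub fn new(ttl: Duration) -> Self {
        Session { seen: HashSet::new(), last_used: Instant::now(), ttl }
    }

    /// True once the session has been idle for longer than its TTL.
    pub fn is_expired(&self, now: Instant) -> bool {
        now.duration_since(self.last_used) > self.ttl
    }

    /// Take one round's ranked hits (best first), skip ids returned in
    /// earlier rounds, keep at most `k`, and record what was returned.
    pub fn next_round(&mut self, ranked_ids: &[&str], k: usize) -> Vec<String> {
        self.last_used = Instant::now();
        let mut fresh = Vec::new();
        for id in ranked_ids {
            if fresh.len() == k {
                break;
            }
            // `insert` returns false when the id was already present.
            if self.seen.insert((*id).to_string()) {
                fresh.push((*id).to_string());
            }
        }
        fresh
    }
}
```

Because ids are recorded only when they are actually returned, a hit that was cut off by `k` in one round can still surface in a later round.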

## Install

**Prebuilt binary** (recommended) — download the archive for your platform from
the [latest release](https://github.com/DioNanos/mcp-vl-msa-rs/releases/latest),
extract, and point your MCP client at the binary:

```bash
tar xzf mcp-vl-msa-rs-x86_64-unknown-linux-gnu.tar.gz
install -m755 mcp-vl-msa-rs-*/mcp-vl-msa-rs ~/.local/bin/
```

Prebuilt targets (Linux + Android): `x86_64-unknown-linux-gnu`,
`x86_64-unknown-linux-musl`, `aarch64-unknown-linux-gnu`,
`aarch64-unknown-linux-musl` (edge / ARM / Termux), `aarch64-linux-android`.

**macOS**: no prebuilt binary is shipped (it would need Apple code-signing).
Install from source instead — `cargo install` below compiles it on your Mac in
one command, no signing needed.

**From source** (Rust toolchain) — `--locked` is required (the workspace
`Cargo.lock` pins a working `time` / `tantivy-common` resolution; a fresh
resolve breaks the build), and `mcp-msa-server` is the package name (the
binary it installs is `mcp-vl-msa-rs`):

```bash
cargo install --git https://github.com/DioNanos/mcp-vl-msa-rs \
  --locked --features source-fs mcp-msa-server
```

## Build & test

```bash
cd mcp-vl-msa-rs

# Default: pure BM25, zero network deps
cargo build --release
cargo test

# Hybrid sparse + dense (in-process Candle rerank, no external service)
cargo build --release --features embeddings
cargo test  --features embeddings
```

### Hybrid mode config

Add an `[embeddings]` section to the config file that `MCP_MSA_CONFIG` points
to in order to activate dense rerank. Without this section the server stays in
BM25-only mode even when the binary was built with `--features embeddings`.

The production backend is `candle-modernbert`: the encoder runs **in-process**
(Candle), offline-deterministic, from a local model bundle — no daemon, no
network at runtime, no automatic downloads. Prepare the bundle once with
`scripts/prepare-granite-r2-97m.sh`.

```toml
[storage]
storage_dir = "~/.local/state/mcp-vl-msa-rs"

[chunking]
chunk_size = 64
overlap = 0

[embeddings]
backend   = "candle-modernbert"
model_dir = "~/.local/share/mcp-vl-msa-rs/models/granite-r2-97m"
dim       = 768
model_id  = "granite-r2-97m"
```

A transitional `backend = "ollama"` (HTTP to an Ollama-compatible service)
still exists but is **deprecated and scheduled for removal in v0.6** — do not
build new setups on it.

The AI client opts into hybrid scoring per call by passing `dense_alpha`
to `msa_search` (or any future tool that supports it). Despite the name, α
weights the BM25 term: `dense_alpha = 1.0` (default) is BM25-only, `0.0` is
dense-only, and intermediate values give the linear blend
`α·bm25 + (1-α)·((cos+1)/2)`. Cosine is shifted to `[0,1]`
so it composes linearly with the already max-normalized BM25 score.
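
The blend can be written out directly from the formula above. This is a sketch only: `max_normalize` and `hybrid_score` are hypothetical names, and clamping α to `[0, 1]` is an assumption rather than documented server behaviour.

```rust
/// Scale raw BM25 scores into [0, 1] by dividing by the best score.
/// Returns all zeros when no score is positive.
pub fn max_normalize(scores: &[f32]) -> Vec<f32> {
    let max = scores.iter().cloned().fold(0.0_f32, f32::max);
    if max <= 0.0 {
        return vec![0.0; scores.len()];
    }
    scores.iter().map(|s| s / max).collect()
}

/// Blend per the formula in the text:
/// alpha * bm25 + (1 - alpha) * ((cos + 1) / 2).
pub fn hybrid_score(alpha: f32, bm25_norm: f32, cosine: f32) -> f32 {
    // Assumption: out-of-range alpha values are clamped, not rejected.
    let alpha = alpha.clamp(0.0, 1.0);
    // Shift cosine similarity from [-1, 1] to [0, 1].
    let dense = (cosine + 1.0) / 2.0;
    alpha * bm25_norm + (1.0 - alpha) * dense
}
```

For example, with `alpha = 0.5`, a normalized BM25 score of `0.8` and a cosine of `0.0`, the blend is `0.5·0.8 + 0.5·0.5 = 0.65`.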

## Run as MCP stdio

```bash
# Default storage: ~/.local/state/mcp-vl-msa-rs/
./target/release/mcp-vl-msa-rs

# With explicit config
MCP_VL_MSA_CONFIG=~/.config/mcp-vl-msa-rs/config.toml \
MCP_DEVICE=my-node \
./target/release/mcp-vl-msa-rs
```

Example `~/.codex/config.toml` entry:

```toml
[mcp_servers.vl_msa]
command = "/path/to/mcp-vl-msa-rs/target/release/mcp-vl-msa-rs"
env = { MCP_DEVICE = "my-node" }
# let the model call tools without a per-call approval prompt
default_tools_approval_mode = "approve"
```

Equivalent `~/.claude.json` entry for Claude Code:

```json
{
  "mcpServers": {
    "vl_msa": {
      "command": "/path/to/mcp-vl-msa-rs/target/release/mcp-vl-msa-rs",
      "env": { "MCP_DEVICE": "my-node" }
    }
  }
}
```

### AI client compatibility

- Clients with partial MCP support may not surface the server's `instructions`
  text. The tool descriptions and request-field descriptions are self-contained,
  so a model can work from those alone.
- Read-only tools (`msa_search`, `msa_fetch_doc`, `msa_stats`,
  `msa_list_collections`, `msa_manifest`, `msa_search_iterative`,
  `msa_interleave_round`) carry the `readOnlyHint` annotation, which lets a
  gating client auto-approve them.
- If a model reports an "unsupported call" or "user cancelled" on Codex, that is
  the approval gate, not a server fault. Set `default_tools_approval_mode`
  (above) so tool calls are not blocked on a prompt.

## Storage layout

```
~/.local/state/mcp-vl-msa-rs/
├── <collection_a>/        ← tantivy index directory
├── <collection_b>/
└── ...
```

Each collection is an independent tantivy index. Collection names are validated
(rejected if they contain path separators, `..`, etc.) so a collection cannot
escape the root.
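
A validator of this kind could look like the sketch below. The exact rule set is an assumption: the README only says that path separators and `..` are rejected, and `is_valid_collection_name` / `collection_dir` are hypothetical names. The sketch uses a conservative character allow-list so that a name is always a single path component.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical validator: accept only names that form a single, plain path
/// component. The server's actual rules may be stricter or differ.
pub fn is_valid_collection_name(name: &str) -> bool {
    if name.is_empty() || name == "." || name.contains("..") {
        return false;
    }
    // Allow-list: this excludes '/', '\\', NUL and whitespace by construction.
    name.chars()
        .all(|c| c.is_ascii_alphanumeric() || c == '_' || c == '-' || c == '.')
}

/// Resolve a collection directory under `root`, or `None` if the name is
/// rejected. A valid name cannot climb out of `root`.
pub fn collection_dir(root: &Path, name: &str) -> Option<PathBuf> {
    if is_valid_collection_name(name) {
        Some(root.join(name))
    } else {
        None
    }
}
```

An allow-list is used here rather than a deny-list because it fails closed: any character not explicitly permitted is rejected.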

## Roadmap

Shipped:

- **v0.2** — `SearchFilter` (where_eq / where_in / created range), post-retrieval.
- **v0.3** — `msa_search_iterative` Memory Interleave with server-side cursor + TTL'd `MsaSession` registry.
- **v0.4** — hybrid BM25 + dense rerank behind feature flag `embeddings`, Ollama backend, per-call `dense_alpha`; agent-memory surface (`msa_remember` / `msa_forget`); filesystem source metadata (`created_at` / `source` / `ext` / `dir`) at index time; exact `num_documents` / `total_tokens` in `msa_stats`; `msa-bench` reproducible benchmark crate; prebuilt-binary packaging.

Next (not yet built):

- Query-time tantivy filter (today `SearchFilter` runs post-retrieval; fine for
  normal corpora, but a pre-filter would help when selectivity is high on a very
  large index).
- ACL for multi-tenant collections.
- Tool-description tuning.
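
To make the first item above concrete, here is a sketch of what post-retrieval filtering means: the index has already chosen its top-k, and the metadata predicate can only remove hits from that fixed set. The `Hit` shape and the `where_eq` function are hypothetical illustrations (only the filter name `where_eq` and the `kind` metadata key come from this README).

```rust
use std::collections::HashMap;

/// Hypothetical hit shape: a chunk id, its score, and string metadata.
pub struct Hit {
    pub id: String,
    pub score: f32,
    pub meta: HashMap<String, String>,
}

/// Post-retrieval equality filter: keep only hits whose metadata `key`
/// equals `value`. Hits lacking the key are dropped.
pub fn where_eq(top_k: Vec<Hit>, key: &str, value: &str) -> Vec<Hit> {
    top_k
        .into_iter()
        .filter(|h| h.meta.get(key).map(String::as_str) == Some(value))
        .collect()
}
```

If only one of the k retrieved hits matches, the caller gets one result even when more matching chunks exist further down the ranking. A query-time filter would avoid that by restricting the candidate set before ranking.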

## Related work

- **MSA paper** ([arXiv:2603.23516](https://arxiv.org/abs/2603.23516)) — the
  architectural inspiration (neural, intrinsic); this repo is an extrinsic,
  pure-Rust approximation of the macro pattern.
- **Vivling** (in `codex-vl`) — the first downstream consumer: this server is
  its long-term memory.
- **[mcp-memory-rs](https://github.com/DioNanos/mcp-memory-rs)** — the companion
  server for *curated* agent state (named JSON categories, per-device ACL,
  fleet sync). This server does corpus recall; together they cover both halves
  of agent memory: the curated notebook and the queryable library.

## License

Apache-2.0. See [LICENSE](./LICENSE).

