mcp-vl-msa-rs

Name: DioNanos/mcp-vl-msa-rs
Author: DioNanos

Local-first MCP memory for AI agents — the searchable corpus (library). Pairs with mcp-memory-rs for curated state.


Install in Claude Code / Claude Desktop

Method: Manual · mcp-vl-msa-rs

Claude Code CLI

git clone https://github.com/DioNanos/mcp-vl-msa-rs

claude_desktop_config.json (Claude Desktop)

{
  "mcpServers": {
    "mcp-vl-msa-rs": {
      "command": "mcp-vl-msa-rs"
    }
  }
}

1. Run the command above in your terminal (Claude Code), or paste the JSON config into claude_desktop_config.json (Claude Desktop).

2. Replace any <placeholder> values with your API keys or paths.

3. Restart Claude. The MCP server and its tools appear automatically.

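💡 Install the binary first: download a prebuilt release, or build from source with cargo install --git https://github.com/DioNanos/mcp-vl-msa-rs --locked --features source-fs mcp-msa-server (the package is named mcp-msa-server; the binary it installs is mcp-vl-msa-rs).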


# mcp-vl-msa-rs

A searchable long-term memory for AI agents, exposed as an MCP stdio server.
Index documents, notes and past conversations into collections; retrieve the
top-k relevant chunks for a query and inject the original text back to the
model; add or drop agent memories with `msa_remember` / `msa_forget`. Pure
Rust, BM25 over [tantivy](https://github.com/quickwit-oss/tantivy), zero ML
deps in the default build; optional in-process dense rerank.

Any MCP client (Claude Code, Codex, or anything speaking MCP stdio) gets the
same memory: a queryable corpus that survives across sessions and model swaps,
with no cloud account and no embedding service required. Use it to give an
agent durable recall over a knowledge base, a docs tree, or its own chat
history — retrieval that returns the original text, not just embeddings.

It is one half of a two-part memory: this server is the **library** (corpus
recall), its companion [mcp-memory-rs](https://github.com/DioNanos/mcp-memory-rs)
is the **notebook** (curated state). An agent that swaps models loses neither.

```mermaid
flowchart LR
    A["AI agent<br/>(any MCP client)"]
    A -->|"curated state<br/>read / write / sync"| M["mcp-memory-rs<br/><i>the notebook</i>"]
    A -->|"corpus recall<br/>index / search / fetch"| V["mcp-vl-msa-rs<br/><i>the library</i>"]
    M --- D1[("JSON categories<br/>SQLite FTS5")]
    V --- D2[("tantivy BM25<br/>collections")]
```

**The name**: `msa` is the retrieval pattern it borrows from the Memory Sparse
Attention paper (arXiv:2603.23516) — an *extrinsic* approximation, not the
neural model; distinct from MiniMax's MSA-architecture LLMs, which are
*intrinsic* (in-model) generators. `vl` is for Vivling (`codex-vl`), its first
adopter — but the server is fully AI-agnostic and depends on nothing from it.

**Status**: v0.4, with optional hybrid sparse + dense retrieval.

## Why

The original [Memory Sparse Attention](https://arxiv.org/abs/2603.23516) paper (EverMind-AI) describes an end-to-end trainable sparse attention layer over chunk-pooled KV caches. That is a neural artifact and is not portable to a pure-Rust MCP server. What *is* portable, and what this repo aims to deliver, is the MSA macro pattern:

1. **Chunked storage** of long-form text with a small fixed pool size (`P=64` words by default, mirroring the paper).
2. **Top-k sparse routing** over chunks (BM25 surrogate; learned routing is out of scope).
3. **Original text injection** (paper §4.3; the ablation without it reports -37.1%): `msa_search` returns chunks, `msa_fetch_doc` returns the full document.
4. **Memory Interleave** as a *protocol* (shipped in v0.3 as `msa_search_iterative`): the AI client orchestrates multi-hop retrieval through repeated tool calls with a server-side cursor.
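
As a rough illustration of step 1, the sketch below splits text into fixed-size word chunks. It assumes chunks are counted in whitespace-separated words and that `overlap` words are shared between neighbouring chunks (matching the `chunk_size` / `overlap` config keys); `chunk_words` is a hypothetical helper, not the server's actual chunker.

```rust
/// Split `text` into chunks of at most `chunk_size` whitespace-separated words.
/// Consecutive chunks share `overlap` words.
/// Hypothetical helper for illustration only.
fn chunk_words(text: &str, chunk_size: usize, overlap: usize) -> Vec<String> {
    assert!(chunk_size > 0, "chunk_size must be positive");
    assert!(overlap < chunk_size, "overlap must be smaller than chunk_size");

    let words: Vec<&str> = text.split_whitespace().collect();
    let step = chunk_size - overlap; // how far the window advances each time
    let mut chunks = Vec::new();
    let mut start = 0;

    while start < words.len() {
        let end = (start + chunk_size).min(words.len());
        chunks.push(words[start..end].join(" "));
        if end == words.len() {
            break; // the last window reached the end of the text
        }
        start += step;
    }
    chunks
}
```

With `chunk_size = 64` and `overlap = 0`, a 130-word document yields three chunks of 64, 64 and 2 words.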

Design and rationale are documented in the project notes (negative results, gate methodology); see [`docs/NEGATIVE_RESULTS.md`](docs/NEGATIVE_RESULTS.md).

## Tool surface

| Tool | Since | Description |
|---|---|---|
| `msa_index` | v0.1 | Index a document; existing chunks for `doc_id` are replaced. |
| `msa_search` | v0.1 | Top-k chunks, score normalized 0.0–1.0. |
| `msa_fetch_doc` | v0.1 | Full original text of a document. |
| `msa_delete` | v0.1 | Remove a document and all its chunks. |
| `msa_list_collections` | v0.1 | Collections open in the registry. |
| `msa_stats` | v0.1 | Per-collection statistics (exact `num_documents` / `total_tokens`). |
| `SearchFilter` | v0.2 | Metadata filter (`where_eq`/`where_in`/`created_*`), post-retrieval. |
| `msa_search_iterative` | v0.3 | Memory Interleave with server-side cursor; dedups across rounds. |
| `msa_drop_session` | v0.3 | Force-evict a Memory Interleave session before TTL. |
| `dense_alpha` on `msa_search` | v0.4 | Hybrid BM25 + cosine rerank. Requires `--features embeddings` + `[embeddings]` config. |
| `msa_remember` / `msa_forget` | v0.4 | Agent-memory surface: enrich + low-signal gate + content-hash dedup; standard metadata (`kind` / `source_id` / `created_at`). |
| `msa_sync_path` | v0.4 | Mirror a directory into a collection (filesystem source; blake3 delta sync). |
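
The cross-round dedup that `msa_search_iterative` performs can be pictured with a small session object. This is a sketch under assumptions: `InterleaveSession`, `next_round`, and the use of string chunk ids are all hypothetical, and the real server-side cursor and TTL handling are not modelled here.

```rust
use std::collections::HashSet;

/// Hypothetical per-session state: chunk ids already returned to the client.
struct InterleaveSession {
    seen: HashSet<String>,
}

impl InterleaveSession {
    fn new() -> Self {
        Self { seen: HashSet::new() }
    }

    /// Given hits already ranked best-first, return up to `k` hits that were
    /// not returned in an earlier round, and remember them as seen.
    fn next_round(&mut self, ranked_hits: &[(&str, f32)], k: usize) -> Vec<(String, f32)> {
        let mut fresh = Vec::new();
        for (id, score) in ranked_hits {
            if fresh.len() == k {
                break; // stop before marking further hits as seen
            }
            // `insert` returns true only when the id was not present yet.
            if self.seen.insert((*id).to_string()) {
                fresh.push(((*id).to_string(), *score));
            }
        }
        fresh
    }
}
```

A second round that ranks `c2` first still skips it if `c2` was already returned in the first round, so each round contributes only new text to the model's context.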

## Install

**Prebuilt binary** (recommended) — download the archive for your platform from
the [latest release](https://github.com/DioNanos/mcp-vl-msa-rs/releases/latest),
extract, and point your MCP client at the binary:

```bash
tar xzf mcp-vl-msa-rs-x86_64-unknown-linux-gnu.tar.gz
install -m755 mcp-vl-msa-rs-*/mcp-vl-msa-rs ~/.local/bin/
```

Prebuilt targets (Linux + Android): `x86_64-unknown-linux-gnu`,
`x86_64-unknown-linux-musl`, `aarch64-unknown-linux-gnu`,
`aarch64-unknown-linux-musl` (edge / ARM / Termux), `aarch64-linux-android`.

**macOS**: no prebuilt binary is shipped (it would need Apple code-signing).
Install from source instead — `cargo install` below compiles it on your Mac in
one command, no signing needed.

**From source** (Rust toolchain) — `--locked` is required (the workspace
`Cargo.lock` pins a working `time` / `tantivy-common` resolution; a fresh
resolve breaks the build), and `mcp-msa-server` is the package name (the
binary it installs is `mcp-vl-msa-rs`):

```bash
cargo install --git https://github.com/DioNanos/mcp-vl-msa-rs \
  --locked --features source-fs mcp-msa-server
```

## Build & test

```bash
cd mcp-vl-msa-rs

# Default: pure BM25, zero network deps
cargo build --release
cargo test

# Hybrid sparse + dense (in-process Candle rerank, no external service)
cargo build --release --features embeddings
cargo test  --features embeddings
```

### Hybrid mode config

Add `[embeddings]` to `MCP_MSA_CONFIG` to activate dense rerank. Without
this section the server stays in BM25-only mode even when the binary was
built with `--features embeddings`.

The production backend is `candle-modernbert`: the encoder runs **in-process**
(Candle), offline-deterministic, from a local model bundle — no daemon, no
network at runtime, no automatic downloads. Prepare the bundle once with
`scripts/prepare-granite-r2-97m.sh`.

```toml
[storage]
storage_dir = "~/.local/state/mcp-vl-msa-rs"

[chunking]
chunk_size = 64
overlap = 0

[embeddings]
backend   = "candle-modernbert"
model_dir = "~/.local/share/mcp-vl-msa-rs/models/granite-r2-97m"
dim       = 768
model_id  = "granite-r2-97m"
```

A transitional `backend = "ollama"` (HTTP to an Ollama-compatible service)
still exists but is **deprecated and scheduled for removal in v0.6** — do not
build new setups on it.

The AI client opts into hybrid scoring per-call by passing `dense_alpha`
to `msa_search` (or any future tool that supports it). `dense_alpha = 1.0`
(default) is BM25-only; `0.0` is dense-only; intermediate values are a
linear blend `α·bm25 + (1-α)·((cos+1)/2)`. Cosine is shifted to `[0,1]`
so it composes linearly with the already max-normalized BM25 score.
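
The blend above can be written out as a minimal sketch. The function names are hypothetical, and clamping `alpha` to `[0, 1]` is an assumption of this example rather than documented server behaviour.

```rust
/// Max-normalize raw BM25 scores into [0, 1].
/// If no score is positive, every normalized score is 0.
fn max_normalize(raw: &[f64]) -> Vec<f64> {
    let max = raw.iter().cloned().fold(0.0_f64, f64::max);
    if max <= 0.0 {
        return vec![0.0; raw.len()];
    }
    raw.iter().map(|s| s / max).collect()
}

/// Linear blend from the text: alpha * bm25 + (1 - alpha) * ((cos + 1) / 2).
/// `bm25_norm` is a max-normalized BM25 score, `cos` a cosine in [-1, 1].
fn hybrid_score(alpha: f64, bm25_norm: f64, cos: f64) -> f64 {
    let alpha = alpha.clamp(0.0, 1.0); // assumption: out-of-range values are clamped
    let dense = (cos + 1.0) / 2.0; // shift cosine from [-1, 1] to [0, 1]
    alpha * bm25_norm + (1.0 - alpha) * dense
}
```

For example, with `dense_alpha = 0.5`, a chunk with normalized BM25 score `1.0` and cosine `0.0` scores `0.5 * 1.0 + 0.5 * 0.5 = 0.75`.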

## Run as MCP stdio

```bash
# Default storage: ~/.local/state/mcp-vl-msa-rs/
./target/release/mcp-vl-msa-rs

# With explicit config
MCP_VL_MSA_CONFIG=~/.config/mcp-vl-msa-rs/config.toml \
MCP_DEVICE=my-node \
./target/release/mcp-vl-msa-rs
```

Example `~/.codex/config.toml` entry:

```toml
[mcp_servers.vl_msa]
command = "/path/to/mcp-vl-msa-rs/target/release/mcp-vl-msa-rs"
env = { MCP_DEVICE = "my-node" }
# let the model call tools without a per-call approval prompt
default_tools_approval_mode = "approve"
```

Equivalent `~/.claude.json` entry for Claude Code:

```json
{
  "mcpServers": {
    "vl_msa": {
      "command": "/path/to/mcp-vl-msa-rs/target/release/mcp-vl-msa-rs",
      "env": { "MCP_DEVICE": "my-node" }
    }
  }
}
```

### AI client compatibility

- Clients with partial MCP support may not surface the server's `instructions`
  text. The tool descriptions and request-field descriptions are self-contained,
  so a model can work from those alone.
- Read-only tools (`msa_search`, `msa_fetch_doc`, `msa_stats`,
  `msa_list_collections`, `msa_manifest`, `msa_search_iterative`,
  `msa_interleave_round`) carry the `readOnlyHint` annotation, which lets a
  gating client auto-approve them.
- If a model reports an "unsupported call" or "user cancelled" on Codex, that is
  the approval gate, not a server fault — set `default_tools_approval_mode`
  (above) so tool calls are not blocked on a prompt.

## Storage layout

```
~/.local/state/mcp-vl-msa-rs/
├── <collection_a>/        ← tantivy index directory
├── <collection_b>/
└── ...
```

Each collection is an independent tantivy index. Collection names are validated
(rejected if they contain path separators, `..`, etc.) so a collection cannot
escape the root.
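
One way to enforce that kind of rule is an allow-list check before joining the name onto the storage root. The sketch below is an assumption about how such validation could look; the exact rules the server applies (allowed characters, length limit) are not stated here, and both function names are hypothetical.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical validator: accept only names that cannot change directory.
/// The allowed character set and the 64-byte limit are assumptions.
fn is_safe_collection_name(name: &str) -> bool {
    !name.is_empty()
        && name.len() <= 64
        && name != "."
        && !name.contains("..")
        && name
            .chars()
            .all(|c| c.is_ascii_alphanumeric() || c == '_' || c == '-' || c == '.')
}

/// Resolve a collection directory under `root`, or `None` for a rejected name.
fn collection_dir(root: &Path, name: &str) -> Option<PathBuf> {
    is_safe_collection_name(name).then(|| root.join(name))
}
```

Because path separators are not in the allow-list and `..` is rejected outright, a name such as `../etc` or `a/b` never reaches `root.join`.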

## Roadmap

Shipped:

- **v0.2** — `SearchFilter` (where_eq / where_in / created range), post-retrieval.
- **v0.3** — `msa_search_iterative` Memory Interleave with server-side cursor + TTL'd `MsaSession` registry.
- **v0.4** — hybrid BM25 + dense rerank behind feature flag `embeddings`, Ollama backend, per-call `dense_alpha`; agent-memory surface (`msa_remember` / `msa_forget`); filesystem source metadata (`created_at` / `source` / `ext` / `dir`) at index time; exact `num_documents` / `total_tokens` in `msa_stats`; `msa-bench` reproducible benchmark crate; prebuilt-binary packaging.

Next (not yet built):

- Query-time tantivy filter (today `SearchFilter` runs post-retrieval; fine for
  normal corpora, but a pre-filter would help when selectivity is high on a very
  large index).
- ACL for multi-tenant collections.
- Tool-description tuning.
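
To see why a query-time filter matters when selectivity is high, compare the two orderings in a simplified model. The sketch assumes post-retrieval filtering means "take the top k, then drop non-matching hits"; the `Hit` type and both functions are hypothetical, and the server may over-fetch or behave differently.

```rust
use std::collections::HashMap;

/// Hypothetical search hit with string metadata.
struct Hit {
    chunk_id: String,
    score: f32,
    metadata: HashMap<String, String>,
}

/// Convenience constructor that sets a single `kind` metadata field.
fn hit(chunk_id: &str, score: f32, kind: &str) -> Hit {
    let mut metadata = HashMap::new();
    metadata.insert("kind".to_string(), kind.to_string());
    Hit { chunk_id: chunk_id.to_string(), score, metadata }
}

/// Post-retrieval: take the top `k` ranked hits, then keep the matching ones.
fn post_filter<'a>(ranked: &'a [Hit], k: usize, key: &str, value: &str) -> Vec<&'a Hit> {
    ranked
        .iter()
        .take(k)
        .filter(|h| h.metadata.get(key).map(String::as_str) == Some(value))
        .collect()
}

/// Pre-filter: keep the matching hits first, then take the top `k` of those.
fn pre_filter<'a>(ranked: &'a [Hit], k: usize, key: &str, value: &str) -> Vec<&'a Hit> {
    ranked
        .iter()
        .filter(|h| h.metadata.get(key).map(String::as_str) == Some(value))
        .take(k)
        .collect()
}
```

If the two best-ranked hits are `kind = "note"` and the caller asks for `k = 2` with `kind = "chat"`, the post-retrieval order returns nothing while the pre-filter order returns the two best `chat` hits.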

## Related work

- **MSA paper** ([arXiv:2603.23516](https://arxiv.org/abs/2603.23516)) — the
  architectural inspiration (neural, intrinsic); this repo is an extrinsic,
  pure-Rust approximation of the macro pattern.
- **Vivling** (in `codex-vl`) — the first downstream consumer: this server is
  its long-term memory.
- **[mcp-memory-rs](https://github.com/DioNanos/mcp-memory-rs)** — the companion
  server for *curated* agent state (named JSON categories, per-device ACL,
  fleet sync). This server does corpus recall; together they cover both halves
  of agent memory: the curated notebook and the queryable library.

## License

Apache-2.0. See [LICENSE](./LICENSE).
