Unified observability gateway for AI agents — one MCP server for Prometheus, Loki, and any backend, with cross-signal anomaly detection and a built-in Web UI.
- ✓Open-source license (Apache-2.0)
```bash
claude mcp add observability-mcp -- npx -y @thotischner/observability-mcp
```
```json
{
  "mcpServers": {
    "observability-mcp": {
      "command": "npx",
      "args": ["-y", "@thotischner/observability-mcp"]
    }
  }
}
```
<div align="center">
# observability-mcp
**The unified observability gateway for AI agents.**
One MCP server that connects to any observability backend through pluggable connectors,
normalizes the data, adds robust anomaly analysis, and provides a web UI for configuration.
*One MCP endpoint, every backend — so an agent triaging an incident asks one normalized
question instead of juggling N vendor servers and their query languages.*
**0/10 → 10/10:** the same 8B local model goes from hallucinating blast-radius answers
to exactly correct ones once it gets this gateway's topology tools —
[measured, not asserted](docs/benchmark-astronomy-shop.md).
</div>
```bash
npx @thotischner/observability-mcp # start (UI on :3000)
claude mcp add observability --transport http http://localhost:3000/mcp # wire into Claude
```
Twelve read-only tools (`readOnlyHint: true` on every one) · server-side filter/aggregate
so agents get **numbers, not haystacks** · [For-Agents guide](https://thotischner.github.io/observability-mcp/for-agents/)
---
📖 **Full documentation site:** <https://thotischner.github.io/observability-mcp/>
🔌 **Open in MCP Inspector** — one-line interactive explorer:
```bash
npx --yes @modelcontextprotocol/inspector \
  --config <(npx --yes @thotischner/observability-mcp inspector-config)
```
## Why it matters — measured, not asserted
On a real Kubernetes-platform-team question ("which other pods share a node with
`payment-service` so we know what else falls over if that node goes down?"), the same
local model produces wildly different answers depending on the tools you hand it:
| Tools available to the agent (llama3.1:8b, n=10) | Cross-namespace blast-radius accuracy |
|---|:---:|
| Generic metric + log + service tools | **0 / 10** — hallucinates the wrong entity type (`prometheus`, `loki`, `kubernetes`) |
| Same model + `get_topology` + `get_blast_radius` | **10 / 10** — exact correct co-tenant list, every iteration |
Raw JSON for both arms, plus three more scenarios (single-service RCA, in-namespace
blast radius, scenarios where topology does *not* help), live in
[docs/benchmark-astronomy-shop.md](docs/benchmark-astronomy-shop.md). The harness is in
[`scripts/benchmark-rca.mjs`](scripts/benchmark-rca.mjs); re-run with `make benchmark-up && make benchmark-run`.
We don't claim universal speedup — the doc spells out exactly where the topology tools
help (graph-shaped questions) and where they don't (pure single-metric drill-downs).
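To make the benchmark question concrete, here is a minimal Python sketch of the graph-shaped lookup behind it: given pod placements, list every pod that shares a node with a target service, across namespaces. The placement data, pod names, and the `co_tenants` helper are hypothetical illustrations. This is not the gateway's `get_blast_radius` implementation, whose inputs and output shape are not described in this README.

```python
from collections import defaultdict

# Hypothetical placement data: (namespace, pod, service, node) tuples.
PLACEMENT = [
    ("shop", "payment-service-7d9f", "payment-service", "node-a"),
    ("shop", "cart-5c2b", "cart", "node-a"),
    ("monitoring", "loki-0", "loki", "node-a"),
    ("shop", "frontend-9a1e", "frontend", "node-b"),
]


def co_tenants(placement, service):
    """Return sorted (namespace, pod) pairs that share a node with `service`."""
    pods_by_node = defaultdict(list)
    target_nodes = set()
    for namespace, pod, svc, node in placement:
        pods_by_node[node].append((namespace, pod, svc))
        if svc == service:
            target_nodes.add(node)
    # Collect every pod on the target's nodes, excluding the target itself.
    shared = {
        (namespace, pod)
        for node in target_nodes
        for namespace, pod, svc in pods_by_node[node]
        if svc != service
    }
    return sorted(shared)


if __name__ == "__main__":
    for namespace, pod in co_tenants(PLACEMENT, "payment-service"):
        print(f"{namespace}/{pod}")
```

A model without topology data has no placement map to consult, which is consistent with the first benchmark arm guessing entity types rather than listing pods.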
---
## Try it in 10 seconds
```bash
npx @thotischner/observability-mcp
# then open http://localhost:3000
```
Wire it into Claude Code with one CLI call:
```bash
claude mcp add observability --transport http http://localhost:3000/mcp
```
…or commit it to your repo as `.mcp.json` (works the same in Claude Desktop / Cursor):
```json
{
  "mcpServers": {
    "observability": {
      "transport": { "type": "http", "url": "http://localhost:3000/mcp" }
    }
  }
}
```
The server starts with **zero sources**. Add Prometheus or Loki through the Web UI, or set the `PROMETHEUS_URL` / `LOKI_URL` environment variables.
> If you'd rather have the snippets above printed by a Make target — including
> custom-host / custom-port substitution — use `make connect-claude-code` or
> `make connect-cursor`. `make doctor` round-trips a real MCP handshake against
> a running server, reports the live governance posture (auth mode, redaction,
> audit-log persistence, per-identity rate cap), and tells you what to fix if
> it can't.
> **Multi-user / production?** See [docs/access-control.md](docs/access-control.md)
> for the opt-in basic-mode login + RBAC + audit log + per-identity rate limit
> setup. All off by default; the demo above is unchanged.
>
> **SSO via OIDC?** `make demo-oidc` boots a Keycloak + an OIDC-flavored
> mcp-server on port **3001** with three pre-provisioned users
> (`admin` / `operator` / `viewer`, password = username, DEMO ONLY).
> See [docs/auth-oidc.md](docs/auth-oidc.md) for production Keycloak /
> Authentik / Auth0 / Azure AD setups.
>
> **External RBAC via OPA?** `make demo-opa` boots an Open Policy Agent
> with an example Rego policy + an OPA-backed mcp-server on port **3002**.
> See [docs/policy-engines.md](docs/policy-engines.md) for the
> built-in / file / OPA backend trade-offs and migration paths.
>
> **Curated MCP Products?** Set `OMCP_PRODUCTS_FILE` to a YAML catalog
> ([`config/products.yaml.example`](mcp-server/config/products.yaml.example))
> and ship per-tenant/per-agent tool bundles instead of "everything,
> all the time". RBAC-gated, audited, hot-editable. Details in
> [docs/products.md](docs/products.md).
Want the full chaos-engineering demo (Prometheus + Loki + 3 example services + the autonomous agent)? Clone and run:
```bash
make demo # equivalent to: docker compose --profile demo up --build --wait
```
Or run the **sovereign quickstart** — one command, fully on-prem, zero
external calls: it starts the stack, injects a real incident, and shows
side by side what an agent gets *without* vs *with* the analysis layer (a
wall of raw numbers vs a scored verdict that pinpoints the culprit). The
optional agent reasons over it with a **local** model (Ollama):
```bash
make demo-sovereign
```
See `make help` for all canonical workflows.
## Why?
Every observability vendor ships its own MCP server — Prometheus, Grafana, Datadog, Elastic, each siloed. An AI agent triaging an incident across systems must juggle N separate servers and learn each query language (PromQL, LogQL, …). There is no unified abstraction layer.
**observability-mcp** is that layer: one MCP endpoint that normalizes every backend and answers in plain service/metric/log terms, plus an analysis engine that flags anomalies the agent would otherwise have to reconstruct from raw queries itself.
**Who it's for:** SRE / platform teams running Prometheus + Loki who use an AI agent (Claude, local LLMs, …) for incident triage. The gateway's leverage is largest when the agent is *not* a frontier model — a smaller or local model that can't reliably hand-write PromQL/LogQL benefits most from normalized tools and pre-computed analysis. A strong frontier model can query raw backends competently on its own; there the value is consistency and the analysis engine, not query convenience. We state this honestly rather than claiming a universal speedup.
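As an illustration of what "normalizing" backends can mean, the sketch below flattens a Prometheus instant-query response and a Loki streams response into one record shape. The response layouts follow the public Prometheus and Loki HTTP APIs. The `Record` type, the function names, and the choice of the `service` label are assumptions made for this example, not the gateway's actual connector interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Hypothetical normalized shape shared by every backend."""
    source: str       # backend the record came from
    service: str      # value of the `service` label, or "unknown"
    kind: str         # "metric" or "log"
    timestamp: float  # Unix seconds
    value: object     # float for metrics, str for log lines


def from_prometheus(response):
    """Flatten a Prometheus instant-query (vector) response."""
    return [
        Record("prometheus", series["metric"].get("service", "unknown"),
               "metric", float(series["value"][0]), float(series["value"][1]))
        for series in response["data"]["result"]
    ]


def from_loki(response):
    """Flatten a Loki streams response; timestamps are nanosecond strings."""
    return [
        Record("loki", stream["stream"].get("service", "unknown"),
               "log", int(ts_ns) / 1e9, line)
        for stream in response["data"]["result"]
        for ts_ns, line in stream["values"]
    ]


# Hand-written sample payloads, not captured from a real system.
PROM = {"status": "success", "data": {"resultType": "vector", "result": [
    {"metric": {"service": "payment-service"}, "value": [1700000000, "0.42"]}]}}
LOKI = {"status": "success", "data": {"resultType": "streams", "result": [
    {"stream": {"service": "payment-service"},
     "values": [["1700000001000000000", "timeout calling bank API"]]}]}}

if __name__ == "__main__":
    for record in from_prometheus(PROM) + from_loki(LOKI):
        print(record)
```

With both signals in one shape, a caller can filter by service and time window without knowing which query language produced each record.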
## Features
- **Unified gateway** — Single MCP endpoint for all your observability backends.
- **Cross-signal analysis** — Correlates metrics and logs automatically. Robust anomaly detection (median/MAD baseline, trend detection for slow ramps, warmup + dwell to suppress flapping) and weighted health scoring.
- **Web UI** — Sources, services, health monitoring, configuration. Real-time, dark theme.
- **prom-client defaults** — Works out of the box with the standard Node.js Prometheus instrumentation. Dynamic label resolution probes `job` / `service` / `app` / `service_name` so service filtering Just Works.
- **Loki label fallback** — Discovers services through `service_name` / `service` /
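The median/MAD baseline with warmup and dwell mentioned under cross-signal analysis can be sketched as follows. This is a generic robust z-score detector, not the project's implementation: the `warmup`, `threshold`, and `dwell` values and all function names are assumptions, and trend detection for slow ramps is not covered.

```python
from statistics import median

MAD_SCALE = 1.4826  # makes MAD comparable to a standard deviation for normal data


def robust_z(history, value):
    """Score `value` against the median/MAD baseline of `history`."""
    center = median(history)
    mad = median(abs(x - center) for x in history)
    if mad == 0:
        return 0.0 if value == center else float("inf")
    return (value - center) / (MAD_SCALE * mad)


def detect(samples, warmup=5, threshold=3.5, dwell=2):
    """Return indices where an anomaly is confirmed.

    warmup: samples collected before scoring starts (assumed value).
    dwell: consecutive outliers required before flagging (assumed value).
    """
    confirmed, baseline, streak = [], [], 0
    for i, value in enumerate(samples):
        if len(baseline) < warmup:
            baseline.append(value)
            continue
        if abs(robust_z(baseline, value)) > threshold:
            streak += 1
            if streak >= dwell:
                confirmed.append(i)
        else:
            streak = 0
            baseline.append(value)  # only normal points extend the baseline
    return confirmed


if __name__ == "__main__":
    # Hypothetical latency samples in milliseconds.
    latencies = [100, 102, 98, 101, 99, 100, 250, 101, 260, 270, 265, 100]
    print(detect(latencies))
```

On the sample series, the isolated spike at index 6 is ignored because it does not persist, while the sustained run is flagged once it has lasted `dwell` samples (indices 9 and 10). Using the median and MAD rather than mean and standard deviation keeps a single outlier from distorting the baseline it is scored against.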