mcp-security-reviewer
This Claude Code skill conducts a structured security assessment before integrating a new MCP server into an agent system. It examines source code integrity, catalogs exposed tools and resources with risk classifications, identifies network endpoints, validates permission scopes, ensures output sanitization, and generates approval requirements, producing a documented security review that gates high-risk capabilities behind human approval.
git clone --depth 1 https://github.com/oxbshw/LLM-Agents-Ecosystem-Handbook /tmp/mcp-security-reviewer && cp -r /tmp/mcp-security-reviewer/skills/examples/mcp-security-reviewer ~/.claude/skills/mcp-security-reviewer

SKILL.md
# MCP Security Reviewer

## When to use

- A new MCP server is being added to an agent
- An MCP server version is being bumped
- An incident triggered a re-review

## Inputs

| Name | Type | Required | Notes |
|---|---|---|---|
| `repo_url` | string | yes | the MCP server's source |
| `version` | string | yes | tag or commit SHA being adopted |
| `intended_use` | string | yes | one paragraph: what we'll let it do |

## Workflow

1. **Source review**: clone at the pinned version; check for unexpected files / scripts
2. **Capabilities**: list every tool and resource exposed; map to risk levels (`references/mcp-risk-matrix.md`)
3. **Network**: identify outbound endpoints; document and assess each
4. **Permissions**: minimum required scopes / tokens; document over-permissions
5. **Output handling**: confirm the agent treats tool output as untrusted (sanitization, no execution)
6. **Approvals**: define which tools require human approval
7. **Produce filled `MCP_SERVER.md`** in `mcp/<server>.md`

## References

- [`references/mcp-risk-matrix.md`](references/mcp-risk-matrix.md)

## Success criteria

- All tools labelled by risk
- High/Critical tools gated by approval
- Pinned version (no `latest` / floating refs)
- Documented network egress

## Failure modes

- Source unavailable / un-pinnable → reject
- Discovered hidden tool not in docs → reject and report upstream
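
The success criteria above could be checked mechanically along the following lines. This is a minimal sketch, not part of the skill itself: the `Tool` record, the `is_pinned` and `review_findings` helpers, the set of ref names treated as floating, and the semver-like tag pattern are all assumptions made for illustration. Only the High and Critical levels are named in the skill; the full scale is defined in `references/mcp-risk-matrix.md` and may differ.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Assumption: only "high" and "critical" are named in the skill; the full
# scale lives in references/mcp-risk-matrix.md and may differ.
GATED_LEVELS = {"high", "critical"}

# Assumption: ref names treated as floating. Extend for your own branch names.
FLOATING_REFS = {"latest", "head", "main", "master", "develop"}

SHA_RE = re.compile(r"^[0-9a-f]{7,40}$")                  # abbreviated or full commit SHA
TAG_RE = re.compile(r"^v?\d+\.\d+\.\d+[0-9A-Za-z.+-]*$")  # semver-like tag (assumed convention)


@dataclass
class Tool:
    """Hypothetical record for one tool exposed by the MCP server."""
    name: str
    risk: Optional[str] = None        # None means the reviewer has not labelled it yet
    requires_approval: bool = False


def is_pinned(version: str) -> bool:
    """Syntactic check: the ref looks like a commit SHA or a versioned tag."""
    v = version.strip()
    if not v or v.lower() in FLOATING_REFS:
        return False
    return bool(SHA_RE.match(v.lower()) or TAG_RE.match(v))


def review_findings(version: str, tools: List[Tool], egress_documented: bool) -> List[str]:
    """Return one message per unmet success criterion; an empty list means all pass."""
    findings = []
    if not is_pinned(version):
        findings.append(f"version {version!r} is not pinned")
    for tool in tools:
        if tool.risk is None:
            findings.append(f"tool {tool.name!r} has no risk label")
        elif tool.risk.lower() in GATED_LEVELS and not tool.requires_approval:
            findings.append(f"tool {tool.name!r} is {tool.risk} risk but not gated by approval")
    if not egress_documented:
        findings.append("network egress is not documented")
    return findings


if __name__ == "__main__":
    # Hypothetical tool names, used only to exercise the checks.
    tools = [
        Tool("read_file", risk="low"),
        Tool("run_shell", risk="critical", requires_approval=False),
        Tool("undocumented_tool"),
    ]
    for finding in review_findings("latest", tools, egress_documented=False):
        print("FAIL:", finding)
```

The pin check is purely syntactic: it does not confirm that the ref exists in the repository, and a tag can be moved upstream after review, so a full commit SHA is the stronger pin where the server's release process allows it.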
Use when capturing an architecture decision so it survives turnover — produces an ADR-NNNN.md from context, options considered, and the chosen path.
Use when reviewing a proposed REST or GraphQL API change before merge — checks contract clarity, backwards compatibility, errors, pagination, auth, and naming.
Use when first encountering a new dataset — produces a structured profile (schema, missingness, distributions, outliers, gotchas) before any analysis.
Use after an incident is resolved — drafts a blameless postmortem from timeline notes, alerts, and chat threads.
Use when opening a PR — produces a clean PR description (what / why / how to verify / risks) from a branch diff against base.
Use when planning the next sprint — turns ticket intake + team capacity into a planned sprint with explicit non-goals.
Use after a session to promote useful episodic notes from logs/episodic/ into distilled, dated entries in MEMORY.md and memory/semantic/.