ClaudeWave
Slash Command · 304 repo stars · updated 2d ago

mantis-cve-diff

The mantis-cve-diff Claude Code command automates discovery of canonical fix commits for CVEs by searching multiple sources including OSV, NVD, GitHub, and GitLab, then clones the repository and generates a detailed patch analysis report. Use this when you need to quickly locate and examine the exact changes that remediate a specific CVE, with optional root cause classification and configurable search budget for complex cases.

Install in Claude Code
mkdir -p ~/.claude/commands && curl -fsSL https://raw.githubusercontent.com/deonmenezes/mantishack/HEAD/.claude/commands/mantis-cve-diff.md -o ~/.claude/commands/mantis-cve-diff.md
Then open a new Claude Code session; the slash command loads automatically.

mantis-cve-diff.md

# /cve-diff — CVE Patch Discovery Pipeline

Discovers the canonical fix commit for a CVE, clones the repository, extracts the fix diff, and produces an analysis report. Fully agentic — an LLM tool-use loop searches OSV, NVD, GitHub, cgit, and GitLab to find the fix, then verifies it exists before submitting.

## Arguments

If the user provides a CVE ID (e.g. `/cve-diff CVE-2023-38545`), run immediately.
If the user types just `/cve-diff` with no argument, ask which CVE to analyze.
If the user types `/cve-diff health`, run the health check instead.

## Execution

Run the pipeline via the libexec wrapper (not the Typer CLI):

```bash
libexec/mantishack-cve-diff run <CVE-ID> [--output-dir DIR] [--budget-multiplier N] [--with-root-cause]
```

The script prints a JSON summary to stdout and errors to stderr.

**Options:**
- `--output-dir DIR` — explicit output directory (default: `out/cve-diff_<timestamp>/`)
- `--budget-multiplier N` — multiply the agent's token/cost/time budget (default: 1.0; use 2.0 to retry after a budget cap)
- `--with-root-cause` — run an additional LLM call to classify the vulnerability type and CWE
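
A minimal sketch of how the `run` invocation above could be assembled from Python. The wrapper path and flags come from this document; the helper name `build_command` and the CVE ID check are illustrative assumptions, not part of the tool.

```python
from __future__ import annotations

import re
from typing import Optional

# CVE IDs have the form CVE-<4-digit year>-<4 or more digits>.
CVE_RE = re.compile(r"^CVE-\d{4}-\d{4,}$")


def build_command(
    cve_id: str,
    output_dir: Optional[str] = None,
    budget_multiplier: float = 1.0,
    with_root_cause: bool = False,
) -> list[str]:
    """Build the argv list for `libexec/mantishack-cve-diff run` (hypothetical helper)."""
    if not CVE_RE.match(cve_id):
        raise ValueError(f"not a CVE ID: {cve_id!r}")
    argv = ["libexec/mantishack-cve-diff", "run", cve_id]
    if output_dir is not None:
        argv += ["--output-dir", output_dir]
    # Only pass the flag when it differs from the documented default of 1.0.
    if budget_multiplier != 1.0:
        argv += ["--budget-multiplier", str(budget_multiplier)]
    if with_root_cause:
        argv.append("--with-root-cause")
    return argv
```

The resulting list can be handed to `subprocess.run(argv, capture_output=True, text=True)`, which keeps the JSON summary on stdout separate from errors on stderr.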

**Health check** (pre-flight):

```bash
libexec/mantishack-cve-diff health
```

Probes reachability of the Anthropic API, NVD, GitHub, and GitLab. Run it before a batch of CVEs to catch outages early.

## Output

On success, the JSON summary on stdout looks like:

```json
{
  "ok": true,
  "cve_id": "CVE-2023-38545",
  "output_dir": "out/cve-diff_20260504_143012/",
  "bundle": {
    "repository_url": "https://github.com/curl/curl",
    "fix_commit": "fb4415d8aee6c10a4ce3328c42b9c2e4eb5bbafb",
    "shape": "source",
    "files_changed": 3,
    "diff_bytes": 1842
  },
  "telemetry": { "cost_usd": 0.12, "tokens": 18432, "elapsed_s": 34.2 },
  "artifacts": {
    "osv_json": "out/.../CVE-2023-38545.osv.json",
    "markdown": "out/.../CVE-2023-38545.md",
    "flow_md": "out/.../CVE-2023-38545.flow.md"
  }
}
```

**Artifacts written to the output directory:**

| File | Contents |
|------|----------|
| `<CVE>.md` | Human-readable analysis: diff summary, file changes, consensus |
| `<CVE>.osv.json` | Structured OSV-schema output with the fix pointer |
| `<CVE>.flow.md` | Pipeline trace showing each stage (discover → acquire → resolve → diff → render) |
| `<CVE>.flow.jsonl` | Machine-readable event log of the pipeline |
| `<CVE>.clone.patch` | Raw unified diff from the clone-based extractor |
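
As a rough illustration of the table above, the artifact paths can be derived from the output directory and the CVE ID. The keys `markdown`, `osv_json`, and `flow_md` mirror the JSON summary; `flow_jsonl` and `clone_patch` are hypothetical key names chosen for this sketch. The event reader assumes only that `.flow.jsonl` holds one JSON object per line; the event schema itself is not documented here.

```python
from __future__ import annotations

import json
from pathlib import Path
from typing import Iterable

# File suffixes taken from the artifact table; the last two keys are illustrative names.
ARTIFACT_SUFFIXES = {
    "markdown": ".md",
    "osv_json": ".osv.json",
    "flow_md": ".flow.md",
    "flow_jsonl": ".flow.jsonl",
    "clone_patch": ".clone.patch",
}


def artifact_paths(output_dir: str, cve_id: str) -> dict[str, Path]:
    """Map each artifact key to its expected path inside output_dir."""
    base = Path(output_dir)
    return {key: base / f"{cve_id}{suffix}" for key, suffix in ARTIFACT_SUFFIXES.items()}


def parse_flow_events(lines: Iterable[str]) -> list[dict]:
    """Parse JSONL text: one JSON object per non-empty line."""
    return [json.loads(line) for line in lines if line.strip()]
```

For example, `parse_flow_events(paths["flow_jsonl"].read_text().splitlines())` would load the event log once the pipeline has written it.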

## After the pipeline completes

1. Parse the JSON summary from stdout (it's the last thing printed).
2. Read `<CVE>.md` from the output directory — this is the main report.
3. Present key findings to the user:
   - Repository and fix commit
   - Diff shape and size (files changed, bytes)
   - Consensus status (whether both OSV and NVD agree on the fix pointer)
   - Cost and elapsed time
4. If the user wants details, read `<CVE>.flow.md` for the full pipeline trace.
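
The first and third steps above can be sketched as follows, assuming stdout may contain other text before the summary and that the summary is a single JSON object at the very end. The function names are hypothetical. Consensus status is left out because it is reported in `<CVE>.md`, not in the JSON summary shown earlier.

```python
from __future__ import annotations

import json


def parse_summary(stdout: str) -> dict:
    """Return the JSON object that ends at the end of stdout."""
    text = stdout.rstrip()
    decoder = json.JSONDecoder()
    for start, ch in enumerate(text):
        if ch != "{":
            continue
        try:
            obj, end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            continue  # this brace did not start a valid JSON document
        # Accept only an object that runs to the end of the output.
        if end == len(text) and isinstance(obj, dict):
            return obj
    raise ValueError("no trailing JSON object found on stdout")


def key_findings(summary: dict) -> list[str]:
    """Format the fields worth presenting to the user."""
    bundle = summary.get("bundle", {})
    telemetry = summary.get("telemetry", {})
    return [
        f"Repository: {bundle.get('repository_url', 'unknown')}",
        f"Fix commit: {bundle.get('fix_commit', 'unknown')}",
        f"Diff: shape={bundle.get('shape', 'unknown')}, "
        f"files={bundle.get('files_changed', '?')}, bytes={bundle.get('diff_bytes', '?')}",
        f"Cost: ${telemetry.get('cost_usd', '?')}, elapsed: {telemetry.get('elapsed_s', '?')}s",
    ]
```

Scanning from the first `{` forward means the outermost object is tried before any nested one, so nested objects such as `bundle` are never mistaken for the summary.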

## Error handling

| Exit code | Error class | Meaning |
|-----------|-------------|---------|
| 0 | — | Success |
| 4 | UnsupportedSource | CVE points at a closed-source vendor |
| 5 | DiscoveryError | Agent couldn't find a canonical fix commit |
| 6 | AcquisitionError | Clone/fetch cascade failed for the discovered repo |
| 7 | IdenticalCommitsError | Fix and parent resolved to the same commit |
| 9 | AnalysisError | Diff shape rejected (notes_only, packaging_only) |

On failure, the JSON summary still prints with `"ok": false` and includes `error_class` + `error_detail`. Telemetry (cost, tokens, tool calls) is included when available so the user can see what the agent tried.

**Budget exhaustion (exit 5, reason `budget_*`):** The agent hit its token, cost, or time cap. Offer to re-run with `--budget-multiplier 2` (doubles all budgets). Don't auto-extend — let the user decide.
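
A small sketch of how the exit-code table and the budget rule might be encoded. The codes and class names are taken from the table above. Where the `budget_*` reason appears in the failure JSON is not specified here, so this sketch assumes it shows up inside `error_detail`; adjust the check if the real summary uses a different field.

```python
from __future__ import annotations

# Exit code -> (error class, meaning), copied from the error-handling table.
EXIT_CODES = {
    0: (None, "Success"),
    4: ("UnsupportedSource", "CVE points at a closed-source vendor"),
    5: ("DiscoveryError", "Agent couldn't find a canonical fix commit"),
    6: ("AcquisitionError", "Clone/fetch cascade failed for the discovered repo"),
    7: ("IdenticalCommitsError", "Fix and parent resolved to the same commit"),
    9: ("AnalysisError", "Diff shape rejected (notes_only, packaging_only)"),
}


def describe_exit(exit_code: int) -> str:
    """Turn an exit code into a one-line explanation."""
    error_class, meaning = EXIT_CODES.get(
        exit_code, ("Unknown", f"unrecognized exit code {exit_code}")
    )
    return meaning if error_class is None else f"{error_class}: {meaning}"


def is_budget_exhaustion(exit_code: int, summary: dict) -> bool:
    """True when the run failed on a budget cap (assumes the reason is in error_detail)."""
    if exit_code != 5:
        return False
    return "budget_" in str(summary.get("error_detail", ""))
```

When `is_budget_exhaustion` returns true, the caller should ask the user before re-running with `--budget-multiplier 2` rather than retrying automatically.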

## Requirements

- `ANTHROPIC_API_KEY` — required for the agentic discovery loop
- `GITHUB_TOKEN` — recommended (avoids rate limits on GitHub API)
- `NVD_API_KEY` — optional (avoids NVD rate limits)
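
Before a run, these variables can be checked with a few lines of Python. This is an illustrative pre-flight helper, not something the tool ships; the name `missing_env` is an assumption.

```python
from __future__ import annotations

import os
from typing import Mapping, Optional

# Variable name -> how strongly it is needed, per the requirements list.
ENV_VARS = {
    "ANTHROPIC_API_KEY": "required",
    "GITHUB_TOKEN": "recommended",
    "NVD_API_KEY": "optional",
}


def missing_env(env: Optional[Mapping[str, str]] = None) -> dict[str, list[str]]:
    """Group unset (or empty) variables by requirement level."""
    env = os.environ if env is None else env
    missing: dict[str, list[str]] = {"required": [], "recommended": [], "optional": []}
    for name, level in ENV_VARS.items():
        if not env.get(name):
            missing[level].append(name)
    return missing
```

A caller would typically abort when `missing_env()["required"]` is non-empty and only warn about the other two levels.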

## Pipeline stages

```
discover → acquire → resolve → diff → render
    │          │         │        │       │
    ▼          ▼         ▼        ▼       ▼
  Agent     Clone/    Parent   Extract  Markdown
  loop      fetch     commit   fix^..   + OSV
  (LLM)     cascade   lookup   fix      JSON
```

1. **Discover** — agentic loop searches OSV, NVD, distro trackers, GitHub, cgit, GitLab for the fix commit. Verifies SHA existence before submitting.
2. **Acquire** — cascading clone: shallow clone → full clone → API fallback.
3. **Resolve** — expand short SHA, find parent commit.
4. **Diff** — `git diff fix^..fix` with per-file source extraction.
5. **Render** — markdown report, OSV JSON, 2-method consensus check.
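
The acquire and diff stages can be pictured with the sketch below. It shows the general cascading-fallback pattern (try each strategy in order, keep the first success) and the `fix^..fix` range; it is not the pipeline's actual code. `acquire`, `diff_argv`, and the strategy callables are hypothetical names, and this `AcquisitionError` only mirrors the error class from the exit-code table.

```python
from __future__ import annotations

from typing import Callable, Sequence, Tuple


class AcquisitionError(Exception):
    """Raised when every acquisition strategy fails (illustrative stand-in)."""


def acquire(
    repo_url: str,
    strategies: Sequence[Tuple[str, Callable[[str], str]]],
) -> Tuple[str, str]:
    """Try each (name, strategy) in order; return the first name and result that works."""
    failures = []
    for name, strategy in strategies:
        try:
            return name, strategy(repo_url)
        except Exception as exc:  # any failure moves on to the next strategy
            failures.append(f"{name}: {exc}")
    raise AcquisitionError("; ".join(failures))


def diff_argv(fix_sha: str) -> list[str]:
    """argv for diffing a fix commit against its first parent."""
    return ["git", "diff", f"{fix_sha}^..{fix_sha}"]
```

In a real run the strategies would be ordered shallow clone, full clone, then API fallback, each returning something like a local checkout path.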

api-abuse-fuzzer · Subagent

Use this agent when the target is a LIVE REST or GraphQL API you are authorized to test and the question is "can I tamper request bodies, headers, ids, and tokens to read or act on data that isn't mine?" This means active, request-driven abuse of the API contract, not static code review. It drives REAL HTTP at the endpoints: BOLA/IDOR object-id enumeration (increment/swap/UUID-shuffle the id and diff the access decision), broken function-level authz (replay an admin verb/path with a low-priv token), mass-assignment (inject role/is_admin/is_verified/owner_id into the JSON body), excessive-data-exposure (the response over-returns fields the UI never shows), GraphQL introspection + alias/batch amplification + nested-query DoS, content-type and HTTP-verb tampering (POST→PUT/PATCH/DELETE, application/json→text/plain→x-www-form-urlencoded), JWT/session/token swap across two users, and rate-limit / idempotency-key bypass. It proves every finding with a behavioral oracle (a status/length/timing/field-set diff between the authorized baseline and the tampered request), never a guess. Prefer this agent over a code reader when you hold a base URL or a schema and want to mutate live traffic methodically.

<example>
Context: The user has a running API with numeric resource ids and two test accounts.
user: "Here's our staging API at https://api.staging.acme.test and tokens for user A and user B — can user A read user B's orders?"
assistant: "That's textbook BOLA: same endpoint, swap the object id (or the bearer token) and diff the access decision. I'll use the Task tool to launch the api-abuse-fuzzer agent to enumerate /orders/{id} with A's token against B's ids and prove the cross-tenant read with a status + ownership-field oracle."
<agent_launch>
Delegating to api-abuse-fuzzer: a live authorized API + two tokens + object-id enumeration is its core BOLA/IDOR mission.
</agent_launch>
</example>

<example>
Context: The user exposes a GraphQL endpoint and isn't sure introspection or query batching is locked down.
user: "Our /graphql is behind auth but I want to know if a low-priv user can pull admin fields, brute force via aliases, or knock it over with a deep nested query."
assistant: "GraphQL abuse surface: introspect the schema, alias-batch a login/lookup to bypass per-request rate limits, and send a bounded cyclic nested query as a timing oracle. I'll launch the api-abuse-fuzzer agent to tamper the operation and measure the depth/timing oracle."
<agent_launch>
Delegating to api-abuse-fuzzer for GraphQL introspection, alias/batch amplification, and nested-query DoS against the live endpoint.
</agent_launch>
</example>

Proactively suggest using this agent when:
- A live base URL + an OpenAPI/Swagger/GraphQL schema (or a captured request) is in hand and the target is authorized in-scope
- Endpoints take a resource identifier in the path/query/body (/users/{id}, ?account=, {"order_id": ...}), which is BOLA/IDOR territory
- The user holds 2+ accounts or tokens (low-priv + high-priv, tenant A + tenant B) to run an authorization differential
- There are admin/privileged verbs (DELETE, PUT /admin/*, role-changing mutations) and you want to hit them as a non-admin
- A write endpoint accepts a JSON object (test mass-assignment of role/is_admin/verified/balance/owner_id)
- A /graphql endpoint exists (introspection, alias/batch abuse, nested-query DoS, field-level authz)
- The user mentions rate limiting, coupon/OTP brute force, idempotency keys, BOLA, BFLA, mass assignment, or "excessive data exposure"

assumption-pressure-test · Subagent

Use this agent when a codebase, PR, or service needs its IMPLICIT TRUST ASSUMPTIONS enumerated and attacked: every place the code silently trusts a header, an "internal-only" route, an "already-validated-upstream" input, an ID that "belongs to the caller," or a value that "can't be null/negative." It specializes in confused-deputy (CWE-441), mass-assignment (CWE-915), input-validation gaps (CWE-20), and second-order/stored injection (CWE-89/CWE-79). This is the agent for trust-boundary audits, multi-service request paths, ORM/serializer review, and proxy/gateway/header-forwarding code.

<example>
Context: User has a multi-service backend where an edge proxy forwards auth context downstream.
user: "Our gateway sets X-User-Id from the JWT and the internal services read it to authorize. Can you check this?"
assistant: "This is a classic confused-deputy / trust-boundary surface: the downstream services trust a header they can't verify. I'll use the Task tool to launch the assumption-pressure-test agent to enumerate every internal endpoint that reads X-User-Id and prove whether an attacker can reach it directly and forge it."
<agent_launch>
Delegating to assumption-pressure-test: the request hinges on an implicit 'this header is trusted because the gateway set it' assumption that must be attacked.
</agent_launch>
</example>

<example>
Context: User just merged an ORM update endpoint.
user: "Here's the new profile update handler: User.objects.filter(id=request.user.id).update(**request.data)"
assistant: "That `**request.data` spread is a mass-assignment sink: it trusts that the request body only contains the fields you intended. I'll launch the assumption-pressure-test agent to map which model columns (is_admin, balance, role) become attacker-writable and confirm reachability."
<agent_launch>
Delegating to assumption-pressure-test for the CWE-915 mass-assignment and the implicit 'the body only has safe fields' assumption.
</agent_launch>
</example>

Proactively suggest using this agent when:
- Code reads request headers (X-Forwarded-For, X-User-Id, X-Real-IP, X-Internal-*, Host) for trust or authorization decisions
- A serializer/ORM uses bulk binding: `**req.body`, `Object.assign`, `ModelMapper`, `BeanUtils.copyProperties`, `update_attributes`, `params.permit!`
- Comments or names assert trust: "internal only", "already validated", "trusted", "comes from gateway", "sanitized upstream"
- Data is stored then later concatenated into SQL/HTML/shell (second-order injection)
- An endpoint takes an `id`/`uuid`/`account`/`order` param that maps to a resource (IDOR / object ownership)

coverage-analyzer · Subagent

Generate gcov coverage data for a code repository.

crash-analysis-agent · Subagent

Analyze security bugs from any C/C++ project with full root-cause tracing.

crash-analyzer · Subagent

Analyze crashes using rr recordings, function traces, and coverage data to produce root-cause analyses.

crash-analysis-checker · Subagent

Carefully check root-cause analysis reports for crashes to make sure they are correct.

exploitability-validator-agent · Subagent

Multi-stage pipeline to validate that vulnerability findings are real, reachable, and exploitable.

federated-identity-breaker · Subagent
