ClaudeWave
Slash Command · 304 repo stars · updated 2d ago

mantis-understand

The mantis-understand command performs adversarial security analysis on code by mapping attack surfaces, tracing data flows, and identifying vulnerability patterns. Use it when conducting threat modeling, security audits, or vulnerability research to gain comprehensive understanding of how data moves through a codebase and where potential security weaknesses exist. Supports single-model in-session analysis or multi-model comparative analysis across different LLMs for higher-confidence findings.

Install in Claude Code
mkdir -p ~/.claude/commands && curl -fsSL https://raw.githubusercontent.com/deonmenezes/mantishack/HEAD/.claude/commands/mantis-understand.md -o ~/.claude/commands/mantis-understand.md
Then start a new Claude Code session; the slash command loads automatically.

mantis-understand.md

# /understand - MANTISHACK Code Understanding

You cannot find bugs in a codebase without a deep, adversarial understanding of it. This command helps you map the attack surface, trace data flows, hunt for vulnerability variants, and more.

This command is a work in progress.

## Usage

```
/understand <target> [--map] [--trace <entry>] [--hunt <pattern>] [--teach <subject>]
                     [--out <dir>] [--model <name> ...]
```

If no mode flag is given, default to `--map`.

### Multi-model mode (opt-in)

For `--hunt` and `--trace`, you can pass one or more `--model` flags to
run independent analyses across multiple LLMs and correlate the results.
Each model produces its own findings; the substrate identifies items
where models agree (high confidence) vs. disagree (worth a closer look).

```
/understand <target> --hunt "<pattern>" --model claude-opus-4-7 --model gpt-5
/understand <target> --trace traces.json --model claude-opus-4-7 --model gpt-5
```
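
The agree/disagree correlation described above can be sketched as follows. This is only an illustration, not the substrate's implementation: the `correlate` function and the `(file, line)` finding key are hypothetical, and the real `hunt-result.json` layout is not shown in this document.

```python
from collections import defaultdict


def correlate(findings_by_model):
    """Split findings into those every model reported and those only some did.

    findings_by_model maps a model name to an iterable of hashable finding
    keys. The (file, line) tuples used below are a hypothetical key shape.
    """
    seen_by = defaultdict(set)
    for model, findings in findings_by_model.items():
        for key in findings:
            seen_by[key].add(model)

    all_models = set(findings_by_model)
    agreed = sorted(k for k, models in seen_by.items() if models == all_models)
    disputed = sorted(k for k, models in seen_by.items() if models != all_models)
    return agreed, disputed


results = {
    "claude-opus-4-7": [("src/db.c", 42), ("src/auth.c", 10)],
    "gpt-5": [("src/db.c", 42)],
}
agreed, disputed = correlate(results)
print("agreed (high confidence):", agreed)
print("disputed (worth a closer look):", disputed)
```

Keying on an exact location is the simplest choice; a real correlator would likely need fuzzier matching, since two models rarely describe the same issue identically.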

**When to dispatch to libexec instead of running in-session:** if the
user passes `--model` AND the mode is `--hunt` or `--trace`, you MUST
run the work via `libexec/mantishack-understand` (multi-model substrate)
rather than doing the analysis here. Without `--model`, or for `--map`
/ `--teach` regardless, follow the in-session workflow below.
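
As a minimal sketch, the dispatch rule above reduces to one predicate. The function name `choose_path` and its return strings are hypothetical, and the sketch assumes a single mode per call, because the rule does not say how combined modes interact with `--model`.

```python
def choose_path(mode, models):
    """Return "libexec" for multi-model hunt/trace runs, else "in-session".

    mode is one of "map", "trace", "hunt", "teach"; models is the list of
    values passed via --model (empty when the flag is absent).
    """
    if models and mode in {"hunt", "trace"}:
        return "libexec"
    return "in-session"


print(choose_path("hunt", ["claude-opus-4-7", "gpt-5"]))
print(choose_path("map", ["gpt-5"]))
print(choose_path("trace", []))
```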

## Execution

**Multi-model path (when `--model` is present with `--hunt` or `--trace`):**

```bash
libexec/mantishack-understand --hunt "<pattern>" --target <resolved_target> \
    --out "$OUTPUT_DIR" --model <name> [--model <name> ...]
```

For `--trace`, point at a JSON file containing the trace list:
```bash
libexec/mantishack-understand --trace <traces.json> --target <resolved_target> \
    --out "$OUTPUT_DIR" --model <name> [--model <name> ...]
```

The shim writes `hunt-result.json` or `trace-result.json` to `$OUTPUT_DIR`
and prints a one-screen summary. After it returns, surface the summary
to the user and point them at the result file.

**In-session path (no `--model`, or `--map` / `--teach`):**

**Step 1: Start the run and build inventory**
```bash
libexec/mantishack-run-lifecycle start understand --target <resolved_target>
```
The last line of output is `OUTPUT_DIR=<path>` — use that for all subsequent steps.

```bash
libexec/mantishack-build-checklist <resolved_target> "$OUTPUT_DIR"
```
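
If Step 1 is scripted rather than typed by hand, the `OUTPUT_DIR=<path>` convention can be parsed along these lines. This is a sketch: `parse_output_dir` is a hypothetical helper and the sample path is invented for the demo.

```python
def parse_output_dir(stdout_text):
    """Extract the path from a final `OUTPUT_DIR=<path>` line.

    Raises ValueError when the last non-empty line does not follow the
    convention, so a failed start is not silently mistaken for a path.
    """
    prefix = "OUTPUT_DIR="
    lines = [line.strip() for line in stdout_text.splitlines() if line.strip()]
    if not lines or not lines[-1].startswith(prefix):
        raise ValueError("last line of output is not OUTPUT_DIR=<path>")
    return lines[-1][len(prefix):]


# Hypothetical captured output from the lifecycle `start` call.
sample = "starting run...\nOUTPUT_DIR=.out/understand-001\n"
print(parse_output_dir(sample))
```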

**Step 2: Do the analysis** (map, trace, hunt, teach — see skill files).

**Step 3: Record coverage** (for `--map` — list every item you examined):

Write a JSON file at `$OUTPUT_DIR/reviewed-items.json` listing every function, global, struct, and macro you analysed, then pass it to the coverage tool:
```json
[
  {"file": "src/auth.c", "item": "check_pw"},
  {"file": "src/auth.c", "item": "credentials"},
  {"file": "src/db.c", "item": "query"}
]
```
```bash
libexec/mantishack-coverage-summary "$OUTPUT_DIR" --mark-file "$OUTPUT_DIR/reviewed-items.json"
```
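
Generating the coverage file with a JSON serializer avoids hand-written syntax errors. The sketch below assumes the list-of-objects shape shown in Step 3; `write_reviewed_items` is a hypothetical helper, and a temporary directory stands in for `$OUTPUT_DIR`.

```python
import json
import tempfile
from pathlib import Path


def write_reviewed_items(output_dir, items):
    """Write (file, item) pairs as a JSON list of {"file", "item"} objects."""
    records = [{"file": file, "item": item} for file, item in items]
    path = Path(output_dir) / "reviewed-items.json"
    # json.dumps never emits comments, so the result is always valid JSON.
    path.write_text(json.dumps(records, indent=2) + "\n", encoding="utf-8")
    return path


# Demo against a throwaway directory standing in for $OUTPUT_DIR.
with tempfile.TemporaryDirectory() as output_dir:
    written = write_reviewed_items(
        output_dir,
        [
            ("src/auth.c", "check_pw"),
            ("src/auth.c", "credentials"),
            ("src/db.c", "query"),
        ],
    )
    print(written.read_text(encoding="utf-8"))
```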

**Step 4: Generate diagrams** (for `--map` or `--trace`):
```bash
libexec/mantishack-render-diagrams "$OUTPUT_DIR"
```

**Step 4.5: Synthesise per-function annotations** (for `--map` or `--trace`):
```bash
libexec/mantishack-understand-annotate "$OUTPUT_DIR"
```
Reads `context-map.json` + any `flow-trace-*.json`, attaches per-function
annotations under `$OUTPUT_DIR/annotations/` for entry points, sinks,
trust boundaries, unchecked flows, and trace steps. Best-effort — exits
0 with "nothing to synthesise" when no inputs are present.

**Step 5: Complete the run.** Replace `<your-model-id>` with the exact model ID from your system prompt (e.g. `claude-opus-4-7`). The flag records which model performed the analysis; only you (the harness) know this, because MANTISHACK's Python can't read `/model`. If you don't know your model ID, drop the `--model` flag entirely: the run still completes, and the model is simply left unrecorded.
```bash
libexec/mantishack-run-lifecycle complete "$OUTPUT_DIR" --model <your-model-id>
```

**On failure** (at any point):
```bash
libexec/mantishack-run-lifecycle fail "$OUTPUT_DIR" "error description"
```
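
The two closing calls can be assembled as argument lists, which keeps the optional `--model` flag and the free-text error description correctly quoted. A sketch only: the helper names are hypothetical, and the argument order simply mirrors the commands shown in Step 5 and the failure case.

```python
LIFECYCLE = "libexec/mantishack-run-lifecycle"


def complete_command(output_dir, model_id=None):
    """Build the `complete` call, omitting --model when the ID is unknown."""
    argv = [LIFECYCLE, "complete", output_dir]
    if model_id:
        argv += ["--model", model_id]
    return argv


def fail_command(output_dir, error):
    """Build the `fail` call with the error description as one argument."""
    return [LIFECYCLE, "fail", output_dir, str(error)]


print(complete_command(".out/run", "claude-opus-4-7"))
print(complete_command(".out/run"))
print(fail_command(".out/run", "render step crashed"))
```

Each list can be passed to `subprocess.run(argv, check=True)` unchanged; no shell is involved, so spaces in the error text need no escaping.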

## Modes

| Flag | What it does |
|------|-------------|
| `--map` | Build context: entry points, trust boundaries, sinks |
| `--trace <entry>` | Trace one data flow source → sink with full call chain |
| `--hunt <pattern>` | Find all variants of a pattern across the codebase |
| `--teach <subject>` | Explain a framework, library, or code pattern in depth |

Modes combine and run in this order: map → trace → hunt → teach. This matches the natural attack progression: build context first, then trace a specific flow, then hunt for variants. For example, `--map --trace EP-001` maps first, then traces the specified entry point.
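
The ordering rule, together with the `--map` default from the Usage section, can be sketched like this. `plan_modes` is a hypothetical name, and the sketch assumes the order is applied no matter how the flags were typed.

```python
MODE_ORDER = ["map", "trace", "hunt", "teach"]


def plan_modes(requested):
    """Return the requested modes in execution order, defaulting to map."""
    unknown = set(requested) - set(MODE_ORDER)
    if unknown:
        raise ValueError(f"unknown modes: {sorted(unknown)}")
    if not requested:
        return ["map"]
    return [mode for mode in MODE_ORDER if mode in requested]


print(plan_modes(["trace", "map"]))
print(plan_modes([]))
print(plan_modes(["teach", "hunt"]))
```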

## Examples

```
# Understand a codebase before scanning it
/understand ./src --map

# Trace a specific endpoint's data flow
/understand ./src --trace "POST /api/v2/query"

# Find all variants of a finding from validation
/understand ./src --hunt FIND-001

# Understand an unfamiliar pattern before tracing
/understand ./src --teach SQLAlchemy

# Full workflow: map, then trace highest-risk flow
/understand ./src --map --trace EP-001

# Hunt for variants, write output for validator to consume
/understand ./src --hunt "cursor.execute with f-string" --out .out/my-validation/
```

## Integration with Validation Pipeline

**Shared inventory:** `--map` runs `build_checklist()` first (MAP-0 step) to produce `checklist.json` with SHA-256 checksums. This is the same inventory used by `/validate` Stage 0. Coverage tracking is cumulative across both skills.
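
To make the idea of a checksummed inventory concrete, here is a minimal sketch that hashes every file under a directory with SHA-256. It is not `build_checklist()` and does not reproduce the `checklist.json` schema, which this document does not show; `sha256_of` and `sketch_inventory` are hypothetical names.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Hash a file in chunks so large sources are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def sketch_inventory(root):
    """Map each file's path (relative to root, POSIX-style) to its SHA-256."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


# Demo on a throwaway tree; the file name and contents are invented.
with tempfile.TemporaryDirectory() as root:
    (Path(root) / "auth.c").write_bytes(b"int check_pw(void);\n")
    for name, digest in sketch_inventory(root).items():
        print(name, digest)
```

Checksums let a later stage detect that a file changed after it was reviewed, which is presumably why the inventory records them.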

Output from `/understand` feeds into Gadi & JC's epic exploitability validation:

- `checklist.json` → shared source inventory with coverage tracking
- `context-map.json` → pre-populates `attack-surface.json` for Stage B
- `flow-trace-*.json` → confirms reachability for Stage C
- `variants.json` → expands `checklist.json` scope for Stage 0
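
A quick way to see which of the artifacts listed above a run actually produced is to glob for them. The sketch below is illustrative only and says nothing about how `/validate` locates these files; `find_understand_outputs` and the demo file name `flow-trace-EP-001.json` are hypothetical.

```python
import tempfile
from pathlib import Path

# Artifact patterns and what each one feeds, taken from the list above.
ARTIFACTS = {
    "checklist.json": "shared source inventory",
    "context-map.json": "attack-surface.json for Stage B",
    "flow-trace-*.json": "reachability for Stage C",
    "variants.json": "checklist.json scope for Stage 0",
}


def find_understand_outputs(out_dir):
    """Return the artifacts present in out_dir, keyed by pattern."""
    out_dir = Path(out_dir)
    found = {}
    for pattern, feeds in ARTIFACTS.items():
        matches = sorted(path.name for path in out_dir.glob(pattern))
        if matches:
            found[pattern] = {"feeds": feeds, "files": matches}
    return found


# Demo with two empty placeholder files in a throwaway directory.
with tempfile.TemporaryDirectory() as out_dir:
    for name in ("context-map.json", "flow-trace-EP-001.json"):
        (Path(out_dir) / name).write_text("{}", encoding="utf-8")
    for pattern, info in find_understand_outputs(out_dir).items():
        print(pattern, "->", info["feeds"], info["files"])
```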

**Automatic bridge:** `/validate` Stage 0 automatically finds and imports `/understand` output. No `--out` alignment

api-abuse-fuzzer (Subagent)

Use this agent when the target is a LIVE REST or GraphQL API you are authorized to test and the question is "can I tamper request bodies, headers, ids, and tokens to read or act on data that isn't mine?" This is active, request-driven abuse of the API contract, not static code review. It drives REAL HTTP at the endpoints:

- BOLA/IDOR object-id enumeration (increment/swap/UUID-shuffle the id and diff the access decision)
- broken function-level authz (replay an admin verb/path with a low-priv token)
- mass-assignment (inject role/is_admin/is_verified/owner_id into the JSON body)
- excessive-data-exposure (the response over-returns fields the UI never shows)
- GraphQL introspection + alias/batch amplification + nested-query DoS
- content-type and HTTP-verb tampering (POST→PUT/PATCH/DELETE, application/json→text/plain→x-www-form-urlencoded)
- JWT/session/token swap across two users
- rate-limit / idempotency-key bypass

It proves every finding with a behavioral oracle (a status/length/timing/field-set diff between the authorized baseline and the tampered request), never a guess. Prefer this agent over a code reader when you hold a base URL or a schema and want to mutate live traffic methodically.

<example>
Context: The user has a running API with numeric resource ids and two test accounts.
user: "Here's our staging API at https://api.staging.acme.test and tokens for user A and user B — can user A read user B's orders?"
assistant: "That's textbook BOLA: same endpoint, swap the object id (or the bearer token) and diff the access decision. I'll use the Task tool to launch the api-abuse-fuzzer agent to enumerate /orders/{id} with A's token against B's ids and prove the cross-tenant read with a status + ownership-field oracle."
<agent_launch>
Delegating to api-abuse-fuzzer: a live authorized API + two tokens + object-id enumeration is its core BOLA/IDOR mission.
</agent_launch>
</example>

<example>
Context: The user exposes a GraphQL endpoint and isn't sure introspection or query batching is locked down.
user: "Our /graphql is behind auth but I want to know if a low-priv user can pull admin fields, brute force via aliases, or knock it over with a deep nested query."
assistant: "GraphQL abuse surface: introspect the schema, alias-batch a login/lookup to bypass per-request rate limits, and send a bounded cyclic nested query as a timing oracle. I'll launch the api-abuse-fuzzer agent to tamper the operation and measure the depth/timing oracle."
<agent_launch>
Delegating to api-abuse-fuzzer for GraphQL introspection, alias/batch amplification, and nested-query DoS against the live endpoint.
</agent_launch>
</example>

Proactively suggest using this agent when:
- a live base URL + an OpenAPI/Swagger/GraphQL schema (or a captured request) is in hand and the target is authorized in-scope
- endpoints take a resource identifier in the path/query/body (/users/{id}, ?account=, {"order_id": ...}): BOLA/IDOR territory
- the user holds 2+ accounts or tokens (low-priv + high-priv, tenant A + tenant B) to run an authorization differential
- there are admin/privileged verbs (DELETE, PUT /admin/*, role-changing mutations) and you want to hit them as a non-admin
- a write endpoint accepts a JSON object: test mass-assignment of role/is_admin/verified/balance/owner_id
- a /graphql endpoint exists (introspection, alias/batch abuse, nested-query DoS, field-level authz)
- the user mentions rate limiting, coupon/OTP brute force, idempotency keys, BOLA, BFLA, mass assignment, or "excessive data exposure"

assumption-pressure-test (Subagent)

Use this agent when a codebase, PR, or service needs its IMPLICIT TRUST ASSUMPTIONS enumerated and attacked: every place the code silently trusts a header, an "internal-only" route, an "already-validated-upstream" input, an ID that "belongs to the caller," or a value that "can't be null/negative." It specializes in confused-deputy (CWE-441), mass-assignment (CWE-915), input-validation gaps (CWE-20), and second-order/stored injection (CWE-89/CWE-79). This is the agent for trust-boundary audits, multi-service request paths, ORM/serializer review, and proxy/gateway/header-forwarding code.

<example>
Context: User has a multi-service backend where an edge proxy forwards auth context downstream.
user: "Our gateway sets X-User-Id from the JWT and the internal services read it to authorize. Can you check this?"
assistant: "This is a classic confused-deputy / trust-boundary surface: the downstream services trust a header they can't verify. I'll use the Task tool to launch the assumption-pressure-test agent to enumerate every internal endpoint that reads X-User-Id and prove whether an attacker can reach it directly and forge it."
<agent_launch>
Delegating to assumption-pressure-test: the request hinges on an implicit 'this header is trusted because the gateway set it' assumption that must be attacked.
</agent_launch>
</example>

<example>
Context: User just merged an ORM update endpoint.
user: "Here's the new profile update handler: User.objects.filter(id=request.user.id).update(**request.data)"
assistant: "That `**request.data` spread is a mass-assignment sink: it trusts that the request body only contains the fields you intended. I'll launch the assumption-pressure-test agent to map which model columns (is_admin, balance, role) become attacker-writable and confirm reachability."
<agent_launch>
Delegating to assumption-pressure-test for the CWE-915 mass-assignment and the implicit 'the body only has safe fields' assumption.
</agent_launch>
</example>

Proactively suggest using this agent when:
- Code reads request headers (X-Forwarded-For, X-User-Id, X-Real-IP, X-Internal-*, Host) for trust or authorization decisions
- A serializer/ORM uses bulk binding: `**req.body`, `Object.assign`, `ModelMapper`, `BeanUtils.copyProperties`, `update_attributes`, `params.permit!`
- Comments or names assert trust: "internal only", "already validated", "trusted", "comes from gateway", "sanitized upstream"
- Data is stored then later concatenated into SQL/HTML/shell (second-order injection)
- An endpoint takes an `id`/`uuid`/`account`/`order` param that maps to a resource (IDOR / object ownership)

coverage-analyzer (Subagent)

Generate gcov coverage data for a code repository.

crash-analysis-agent (Subagent)

Analyze security bugs from any C/C++ project with full root-cause tracing.

crash-analyzer (Subagent)

Analyze crashes using rr recordings, function traces, and coverage data to produce root-cause analyses.

crash-analysis-checker (Subagent)

Carefully review root-cause analysis reports for crashes to make sure they are correct.

exploitability-validator-agent (Subagent)

Multi-stage pipeline to validate that vulnerability findings are real, reachable, and exploitable.

federated-identity-breaker (Subagent)