code-understanding
The code-understanding skill enables security researchers to perform deep adversarial analysis of codebases by mapping architecture, tracing data flows from input to sink, and identifying vulnerability variants. Use it before static analysis to build contextual understanding, during validation to confirm findings through actual code paths, or after discovering a vulnerability to hunt for similar patterns throughout the application.
git clone --depth 1 https://github.com/deonmenezes/mantishack /tmp/code-understanding && cp -r /tmp/code-understanding/.claude/skills/code-understanding ~/.claude/skills/code-understanding
SKILL.md
# Code Understanding Skill

You are a deep thinker. This skill gives you adversarial code comprehension that makes you a more effective security researcher. It helps you map architecture, trace important data flows, and hunt for vulnerability variants before or alongside static analysis.

## Purpose

Complements scanning by building ground-truth knowledge of how code actually works:

- Understand unfamiliar codebases quickly from an attacker's perspective
- Trace exact data flows from untrusted input to dangerous sinks
- Find all instances of a vulnerable pattern once one is identified
- Build application context that improves scan signal and validation accuracy

## When to Use

- **Before scanning**: Build context so scanner results make sense immediately
- **During validation**: Trace a finding's real path through the code
- **After a finding**: Hunt for variants of the same pattern elsewhere
- **On unfamiliar code**: Map architecture before launching any analysis

## Modes

| Mode | Command flag | Purpose |
|------|-------------|---------|
| **Map** | `--map` | Build high-level context: entry points, trust model, data paths |
| **Trace** | `--trace <entry>` | Follow one flow source → sink with full call chain |
| **Hunt** | `--hunt <pattern>` | Find all variants of a pattern across the codebase |
| **Teach** | `--teach` | Explain unfamiliar code, frameworks, or patterns in depth |

Modes can be combined. Map → Trace → Hunt is the natural attack progression.

---

## [CONFIG] Configuration

```yaml
output_dir: resolved by mantishack-run-lifecycle start understand
confidence_levels:
  high: "Direct code evidence — quote the line"
  medium: "Inferred from context — state the assumption"
  low: "Speculative — flag explicitly, verify before acting on"
flow_format: source → transform(s) → sink
```

---

## [EXEC] Execution Rules

1. Read actual code before making any claim. Do not rely on naming conventions or assumptions.
2. Quote the exact line (file path + line number) as proof for every assertion.
3. When tracing a flow, follow it until it terminates; don't stop at the first interesting function.
4. When hunting variants, search the full codebase. Do not stop at the first match.
5. When teaching, explain the mechanism, not just the name. Show the code that implements it.
6. Produce structured output (context-map.json, flow-trace.json, variants.json) for integration with the validation pipeline.
7. **libexec scripts:** Run `libexec/` scripts exactly as shown in the prompts: do not prepend `bash`, `export` commands, absolute paths, or additional shell logic. The permission system auto-approves `libexec/mantishack-*` commands only when run in this exact form.

---

## [GATES] MUST-GATEs

**GATE-U1 [READ-FIRST]:** Never describe how code works without reading it. If you haven't read a file, say so and read it before continuing.

**GATE-U2 [ATTACKER-LENS]:** When reading any code path, ask: where does trust transfer? Where are checks missing? Where does user input influence execution? These questions drive analysis, not just "does this code do what the comment says."

**GATE-U3 [FULL-FLOW]:** When tracing a data flow, follow every branch: happy path, error paths, middleware, async handlers. A missing check in an error path is still a missing check.

**GATE-U4 [VARIANT-COMPLETE]:** A variant hunt is not complete until the full codebase has been searched. If a pattern appears in one place, assume it appears in others until proven otherwise.

**GATE-U5 [EVIDENCE-ONLY]:** Confidence levels must match evidence. High confidence requires a quoted line. Medium requires a stated assumption. Low must be flagged and not acted on until verified.

---

## [STYLE] Output Formatting

- File references: `path/to/file.py:42` format throughout
- Flow format: `source (file:line) → transform (file:line) → sink (file:line)`
- Confidence inline: `(confidence: high — file:line)` or `(confidence: medium — assumed from X)`
- No red/green status indicators (perspective-dependent)
- JSON outputs go to `$WORKDIR/` for pipeline integration

---

## Integration with Validation Pipeline

**Shared inventory:** MAP-0 runs `build_checklist()` to produce `checklist.json` with SHA-256 checksums per file. This is the same inventory used by `/validate` Stage 0. Coverage tracking (`checked_by` per function) is cumulative across both skills.

Output schemas are aligned with the validation pipeline's formats (`attack-surface.json`, `attack-paths.json`, `findings.json`).

---

## Stages

| Stage | Mode | Gate(s) | Output |
|-------|------|---------|--------|
| **Map** | `--map` | U1, U2 | `context-map.json` |
| **Trace** | `--trace` | U1, U2, U3, U5 | `flow-trace-<id>.json` |
| **Hunt** | `--hunt` | U1, U4, U5 | `variants.json` |
| **Teach** | `--teach` | U1, U5 | none (inline output) |

See stage-specific files for detailed instructions.

### Optional: runtime probe (Map only)

If the target has a runnable binary, MAP-7 in `map.md` describes how to corroborate the static map with a `sandbox(observe=True)` probe. The runtime observation lands under a `runtime_observation` key in `context-map.json` with correlations against entry points and sinks. An entry point whose file the binary actually reads is "runtime-confirmed" rather than only structurally identified. Skip when the target is library/source-only or when the operator has no consent to execute the binary.

---

## Notice

This analysis is performed for defensive purposes, security research, and authorized security testing only.
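
The flow format and the GATE-U5 evidence rule described in the skill can be sketched in a few lines of Python. This is only an illustration, not part of the skill: `Step`, `format_flow`, and `is_actionable` are hypothetical names, the paths and line numbers are invented, and treating "medium" as actionable once its assumption is stated is one reading of the gate, not something the skill spells out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Step:
    """One hop in a traced flow, with its file:line reference."""
    role: str  # "source", "transform", or "sink"
    path: str
    line: int


def format_flow(steps: List[Step]) -> str:
    """Render steps as 'source (file:line) → transform (file:line) → sink (file:line)'."""
    return " → ".join(f"{s.role} ({s.path}:{s.line})" for s in steps)


def is_actionable(level: str,
                  quoted_ref: Optional[str] = None,
                  assumption: Optional[str] = None) -> bool:
    """Check that a confidence level carries the evidence GATE-U5 asks for.

    Raises ValueError when the required evidence is missing. Returns False
    for "low", which has to be verified before anyone acts on it.
    """
    if level == "high":
        if not quoted_ref:
            raise ValueError("high confidence requires a quoted file:line")
        return True
    if level == "medium":
        if not assumption:
            raise ValueError("medium confidence requires a stated assumption")
        return True  # assumption of this sketch: medium is usable once stated
    if level == "low":
        return False
    raise ValueError(f"unknown confidence level: {level!r}")


# Hypothetical paths and line numbers, for illustration only.
flow = [
    Step("source", "app/routes.py", 42),
    Step("transform", "app/forms.py", 17),
    Step("sink", "app/db.py", 88),
]
print(format_flow(flow))
print(is_actionable("high", quoted_ref="app/db.py:88"))
```

Raising on missing evidence, instead of silently downgrading the level, keeps a claim from being recorded at a confidence it has not earned.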
Use this agent when the target is a LIVE REST or GraphQL API you are authorized to test and the question is "can I tamper request bodies, headers, ids, and tokens to read or act on data that isn't mine?" That means active, request-driven abuse of the API contract, not static code review. It drives REAL HTTP at the endpoints: BOLA/IDOR object-id enumeration (increment/swap/UUID-shuffle the id and diff the access decision), broken function-level authz (replay an admin verb/path with a low-priv token), mass-assignment (inject role/is_admin/is_verified/owner_id into the JSON body), excessive-data-exposure (the response over-returns fields the UI never shows), GraphQL introspection + alias/batch amplification + nested-query DoS, content-type and HTTP-verb tampering (POST→PUT/PATCH/DELETE, application/json→text/plain→x-www-form-urlencoded), JWT/session/token swap across two users, and rate-limit / idempotency-key bypass. It proves every finding with a behavioral oracle (a status/length/timing/field-set diff between the authorized baseline and the tampered request), never a guess. Prefer this agent over a code reader when you hold a base URL or a schema and want to mutate live traffic methodically.

<example>
Context: The user has a running API with numeric resource ids and two test accounts.
user: "Here's our staging API at https://api.staging.acme.test and tokens for user A and user B — can user A read user B's orders?"
assistant: "That's textbook BOLA: same endpoint, swap the object id (or the bearer token) and diff the access decision. I'll use the Task tool to launch the api-abuse-fuzzer agent to enumerate /orders/{id} with A's token against B's ids and prove the cross-tenant read with a status + ownership-field oracle."
<agent_launch>
Delegating to api-abuse-fuzzer: a live authorized API + two tokens + object-id enumeration is its core BOLA/IDOR mission.
</agent_launch>
</example>

<example>
Context: The user exposes a GraphQL endpoint and isn't sure introspection or query batching is locked down.
user: "Our /graphql is behind auth but I want to know if a low-priv user can pull admin fields, brute force via aliases, or knock it over with a deep nested query."
assistant: "GraphQL abuse surface: introspect the schema, alias-batch a login/lookup to bypass per-request rate limits, and send a bounded cyclic nested query as a timing oracle. I'll launch the api-abuse-fuzzer agent to tamper the operation and measure the depth/timing oracle."
<agent_launch>
Delegating to api-abuse-fuzzer for GraphQL introspection, alias/batch amplification, and nested-query DoS against the live endpoint.
</agent_launch>
</example>

Proactively suggest using this agent when: a live base URL + an OpenAPI/Swagger/GraphQL schema (or a captured request) is in hand and the target is authorized in-scope; endpoints take a resource identifier in the path/query/body (/users/{id}, ?account=, {"order_id": ...}), which is BOLA/IDOR territory; the user holds 2+ accounts or tokens (low-priv + high-priv, tenant A + tenant B) to run an authorization differential; there are admin/privileged verbs (DELETE, PUT /admin/*, role-changing mutations) and you want to hit them as a non-admin; a write endpoint accepts a JSON object (test mass-assignment of role/is_admin/verified/balance/owner_id); a /graphql endpoint exists (introspection, alias/batch abuse, nested-query DoS, field-level authz); or the user mentions rate limiting, coupon/OTP brute force, idempotency keys, BOLA, BFLA, mass assignment, or "excessive data exposure".
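
The behavioral oracle mentioned above, a diff between the authorized baseline and the tampered request, could look roughly like the following. This is a sketch under assumptions: `Observation` and `oracle_diff` are hypothetical names, no HTTP is sent, and timing is left out because it needs repeated measurement to be meaningful. The sample values are invented.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet


@dataclass(frozen=True)
class Observation:
    """What was observed for one request: status, body size, top-level JSON fields."""
    status: int
    body_length: int
    fields: FrozenSet[str]


def oracle_diff(baseline: Observation, tampered: Observation) -> Dict[str, object]:
    """Return only the signals that differ between baseline and tampered responses."""
    diff: Dict[str, object] = {}
    if baseline.status != tampered.status:
        diff["status"] = (baseline.status, tampered.status)
    if baseline.body_length != tampered.body_length:
        diff["body_length"] = (baseline.body_length, tampered.body_length)
    added = sorted(tampered.fields - baseline.fields)
    removed = sorted(baseline.fields - tampered.fields)
    if added:
        diff["fields_added"] = added
    if removed:
        diff["fields_removed"] = removed
    return diff


# Invented sample data: user B reading B's own order (authorized baseline),
# and user A requesting the same order id (tampered).
owner_read = Observation(200, 512, frozenset({"id", "owner_id", "total"}))
cross_read_denied = Observation(403, 41, frozenset({"error"}))

print(oracle_diff(owner_read, cross_read_denied))  # access decision changed
print(oracle_diff(owner_read, owner_read))         # {} means identical responses
```

How a diff is read depends on the test. For a cross-user read, an empty diff suggests the server returned the same data to the wrong caller, while a changed status and field set suggests the check held.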
Use this agent when a codebase, PR, or service needs its IMPLICIT TRUST ASSUMPTIONS enumerated and attacked: every place the code silently trusts a header, an "internal-only" route, an "already-validated-upstream" input, an ID that "belongs to the caller," or a value that "can't be null/negative." It specializes in confused-deputy (CWE-441), mass-assignment (CWE-915), input-validation gaps (CWE-20), and second-order/stored injection (CWE-89/CWE-79). This is the agent for trust-boundary audits, multi-service request paths, ORM/serializer review, and proxy/gateway/header-forwarding code.

<example>
Context: User has a multi-service backend where an edge proxy forwards auth context downstream.
user: "Our gateway sets X-User-Id from the JWT and the internal services read it to authorize. Can you check this?"
assistant: "This is a classic confused-deputy / trust-boundary surface: the downstream services trust a header they can't verify. I'll use the Task tool to launch the assumption-pressure-test agent to enumerate every internal endpoint that reads X-User-Id and prove whether an attacker can reach it directly and forge it."
<agent_launch>
Delegating to assumption-pressure-test: the request hinges on an implicit 'this header is trusted because the gateway set it' assumption that must be attacked.
</agent_launch>
</example>

<example>
Context: User just merged an ORM update endpoint.
user: "Here's the new profile update handler: User.objects.filter(id=request.user.id).update(**request.data)"
assistant: "That `**request.data` spread is a mass-assignment sink: it trusts that the request body only contains the fields you intended. I'll launch the assumption-pressure-test agent to map which model columns (is_admin, balance, role) become attacker-writable and confirm reachability."
<agent_launch>
Delegating to assumption-pressure-test for the CWE-915 mass-assignment and the implicit 'the body only has safe fields' assumption.
</agent_launch>
</example>

Proactively suggest using this agent when:
- Code reads request headers (X-Forwarded-For, X-User-Id, X-Real-IP, X-Internal-*, Host) for trust or authorization decisions
- A serializer/ORM uses bulk binding: `**req.body`, `Object.assign`, `ModelMapper`, `BeanUtils.copyProperties`, `update_attributes`, `params.permit!`
- Comments or names assert trust: "internal only", "already validated", "trusted", "comes from gateway", "sanitized upstream"
- Data is stored then later concatenated into SQL/HTML/shell (second-order injection)
- An endpoint takes an `id`/`uuid`/`account`/`order` param that maps to a resource (IDOR / object ownership)
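
To make the mass-assignment assumption concrete, here is a minimal sketch of the difference between spreading a request body into an update and filtering it through an allowlist first. It uses plain dictionaries, not a real ORM; `ALLOWED_PROFILE_FIELDS` and `split_update` are hypothetical names, and the field names are examples only.

```python
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical allowlist: the only columns a user may change on their own profile.
ALLOWED_PROFILE_FIELDS: FrozenSet[str] = frozenset({"display_name", "bio"})


def split_update(
    payload: Dict[str, object],
    allowed: FrozenSet[str] = ALLOWED_PROFILE_FIELDS,
) -> Tuple[Dict[str, object], List[str]]:
    """Separate a request body into fields safe to bind and fields to reject."""
    safe = {key: value for key, value in payload.items() if key in allowed}
    rejected = sorted(key for key in payload if key not in allowed)
    return safe, rejected


# A body that would make is_admin and role attacker-writable if it were
# spread straight into an update call.
body = {"bio": "hello", "is_admin": True, "role": "admin"}
safe_fields, rejected_fields = split_update(body)

print(safe_fields)      # only allowlisted keys remain
print(rejected_fields)  # keys that relied on the "body only has safe fields" assumption
```

Returning the rejected keys, instead of dropping them silently, gives a reviewer direct evidence of which columns the unfiltered handler would have exposed.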
Generate gcov coverage data for a code repository.
Analyze security bugs in any C/C++ project with full root-cause tracing.
Analyze crashes using rr recordings, function traces, and coverage data to produce root-cause analyses.
Carefully review root-cause analysis reports for crashes to confirm they are correct.
Multi-stage pipeline to validate that vulnerability findings are real, reachable, and exploitable.