ClaudeWave
Skill · 304 repo stars · updated 2d ago

commands

This Claude Code skill guides users through documenting and saving a successful custom security testing approach as a reusable specialist skill. Use it after developing a working custom methodology, such as a focused analysis order, a specialized technique, or a domain-specific testing pattern. It captures generalizable patterns, defines trigger conditions, extracts reusable tool combinations, validates token efficiency, and saves the skill to the proper `.claude/skills/` directory for automatic loading in future sessions.

Install in Claude Code
git clone --depth 1 https://github.com/deonmenezes/mantishack /tmp/commands && cp -r /tmp/commands/.claude/commands/mantis-create- ~/.claude/skills/commands
Then start a new Claude Code session; the skill loads automatically.

mantis-create-skill.md

# /create-skill - Save Custom Approach as Reusable Skill

Save a successful custom approach as a reusable specialist skill.

## When to Use

After you've helped the user with a custom security testing approach:
- Custom analysis focus (e.g., "focus on API security only")
- Custom priority order (e.g., "check auth before secrets")
- Custom techniques (e.g., "specific testing methodology")
- Successful findings (approach actually worked)

---

## What This Does

Guides the skill creation process in six steps:

### Step 1: Capture Successful Approach

```
What was successful about this approach?

Examples:
- Custom priorities: Auth → API security → Business logic
- Specific focus: API endpoint authentication testing
- Custom technique: Token generation + endpoint fuzzing
- Domain expertise: Mobile app security patterns
```

### Step 2: Define Skill Parameters

```
Skill name: [descriptive_name]
Trigger keywords: [when should this auto-load?]
Domain: [what type of targets?]

Examples:
- Name: api_security_auth_focus
- Keywords: API, REST, authentication, admin panel
- Domain: Web APIs with authentication
```
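The parameters above can be captured in a small structure before anything is written to disk. The following is a minimal sketch, not part of Claude Code or MANTISHACK: `SkillParams`, `to_skill_name`, and the snake_case rule are hypothetical names and conventions inferred from the `api_security_auth_focus` example.

```python
import re
from dataclasses import dataclass, field


def to_skill_name(title: str) -> str:
    """Turn a free-form title into a snake_case skill name (assumed convention)."""
    # Lowercase, then collapse every run of non-alphanumerics into one underscore.
    return re.sub(r"[^a-z0-9]+", "_", title.lower()).strip("_")


@dataclass
class SkillParams:  # hypothetical container for the three Step 2 fields
    name: str
    keywords: list[str] = field(default_factory=list)
    domain: str = ""

    def problems(self) -> list[str]:
        """Return human-readable problems; an empty list means the parameters look usable."""
        issues = []
        if not re.fullmatch(r"[a-z0-9]+(_[a-z0-9]+)*", self.name):
            issues.append("name should be snake_case")
        if not self.keywords:
            issues.append("at least one trigger keyword is needed")
        if not self.domain.strip():
            issues.append("domain is empty")
        return issues


params = SkillParams(
    name=to_skill_name("API security: auth focus"),
    keywords=["API", "REST", "authentication", "admin panel"],
    domain="Web APIs with authentication",
)
print(params.name, params.problems())
```

Returning a list of problems instead of raising lets the command show every missing field in one pass.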

### Step 3: Extract Reusable Patterns

Review the approach for:
- ✓ Generalizable patterns (not hardcoded to one target)
- ✓ Reusable priorities (applicable to similar targets)
- ✓ Tool combinations (what worked together)
- ✗ Target-specific details (remove these)
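One way to spot target-specific details in a draft is a quick pattern scan. This is a rough sketch under stated assumptions: the three patterns (URLs, IPv4 addresses, email addresses) and the names `TARGET_SPECIFIC` and `find_target_specifics` are illustrative choices, not a list the command itself defines, and a clean scan does not prove the draft is generalized.

```python
import re

# Hypothetical patterns for details that tie a skill to a single target.
TARGET_SPECIFIC = {
    "url": re.compile(r"https?://[^\s)\"']+"),
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}


def find_target_specifics(text: str) -> list[tuple[str, str]]:
    """Return (label, matched_text) pairs for every suspicious detail in the draft."""
    hits = []
    for label, pattern in TARGET_SPECIFIC.items():
        for match in pattern.findall(text):
            hits.append((label, match))
    return hits


draft = "Check auth first, then fuzz https://api.staging.acme.test/orders from 10.0.0.5."
for label, value in find_target_specifics(draft):
    print(f"remove {label}: {value}")
```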

### Step 4: Validate Token Budget

```
Skill size: ___ tokens (must be <500 tokens)
Current total skills: ___ (warn if >10 skills)
Session impact: +___ tokens when auto-loaded
```
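The size check can be approximated without a tokenizer. The sketch below assumes roughly four characters per token, which is only a crude average for English text; the real count depends on the model's tokenizer. The 200-token minimum comes from the guidelines later in this file, and the names `estimate_tokens` and `check_skill_size` are hypothetical.

```python
MIN_TOKENS = 200      # minimum from the Token Budget Guidelines
MAX_TOKENS = 500      # Step 4: skill size must be < 500 tokens
CHARS_PER_TOKEN = 4   # assumption: crude average, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Estimate the token count by dividing the character count, rounding up."""
    return -(-len(text) // CHARS_PER_TOKEN)


def check_skill_size(text: str) -> str:
    """Classify a draft as 'too small', 'ok', or 'too large'."""
    tokens = estimate_tokens(text)
    if tokens >= MAX_TOKENS:
        return "too large"
    if tokens < MIN_TOKENS:
        return "too small"
    return "ok"


draft = "x" * 1520  # stand-in for a draft of about 380 estimated tokens
print(estimate_tokens(draft), check_skill_size(draft))
```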

### Step 5: Create Skill File

Save to: `.claude/skills/[skill_name]/SKILL.md` (the
canonical Claude Code skills directory). For
single-file skills, `.claude/skills/[skill_name].md`
also works — Claude Code's skill discovery walks
`.claude/skills/` and picks up both layouts.

(Before the fix, this step said `tiers/specialists/custom/[name].md`.
That path comes from an earlier MANTISHACK layout that
predates Claude Code's skills mechanism. Files there
aren't auto-loaded by Claude Code today, so saving
new skills there would have made them invisible to
the LLM. Existing files under `tiers/personas/` and
`tiers/specialists/` are still loaded via explicit
prompts, but new skills should go under
`.claude/skills/` for auto-load.)

**Skill structure:**
```markdown
# [Skill Name]
# Created: [date]
# Source: Successful approach from [session]
# Token cost: [X] tokens
# Auto-loads: [trigger keywords]

## Core Philosophy
[What makes this approach unique/successful]

## When to Use
[What types of targets/situations]

## Approach
[Successful priorities, techniques, tools]

## Integration with Python
[How this guides Python execution parameters]
```
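Filling in the structure above and saving it to the directory layout from Step 5 could look like the following. It is a sketch, not the command's actual implementation: `TEMPLATE` mirrors the skeleton shown above, `write_skill` is a hypothetical helper, and the demo writes into a temporary directory so that it leaves no files behind.

```python
import tempfile
from datetime import date
from pathlib import Path

# Mirrors the skill structure shown above; the placeholder names are hypothetical.
TEMPLATE = """\
# {title}
# Created: {created}
# Source: Successful approach from {session}
# Token cost: {tokens} tokens
# Auto-loads: {keywords}

## Core Philosophy
{philosophy}

## When to Use
{when}

## Approach
{approach}

## Integration with Python
{integration}
"""


def write_skill(root: Path, name: str, **fields: str) -> Path:
    """Render the template and save it as <root>/<name>/SKILL.md."""
    path = root / name / "SKILL.md"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(TEMPLATE.format(**fields), encoding="utf-8")
    return path


with tempfile.TemporaryDirectory() as tmp:  # throwaway root for the demo
    saved = write_skill(
        Path(tmp) / ".claude" / "skills",
        "api_auth_priority",
        title="API Auth Priority",
        created=date.today().isoformat(),
        session="[session]",
        tokens="380",
        keywords="API, REST, authentication, auth bypass",
        philosophy="Check auth bypass before anything else.",
        when="Web APIs with authentication.",
        approach="Auth bypass → API security → Input validation",
        integration="[How this guides Python execution parameters]",
    )
    print(saved.relative_to(tmp))
```

In real use, `root` would be `Path(".claude/skills")` in the project or home directory.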

### Step 6: Test Auto-Loading

Verify that the skill will auto-load correctly:
- Keywords defined clearly
- File in correct location
- Token budget acceptable
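The three checks above can be run mechanically before relying on the skill. This sketch only inspects the file; it cannot confirm that Claude Code actually loads it. The `preflight` name, the reliance on the `# Auto-loads:` header line from the skill structure, and the four-characters-per-token estimate are all assumptions.

```python
import tempfile
from pathlib import Path

MAX_TOKENS = 500
CHARS_PER_TOKEN = 4  # assumption: crude estimate, not a real tokenizer


def preflight(skill_file: Path) -> dict[str, bool]:
    """Run the three Step 6 checks against one skill file."""
    text = skill_file.read_text(encoding="utf-8") if skill_file.is_file() else ""
    # Keywords are read from the "# Auto-loads:" header line of the skill file.
    keywords = [
        line.split(":", 1)[1].strip()
        for line in text.splitlines()
        if line.startswith("# Auto-loads:")
    ]
    parts = skill_file.parts
    # The path must contain ".claude/skills/" followed by at least one more component.
    in_skills_dir = any(
        parts[i:i + 2] == (".claude", "skills") for i in range(len(parts) - 2)
    )
    return {
        "keywords_defined": bool(keywords and keywords[0]),
        "correct_location": skill_file.is_file() and in_skills_dir,
        "token_budget_ok": 0 < len(text) / CHARS_PER_TOKEN < MAX_TOKENS,
    }


with tempfile.TemporaryDirectory() as tmp:  # throwaway directory for the demo
    skill = Path(tmp) / ".claude" / "skills" / "api_auth_priority" / "SKILL.md"
    skill.parent.mkdir(parents=True)
    skill.write_text(
        "# API Auth Priority\n# Auto-loads: API, authentication\n" + "x" * 1200,
        encoding="utf-8",
    )
    print(preflight(skill))
```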

---

## Token Budget Guidelines

**Skill size limits:**
- Minimum: 200 tokens (enough for useful content)
- Recommended: 300-400 tokens (sweet spot)
- Maximum: 500 tokens (hard limit, enforced)

**Total skills warning:**
- Yellow (5 skills): 1,500-2,000 tokens in skills
- Red (10 skills): 3,000-4,000 tokens (approaching budget)
- Critical (15+ skills): consider consolidating or removing unused skills
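The token ranges in the warning levels follow from multiplying the skill count by the recommended 300-400 tokens per skill. A short sketch of that arithmetic is below; treating each level as a "this many or more" threshold is an assumption, and `skills_warning` and `projected_tokens` are hypothetical names.

```python
def skills_warning(skill_count: int) -> str:
    """Map a skill count to a warning level (thresholds read as 'at least N')."""
    if skill_count >= 15:
        return "critical"
    if skill_count >= 10:
        return "red"
    if skill_count >= 5:
        return "yellow"
    return "ok"


def projected_tokens(skill_count: int, low: int = 300, high: int = 400) -> tuple[int, int]:
    """Token range if every skill sits in the recommended 300-400 band."""
    return skill_count * low, skill_count * high


for count in (3, 5, 10, 15):
    print(count, skills_warning(count), projected_tokens(count))
```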

---

## Example Skill Creation

```
User: /create-skill

Claude: What successful approach should we save?

User: "We focused on API authentication, checking auth bypass before anything else, and it found critical issues faster"

Claude:
  ✓ Extracting approach patterns...

  Skill preview:
  - Name: api_auth_priority
  - Keywords: API, REST, authentication, auth bypass
  - Focus: Authentication issues in web APIs
  - Priority: Auth bypass → API security → Input validation
  - Size: 380 tokens ✓

  Create this skill? [Y/n/Customize]

User: Y

Claude: ✓ Created: .claude/skills/api_auth_priority/SKILL.md
        Will auto-load when keywords match: API, authentication

        Test it now? Run /scan on API codebase
```

---

## Quality Checks

Before saving the skill:
- [ ] Not overfitted to one target (generalized patterns)
- [ ] Token limit respected (<500 tokens)
- [ ] Keywords defined (will auto-load correctly)
- [ ] Approach documented (clear priorities/techniques)
- [ ] Integration clear (how it guides Python parameters)

---

## Maintenance

Skills are stored in: `.claude/skills/`

**Manage skills:**
- List: `ls .claude/skills/`
- Disable: Add `.disabled` suffix to the skill directory or file
- Remove: Delete the skill directory (or the bare `.md` file)
- Edit: Modify the skill's `SKILL.md` (or single-file `.md`) directly
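The list and disable operations can also be scripted. The sketch below follows the `.disabled` suffix convention described above and handles both the directory layout and the single-file layout. `list_skills` and `disable_skill` are hypothetical helpers, and the demo runs against a temporary directory instead of a real `.claude/skills/`.

```python
import tempfile
from pathlib import Path


def list_skills(root: Path) -> list[str]:
    """Names of everything under the skills directory, like `ls`."""
    return sorted(p.name for p in root.iterdir() if not p.name.startswith("."))


def disable_skill(root: Path, name: str) -> Path:
    """Append '.disabled' to the skill directory or single-file skill."""
    for candidate in (root / name, root / f"{name}.md"):
        if candidate.exists():
            return candidate.rename(candidate.with_name(candidate.name + ".disabled"))
    raise FileNotFoundError(f"no skill named {name!r} under {root}")


with tempfile.TemporaryDirectory() as tmp:  # throwaway directory for the demo
    root = Path(tmp) / ".claude" / "skills"
    (root / "api_auth_priority").mkdir(parents=True)
    (root / "api_auth_priority" / "SKILL.md").write_text("# demo\n", encoding="utf-8")
    print(list_skills(root))
    print(disable_skill(root, "api_auth_priority").name)
```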

(Before the fix, this section pointed at `tiers/specialists/custom/`,
an earlier MANTISHACK layout that predates Claude Code's skills
mechanism. Files there aren't auto-loaded today, so saving
new skills there made them invisible to the LLM.)

**Quarterly review prompt** (if 5+ skills exist):
```
Review custom skills? Usage stats available.
```
api-abuse-fuzzer (Subagent)

Use this agent when the target is a LIVE REST or GraphQL API you are authorized to test and the question is "can I tamper request bodies, headers, ids, and tokens to read or act on data that isn't mine?": active, request-driven abuse of the API contract, not static code review. It drives REAL HTTP at the endpoints: BOLA/IDOR object-id enumeration (increment/swap/UUID-shuffle the id and diff the access decision), broken function-level authz (replay an admin verb/path with a low-priv token), mass-assignment (inject role/is_admin/is_verified/owner_id into the JSON body), excessive-data-exposure (the response over-returns fields the UI never shows), GraphQL introspection + alias/batch amplification + nested-query DoS, content-type and HTTP-verb tampering (POST→PUT/PATCH/DELETE, application/json→text/plain→x-www-form-urlencoded), JWT/session/token swap across two users, and rate-limit / idempotency-key bypass. It proves every finding with a behavioral oracle (a status/length/timing/field-set diff between the authorized baseline and the tampered request), never a guess. Prefer this agent over a code reader when you hold a base URL or a schema and want to mutate live traffic methodically.

<example>
Context: The user has a running API with numeric resource ids and two test accounts.
user: "Here's our staging API at https://api.staging.acme.test and tokens for user A and user B — can user A read user B's orders?"
assistant: "That's textbook BOLA: same endpoint, swap the object id (or the bearer token) and diff the access decision. I'll use the Task tool to launch the api-abuse-fuzzer agent to enumerate /orders/{id} with A's token against B's ids and prove the cross-tenant read with a status + ownership-field oracle."
<agent_launch>
Delegating to api-abuse-fuzzer: a live authorized API + two tokens + object-id enumeration is its core BOLA/IDOR mission.
</agent_launch>
</example>

<example>
Context: The user exposes a GraphQL endpoint and isn't sure introspection or query batching is locked down.
user: "Our /graphql is behind auth but I want to know if a low-priv user can pull admin fields, brute force via aliases, or knock it over with a deep nested query."
assistant: "GraphQL abuse surface: introspect the schema, alias-batch a login/lookup to bypass per-request rate limits, and send a bounded cyclic nested query as a timing oracle. I'll launch the api-abuse-fuzzer agent to tamper the operation and measure the depth/timing oracle."
<agent_launch>
Delegating to api-abuse-fuzzer for GraphQL introspection, alias/batch amplification, and nested-query DoS against the live endpoint.
</agent_launch>
</example>

Proactively suggest using this agent when:
- A live base URL + an OpenAPI/Swagger/GraphQL schema (or a captured request) is in hand and the target is authorized in-scope
- Endpoints take a resource identifier in the path/query/body (/users/{id}, ?account=, {"order_id": ...}), which is BOLA/IDOR territory
- The user holds 2+ accounts or tokens (low-priv + high-priv, tenant A + tenant B) to run an authorization differential
- There are admin/privileged verbs (DELETE, PUT /admin/*, role-changing mutations) and you want to hit them as a non-admin
- A write endpoint accepts a JSON object (test mass-assignment of role/is_admin/verified/balance/owner_id)
- A /graphql endpoint exists (introspection, alias/batch abuse, nested-query DoS, field-level authz)
- The user mentions rate limiting, coupon/OTP brute force, idempotency keys, BOLA, BFLA, mass assignment, or "excessive data exposure"

assumption-pressure-test (Subagent)

Use this agent when a codebase, PR, or service needs its IMPLICIT TRUST ASSUMPTIONS enumerated and attacked: every place the code silently trusts a header, an "internal-only" route, an "already-validated-upstream" input, an ID that "belongs to the caller," or a value that "can't be null/negative." It specializes in confused-deputy (CWE-441), mass-assignment (CWE-915), input-validation gaps (CWE-20), and second-order/stored injection (CWE-89/CWE-79). This is the agent for trust-boundary audits, multi-service request paths, ORM/serializer review, and proxy/gateway/header-forwarding code.

<example>
Context: User has a multi-service backend where an edge proxy forwards auth context downstream.
user: "Our gateway sets X-User-Id from the JWT and the internal services read it to authorize. Can you check this?"
assistant: "This is a classic confused-deputy / trust-boundary surface: the downstream services trust a header they can't verify. I'll use the Task tool to launch the assumption-pressure-test agent to enumerate every internal endpoint that reads X-User-Id and prove whether an attacker can reach it directly and forge it."
<agent_launch>
Delegating to assumption-pressure-test: the request hinges on an implicit 'this header is trusted because the gateway set it' assumption that must be attacked.
</agent_launch>
</example>

<example>
Context: User just merged an ORM update endpoint.
user: "Here's the new profile update handler: User.objects.filter(id=request.user.id).update(**request.data)"
assistant: "That `**request.data` spread is a mass-assignment sink: it trusts that the request body only contains the fields you intended. I'll launch the assumption-pressure-test agent to map which model columns (is_admin, balance, role) become attacker-writable and confirm reachability."
<agent_launch>
Delegating to assumption-pressure-test for the CWE-915 mass-assignment and the implicit 'the body only has safe fields' assumption.
</agent_launch>
</example>

Proactively suggest using this agent when:
- Code reads request headers (X-Forwarded-For, X-User-Id, X-Real-IP, X-Internal-*, Host) for trust or authorization decisions
- A serializer/ORM uses bulk binding: `**req.body`, `Object.assign`, `ModelMapper`, `BeanUtils.copyProperties`, `update_attributes`, `params.permit!`
- Comments or names assert trust: "internal only", "already validated", "trusted", "comes from gateway", "sanitized upstream"
- Data is stored then later concatenated into SQL/HTML/shell (second-order injection)
- An endpoint takes an `id`/`uuid`/`account`/`order` param that maps to a resource (IDOR / object ownership)

coverage-analyzer (Subagent)

Generate gcov coverage data for a code repository.

crash-analysis-agent (Subagent)

Analyze security bugs from any C/C++ project with full root-cause tracing.

crash-analyzer (Subagent)

Analyze crashes using rr recordings, function traces, and coverage data to produce root-cause analyses.

crash-analysis-checker (Subagent)

Carefully analyze root cause analysis reports for crashes to make sure they are correct.

exploitability-validator-agent (Subagent)

Multi-stage pipeline to validate vulnerability findings are real, reachable, and exploitable.

federated-identity-breaker (Subagent)