This Claude Code skill guides users through documenting and saving successful custom security testing approaches as reusable specialist skills. Use it after developing a working custom methodology (such as a focused analysis order, a specialized technique, or a domain-specific testing pattern). It captures generalizable patterns, defines trigger conditions, extracts reusable tool combinations, validates the token budget, and saves the skill to the `.claude/skills/` directory for automatic loading in future sessions.
git clone --depth 1 https://github.com/deonmenezes/mantishack /tmp/commands && cp -r /tmp/commands/.claude/commands/mantis-create- ~/.claude/skills/commandsmantis-create-skill.md
# /create-skill - Save Custom Approach as Reusable Skill
Save a successful custom approach as a reusable specialist skill.
## When to Use
After you've helped the user with a custom security testing approach:
- Custom analysis focus (e.g., "focus on API security only")
- Custom priority order (e.g., "check auth before secrets")
- Custom techniques (e.g., "specific testing methodology")
- Successful findings (approach actually worked)
---
## What This Does
Guides the skill creation process in six steps:
### Step 1: Capture Successful Approach
```
What was successful about this approach?
Examples:
- Custom priorities: Auth → API security → Business logic
- Specific focus: API endpoint authentication testing
- Custom technique: Token generation + endpoint fuzzing
- Domain expertise: Mobile app security patterns
```
### Step 2: Define Skill Parameters
```
Skill name: [descriptive_name]
Trigger keywords: [when should this auto-load?]
Domain: [what type of targets?]
Examples:
- Name: api_security_auth_focus
- Keywords: API, REST, authentication, admin panel
- Domain: Web APIs with authentication
```
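The parameters above can be held in a small record and sanity-checked before moving on. The sketch below is illustrative only: `SkillParams`, `problems`, and `matches` are hypothetical names, the snake_case rule is an assumption inferred from the example names, and the substring match is a simplified stand-in rather than how Claude Code actually decides when to load a skill.

```python
import re
from dataclasses import dataclass, field

# Assumption: skill names are snake_case, like the examples in this document.
NAME_RE = re.compile(r"^[a-z][a-z0-9_]*$")


@dataclass
class SkillParams:
    """Hypothetical container for the three Step 2 parameters."""
    name: str
    keywords: list[str] = field(default_factory=list)
    domain: str = ""

    def problems(self) -> list[str]:
        """Return human-readable issues; an empty list means the parameters look usable."""
        issues = []
        if not NAME_RE.match(self.name):
            issues.append("name should be snake_case")
        if not self.keywords:
            issues.append("at least one trigger keyword is needed")
        if not self.domain.strip():
            issues.append("domain is empty")
        return issues

    def matches(self, request: str) -> bool:
        """Naive case-insensitive substring check of the keywords against a request."""
        text = request.lower()
        return any(keyword.lower() in text for keyword in self.keywords)
```

Substring matching is deliberately crude (a short keyword such as "API" also matches inside longer words), so treat it only as a quick way to see whether the chosen keywords are too broad or too narrow.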
### Step 3: Extract Reusable Patterns
Review the approach for:
- ✓ Generalizable patterns (not hardcoded to one target)
- ✓ Reusable priorities (applicable to similar targets)
- ✓ Tool combinations (what worked together)
- ✗ Target-specific details (remove these)
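One way to catch target-specific details before saving is to scan the draft for strings that usually tie it to a single engagement. This is a minimal sketch under stated assumptions: the three patterns (URLs, IPv4 addresses, email addresses) and the function name `find_target_specifics` are hypothetical choices, not an exhaustive or official list.

```python
import re

# Hypothetical patterns for details that usually belong to one target only.
TARGET_SPECIFIC = {
    "url": re.compile(r"https?://[^\s)\"']+"),
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}


def find_target_specifics(draft: str) -> list[tuple[str, str]]:
    """Return (label, matched_text) pairs for every target-specific hit in the draft."""
    hits = []
    for label, pattern in TARGET_SPECIFIC.items():
        for match in pattern.finditer(draft):
            hits.append((label, match.group(0)))
    return hits
```

An empty result does not prove the draft is generalized; it only means none of these obvious markers were found.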
### Step 4: Validate Token Budget
```
Skill size: ___ tokens (maximum 500 tokens)
Current total skills: ___ (warn if >10 skills)
Session impact: +___ tokens when auto-loaded
```
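The skill-size figure can be approximated without a tokenizer. The sketch below assumes roughly four characters per token, which is only a rough heuristic, and uses the 200 and 500 token bounds from the Token Budget Guidelines section; the function names are hypothetical.

```python
import math

MIN_TOKENS = 200      # minimum from the Token Budget Guidelines
MAX_TOKENS = 500      # hard limit from the Token Budget Guidelines
CHARS_PER_TOKEN = 4   # assumption: rough heuristic, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Approximate the token count from character length."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def check_budget(text: str) -> str:
    """Classify a skill draft against the size limits (exactly 500 is treated as allowed)."""
    tokens = estimate_tokens(text)
    if tokens > MAX_TOKENS:
        return f"too large: ~{tokens} tokens (limit {MAX_TOKENS})"
    if tokens < MIN_TOKENS:
        return f"too small: ~{tokens} tokens (minimum {MIN_TOKENS})"
    return f"ok: ~{tokens} tokens"
```

Because the estimate is approximate, a draft that lands close to the limit is worth trimming rather than trusting the number.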
### Step 5: Create Skill File
Save to: `.claude/skills/[skill_name]/SKILL.md` (the canonical Claude Code skills directory). For single-file skills, `.claude/skills/[skill_name].md` also works: Claude Code's skill discovery walks `.claude/skills/` and picks up both layouts.
(This step previously said `tiers/specialists/custom/[name].md`. That path comes from an earlier MANTISHACK layout that predates Claude Code's skills mechanism. Files there aren't auto-loaded by Claude Code today, so new skills saved there would be invisible to the LLM. Existing files under `tiers/personas/` and `tiers/specialists/` are still loaded via explicit prompts, but new skills should go under `.claude/skills/` for auto-load.)
**Skill structure:**
```markdown
# [Skill Name]
# Created: [date]
# Source: Successful approach from [session]
# Token cost: [X] tokens
# Auto-loads: [trigger keywords]
## Core Philosophy
[What makes this approach unique/successful]
## When to Use
[What types of targets/situations]
## Approach
[Successful priorities, techniques, tools]
## Integration with Python
[How this guides Python execution parameters]
```
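Filling in the structure above and writing it to the directory layout from this step could look like the following. It is a sketch, not the command's actual implementation: `write_skill`, the `fields` keys, and the `root` argument are hypothetical, and the template simply mirrors the headings shown above.

```python
from datetime import date
from pathlib import Path

# Mirrors the skill structure shown above; the placeholder names are hypothetical.
TEMPLATE = """# {title}
# Created: {created}
# Source: Successful approach from {source}
# Token cost: {tokens} tokens
# Auto-loads: {keywords}
## Core Philosophy
{philosophy}
## When to Use
{when}
## Approach
{approach}
## Integration with Python
{integration}
"""


def write_skill(root: Path, name: str, fields: dict) -> Path:
    """Render the template and save it as <root>/.claude/skills/<name>/SKILL.md."""
    values = {"created": date.today().isoformat(), **fields}
    skill_path = root / ".claude" / "skills" / name / "SKILL.md"
    skill_path.parent.mkdir(parents=True, exist_ok=True)
    skill_path.write_text(TEMPLATE.format(**values), encoding="utf-8")
    return skill_path
```

Passing the project root explicitly keeps the function easy to try in a temporary directory before pointing it at a real repository.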
### Step 6: Test Auto-Loading
Verify that the skill will auto-load correctly:
- Keywords defined clearly
- File in correct location
- Token budget acceptable
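The first two checks above can be done mechanically. The sketch below assumes the two layouts named in Step 5 and the `# Auto-loads:` header line from the skill structure; `location_ok` and `keywords_defined` are hypothetical helper names, and passing these checks does not guarantee that Claude Code will load the skill.

```python
from pathlib import PurePosixPath


def location_ok(path: str) -> bool:
    """True if the path ends in .claude/skills/<name>/SKILL.md or .claude/skills/<name>.md."""
    parts = PurePosixPath(path).parts
    # Directory layout: .claude/skills/<name>/SKILL.md
    if parts[-4:-2] == (".claude", "skills") and parts[-1] == "SKILL.md":
        return True
    # Single-file layout: .claude/skills/<name>.md
    return parts[-3:-1] == (".claude", "skills") and parts[-1].endswith(".md")


def keywords_defined(skill_text: str) -> list[str]:
    """Return the keywords from the '# Auto-loads:' line, or an empty list if none are set."""
    for line in skill_text.splitlines():
        if line.startswith("# Auto-loads:"):
            raw = line.split(":", 1)[1]
            return [keyword.strip() for keyword in raw.split(",") if keyword.strip()]
    return []
```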
---
## Token Budget Guidelines
**Skill size limits:**
- Minimum: 200 tokens (enough for useful content)
- Recommended: 300-400 tokens (sweet spot)
- Maximum: 500 tokens (hard limit, enforced)
**Total skills warning:**
- Yellow (5 skills): 1,500-2,000 tokens in skills
- Red (10 skills): 3,000-4,000 tokens (approaching budget)
- Critical (15+ skills): Consider consolidating or removing unused
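The warning levels above map directly onto a small lookup. In this sketch the listed counts (5, 10, 15) are treated as lower bounds for each level, which is an interpretation rather than something the guidelines state, and the token range is derived from the recommended 300-400 tokens per skill; both function names are hypothetical.

```python
def skills_warning(skill_count: int) -> str:
    """Map the number of installed skills to a warning level."""
    if skill_count >= 15:
        return "critical"
    if skill_count >= 10:
        return "red"
    if skill_count >= 5:
        return "yellow"
    return "ok"


def estimated_total_tokens(skill_count: int) -> tuple[int, int]:
    """Estimate the total token range, assuming each skill is 300-400 tokens."""
    return (skill_count * 300, skill_count * 400)
```

For five skills this gives the same 1,500-2,000 token range quoted for the yellow level.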
---
## Example Skill Creation
```
User: /create-skill
Claude: What successful approach should we save?
User: "We focused on API authentication, checking auth bypass before anything else, and it found critical issues faster"
Claude:
✓ Extracting approach patterns...
Skill preview:
- Name: api_auth_priority
- Keywords: API, REST, authentication, auth bypass
- Focus: Authentication issues in web APIs
- Priority: Auth bypass → API security → Input validation
- Size: 380 tokens ✓
Create this skill? [Y/n/Customize]
User: Y
Claude: ✓ Created: .claude/skills/api_auth_priority/SKILL.md
Will auto-load when keywords match: API, authentication
Test it now? Run /scan on API codebase
```
---
## Quality Checks
Before saving the skill:
- [ ] Not overfitted to one target (generalized patterns)
- [ ] Token limit respected (maximum 500 tokens)
- [ ] Keywords defined (will auto-load correctly)
- [ ] Approach documented (clear priorities/techniques)
- [ ] Integration clear (how it guides Python parameters)
---
## Maintenance
Skills are stored in: `.claude/skills/`
**Manage skills:**
- List: `ls .claude/skills/`
- Disable: Add `.disabled` suffix to the skill directory or file
- Remove: Delete the skill directory (or the bare `.md` file)
- Edit: Modify the skill's `SKILL.md` (or single-file `.md`) directly
(This section previously pointed at `tiers/specialists/custom/`, an earlier MANTISHACK layout that predates Claude Code's skills mechanism. Files there aren't auto-loaded today, so new skills saved there were invisible to the LLM.)
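The list and disable operations above can also be scripted. This is a minimal sketch that follows the `.disabled` suffix convention described in this section; `list_skills` and `disable_skill` are hypothetical helpers, and the skills directory is passed in explicitly so the functions can be tried on a scratch directory first.

```python
from pathlib import Path


def list_skills(skills_dir: Path) -> list[str]:
    """Return the names of active skills (entries without the .disabled suffix)."""
    return sorted(
        entry.name
        for entry in skills_dir.iterdir()
        if not entry.name.endswith(".disabled")
    )


def disable_skill(skills_dir: Path, name: str) -> Path:
    """Rename a skill directory or file by appending the .disabled suffix."""
    source = skills_dir / name
    target = source.with_name(source.name + ".disabled")
    source.rename(target)
    return target
```

Renaming keeps the skill's content on disk, so re-enabling it is just a matter of removing the suffix again.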
**Quarterly review prompt** (if 5+ skills exist):
```
Review custom skills? Usage stats available.
```