security-research
The security-research skill orchestrates a parallel vulnerability audit in team mode. Three specialized hunters map attack surfaces, look for auth and injection flaws, and detect runtime and supply-chain risks; two PoC engineers then validate the findings with reproducible exploits. Use this skill when auditing a repository, pull request, release candidate, or specific code path for real exploitability rather than generic security concerns. Findings are classified and scored against CWE, OWASP, and CVSS references to separate credible threats from theoretical issues.
git clone --depth 1 https://github.com/code-yeongyu/oh-my-openagent /tmp/security-research && cp -r /tmp/security-research/packages/omo-opencode/src/features/builtin-skills/security-research ~/.claude/skills/security-research
SKILL.md
# Security Research - Team Mode Vulnerability Audit
Use this skill to run a parallel security audit that separates real exploitability from generic concern. The team has 3 vulnerability hunters and 2 PoC engineers.
## Hard Preconditions
Before starting, verify:
1. `team_*` tools are available. If not, stop and tell the user:
`security-research requires team-mode. Set team_mode.enabled: true in your oh-my-openagent config, restart opencode, then retry.`
2. You are in the main session, not a background subagent.
3. You have a concrete target: repository, diff range, PR, release candidate, path list, or threat surface.
If the user provided no target, audit the current repository and current branch diff against its upstream or merge base. If there is no diff, audit the security-sensitive surfaces in the working tree.
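The fallback order above can be sketched as a small decision function. This is only an illustration: `AuditTarget`, `TargetInput`, and `resolveTarget` are hypothetical names and are not part of the skill or of oh-my-openagent.

```typescript
// Hypothetical types for illustration; the skill defines no such API.
type AuditTarget =
  | { kind: "explicit"; description: string }
  | { kind: "branch-diff"; changedFiles: string[] }
  | { kind: "working-tree" };

interface TargetInput {
  userTarget?: string;    // repository, diff range, PR, path list, ...
  changedFiles: string[]; // files changed against upstream or merge base
}

// Mirrors the fallback order: explicit target, then branch diff, then working tree.
function resolveTarget(input: TargetInput): AuditTarget {
  const stated = input.userTarget?.trim();
  if (stated) {
    return { kind: "explicit", description: stated };
  }
  if (input.changedFiles.length > 0) {
    return { kind: "branch-diff", changedFiles: input.changedFiles };
  }
  // No target and no diff: fall back to security-sensitive surfaces in the tree.
  return { kind: "working-tree" };
}
```

A blank or whitespace-only target is treated the same as no target, which is a choice made for this sketch rather than something the skill specifies.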
## Severity Standard
Use these references as the scoring frame:
- CWE for root-cause weakness classification: https://cwe.mitre.org/
- OWASP WSTG for test methodology: https://devguide.owasp.org/en/06-verification/01-guides/01-wstg/
- OWASP ASVS for control verification: https://owasp.org/www-project-application-security-verification-standard/
- CVSS v4.0 for exploitability and impact scoring: https://www.first.org/cvss/v4.0/specification-document
Rules:
- No severity without an attack path.
- No critical or high finding without concrete exploit preconditions and impact.
- Keep CWE category separate from severity.
- Prefer a small, reproducible PoC over theoretical language.
- Never run destructive exploits against real services or third-party systems.
- Use local fixtures, toy payloads, dry runs, or static proof when real execution would be unsafe.
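The first two rules lend themselves to a mechanical check before a finding is reported. The sketch below is one possible gate, assuming a hypothetical `Finding` shape; the field names and `checkSeverityRules` are illustrative and do not come from the skill. The severity labels follow the CVSS qualitative rating scale.

```typescript
// Hypothetical finding shape; field names are assumptions for illustration.
type Severity = "critical" | "high" | "medium" | "low" | "none";

interface Finding {
  title: string;
  cwe?: string;             // root-cause class, kept separate from severity
  attackPath?: string;
  preconditions?: string[];
  impact?: string;
  severity: Severity;
}

// Returns the rule violations for a finding; an empty array means it passes.
function checkSeverityRules(f: Finding): string[] {
  const problems: string[] = [];
  const hasPath = Boolean(f.attackPath?.trim());
  if (f.severity !== "none" && !hasPath) {
    problems.push("severity assigned without an attack path");
  }
  if (f.severity === "critical" || f.severity === "high") {
    if (!f.preconditions || f.preconditions.length === 0) {
      problems.push("critical/high without concrete exploit preconditions");
    }
    if (!f.impact?.trim()) {
      problems.push("critical/high without stated impact");
    }
  }
  return problems;
}
```

Note that `cwe` plays no part in the check, which reflects the rule that CWE category stays separate from severity.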
## Team Roster
Create one Team Mode run with these 5 members:
| Member | Kind | Category | Role |
|--------|------|----------|------|
| `surface-hunter` | category | `deep` | Map entry points, trust boundaries, and reachable attack surfaces. |
| `auth-data-hunter` | category | `ultrabrain` | Hunt auth, authorization, data isolation, injection, and secret handling flaws. |
| `runtime-supply-hunter` | category | `unspecified-high` | Hunt filesystem, subprocess, archive, dependency, hook, MCP, and config risks. |
| `poc-engineer-a` | category | `unspecified-high` | Build minimal PoCs for the strongest candidate findings. |
| `poc-engineer-b` | category | `deep` | Independently reproduce, falsify, or downgrade candidate findings. |
Call `team_create` with an inline spec:
```typescript
team_create({
  inline_spec: {
    name: "security-research",
    description: "Parallel exploitability-driven security research team.",
    members: [
      {
        name: "surface-hunter",
        kind: "category",
        category: "deep",
        prompt: "You map attack surface. Enumerate entry points, trust boundaries, attacker-controlled inputs, data sinks, privilege transitions, and sensitive assets. Return evidence with file paths and exact functions. Do not assign severity unless you can name an attack path."
      },
      {
        name: "auth-data-hunter",
        kind: "category",
        category: "ultrabrain",
        prompt: "You hunt auth, authorization, tenant/data isolation, injection, SSRF, credential exposure, and confused-deputy flaws. Reason from attacker capability to impact. Return only findings with concrete exploit preconditions, CWE candidates, and verification steps."
      },
      {
        name: "runtime-supply-hunter",
        kind: "category",
        category: "unspecified-high",
        prompt: "You hunt filesystem, subprocess, archive extraction, dependency, hook execution, MCP, config, and environment-variable risks. Check path traversal, command injection, unsafe downloads, permission boundaries, and supply-chain assumptions. Cite file paths and commands used."
      },
      {
        name: "poc-engineer-a",
        kind: "category",
        category: "unspecified-high",
        prompt: "You build minimal safe PoCs for candidate findings. Use toy inputs and local-only execution. Your job is to prove or disprove exploitability, not to broaden scope. Report exact reproduction steps and expected output."
      },
      {
        name: "poc-engineer-b",
        kind: "category",
        category: "deep",
        prompt: "You independently reproduce candidate findings and try to falsify them. Downgrade anything without a working path. If a PoC is unsafe to run, design a safe static or dry-run proof and explain the limit."
      }
    ]
  }
})
```
If a category is unavailable, retry once by replacing only that category with `unspecified-high`. Do not reduce the team below 5 members.
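The fallback rule can be sketched as a pure transformation of the member list. How `team_create` reports an unavailable category is not described here, so this sketch simply takes the category name as a parameter; `MemberSpec` and `withCategoryFallback` are hypothetical names that mirror the fields of the inline spec.

```typescript
// Hypothetical helper; MemberSpec mirrors the fields used in the inline spec.
interface MemberSpec {
  name: string;
  kind: "category";
  category: string;
  prompt: string;
}

const FALLBACK_CATEGORY = "unspecified-high";

// Swap only the unavailable category. Every other member is returned unchanged,
// so the roster keeps the same number of members.
function withCategoryFallback(members: MemberSpec[], unavailable: string): MemberSpec[] {
  return members.map((m) =>
    m.category === unavailable ? { ...m, category: FALLBACK_CATEGORY } : m
  );
}
```

Because the function maps rather than filters, it cannot shrink the team, which keeps the 5-member requirement intact by construction.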
## Workflow
### Phase 0: Scope and Baseline
Collect:
- Target scope and reason for audit.
- Branch, base ref, diff, and changed files if this is a change review.
- Security-sensitive directories and files if this is a full-repo audit.
- Existing tests and commands that exercise relevant surfaces.
- Any user-stated constraints, such as no network calls or no destructive tests.
Use `rg`, `git diff`, `git log`, LSP, and existing tests before assigning work.
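One way to keep the Phase 0 inputs together is a single baseline record with a completeness check before work is assigned. The sketch below assumes a hypothetical `AuditBaseline` shape; none of these names are defined by the skill.

```typescript
// Hypothetical record of the Phase 0 inputs; all names are illustrative.
interface AuditBaseline {
  scope: string;
  reason: string;
  mode: "change-review" | "full-repo";
  branch?: string;
  baseRef?: string;
  changedFiles?: string[];
  sensitivePaths?: string[];
  testCommands: string[];
  constraints: string[]; // e.g. "no network calls", "no destructive tests"
}

// Lists the items still missing for the chosen audit mode.
function missingBaselineItems(b: AuditBaseline): string[] {
  const missing: string[] = [];
  if (!b.scope.trim()) missing.push("scope");
  if (b.mode === "change-review") {
    if (!b.branch) missing.push("branch");
    if (!b.baseRef) missing.push("baseRef");
    if (!b.changedFiles || b.changedFiles.length === 0) missing.push("changedFiles");
  } else if (!b.sensitivePaths || b.sensitivePaths.length === 0) {
    missing.push("sensitivePaths");
  }
  return missing;
}
```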
### Phase 1: Independent Hunter Pass
Send one prompt to the 3 hunters:
```text
Audit target:
{target summary}
Context:
{diff, file list, security-sensitive paths, known constraints}
Task:
Find candidate vulnerabilities in your assigned role. For each candidate include:
- title
- affected file/function
- attacker capability
- attack path
- impact
- CWE candidate
- exact evidence
- safe verification idea
Reject generic hardening advice. Return only candidates with a plausible path.
```
Wait for all hunters.
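Since all three hunters receive the same prompt, the template can be filled once and reused. A minimal sketch follows, assuming a hypothetical `buildHunterPrompt` helper; it only assembles text and does not call any team-mode tool.

```typescript
// Hypothetical helper that fills the hunter template; not a team-mode API.
const CANDIDATE_FIELDS: string[] = [
  "title",
  "affected file/function",
  "attacker capability",
  "attack path",
  "impact",
  "CWE candidate",
  "exact evidence",
  "safe verification idea",
];

function buildHunterPrompt(targetSummary: string, context: string): string {
  return [
    "Audit target:",
    targetSummary,
    "Context:",
    context,
    "Task:",
    "Find candidate vulnerabilities in your assigned role. For each candidate include:",
    ...CANDIDATE_FIELDS.map((field) => `- ${field}`),
    "Reject generic hardening advice. Return only candidates with a plausible path.",
  ].join("\n");
}
```

Keeping the field list in one constant means the same names can later be used to check that each returned candidate is complete.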
### Phase 2: PoC Pass
Deduplicate hunter candidates. Send the strongest candidates to both PoC engineers.
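The skill does not say how candidates should be deduplicated. One plausible approach, shown here purely as an assumption, is to treat two candidates as the same when they share an affected location and CWE candidate. `Candidate`, `MergedCandidate`, and `dedupeCandidates` are hypothetical names.

```typescript
// Hypothetical candidate shape and dedup key; the skill prescribes neither.
interface Candidate {
  title: string;
  location: string;   // affected file/function
  cwe: string;        // CWE candidate, e.g. "CWE-22"
  reportedBy: string; // hunter name
}

interface MergedCandidate extends Candidate {
  reporters: string[];
}

// Keep the first report for each (location, CWE) pair and record every
// hunter that raised it, preserving first-seen order.
function dedupeCandidates(all: Candidate[]): MergedCandidate[] {
  const byKey = new Map<string, MergedCandidate>();
  for (const c of all) {
    const key = `${c.location.trim()}::${c.cwe.trim().toUpperCase()}`;
    const existing = byKey.get(key);
    if (existing) {
      if (existing.reporters.indexOf(c.reportedBy) === -1) {
        existing.reporters.push(c.reportedBy);
      }
    } else {
      byKey.set(key, { ...c, reporters: [c.reportedBy] });
    }
  }
  return Array.from(byKey.values());
}
```

A candidate raised independently by more than one hunter may deserve priority when choosing which findings to send to the PoC engineers, though that ranking choice is left to the orchestrator.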
Each PoC engineer must return:
- Reproduced, falsified, or unsafe-to-run.
- Exact commands, fixtures, or static proof.
- Observed output or reason it fails.
- Severity recommendation using exploitability and impact.
- Downgrade rationale for anything not reproduced.