red-team
This skill performs adversarial validation: it adopts constrained personas and attempts real tasks against documentation and skills before deployment, surfacing breaks and gaps that expert review alone might miss. Use it to probe docs, skills, plans, or claims for weaknesses, unstated assumptions, and practical usability issues by simulating restricted-context agent interactions rather than relying on expert judgment.
git clone --depth 1 https://github.com/boshu2/agentops /tmp/red-team && cp -r /tmp/red-team/images/gemini/skills/red-team ~/.claude/skills/red-team
# /red-team — Persona-Based Adversarial Validation
> **Quick Ref:** Adopt constrained personas. Attempt real tasks. Report what breaks. Unlike `/council` (expert judgment) or `/vibe` (code quality), red-team tests whether things actually WORK when someone TRIES to use them.
**YOU MUST EXECUTE THIS WORKFLOW. Do not just describe it.**
## Quick Start
```bash
/red-team docs/ # probe docs with default personas
/red-team skills/council/ # probe a skill's SKILL.md
/red-team --surface=docs README.md # explicit surface type
/red-team --personas-file=.agents/red-team/p.yaml # custom personas
/red-team --deep skills/rpi/ # full council (not --quick) for consolidation
```
---
## How It Works
```
Council: expert judges → review artifact → debate → verdict
Red-team: constrained agents → attempt task → collect findings → council consolidates
```
Council judges SEE everything and JUDGE quality. Red-team agents have RESTRICTED context and ATTEMPT tasks. Council is reused only for the consolidation/verdict phase.
---
## Flags
| Flag | Default | Description |
|------|---------|-------------|
| `--surface=<type>` | auto-detect | Force surface type: `docs` or `skills` |
| `--personas-file=<path>` | built-in | Custom persona definitions (YAML) |
| `--scenarios-file=<path>` | auto-generate | Custom scenario definitions (YAML) |
| `--deep` | off | Use full council (not --quick) for consolidation |
| `--persona=<name>` | all | Run only a specific persona |
| `--target=<path>` | `.` | Target path to probe |
---
## Execution Steps
### Step 0: Setup
Detect target surface type and create output directory.
```bash
mkdir -p .agents/red-team
```
**Surface detection:**
- Path contains `skills/` and a `SKILL.md` exists → `skills` surface
- Path contains `docs/` or target is `README.md` → `docs` surface
- Explicit `--surface=<type>` overrides auto-detection
**Validate surface:** v1 supports `docs` and `skills` only. If another surface is detected, output:
```
Surface '<type>' is not supported in v1. Supported: docs, skills.
```
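The detection and validation rules above can be sketched in a few lines. This is a minimal illustration, not the skill's actual implementation: the function name `detect_surface`, the `has_skill_md` parameter, and the `unknown` fallback for paths that match no rule are all assumptions made for the example.

```python
from pathlib import PurePosixPath
from typing import Optional

SUPPORTED_SURFACES = ("docs", "skills")


def detect_surface(target: str, has_skill_md: bool, override: Optional[str] = None) -> str:
    """Return the surface type for a target, mirroring the rules listed above."""
    if override is not None:
        # An explicit --surface=<type> wins over auto-detection.
        surface = override
    else:
        path = PurePosixPath(target)
        if "skills" in path.parts and has_skill_md:
            surface = "skills"
        elif "docs" in path.parts or path.name == "README.md":
            surface = "docs"
        else:
            # Assumption: the source does not say what to do when no rule matches.
            surface = "unknown"
    if surface not in SUPPORTED_SURFACES:
        raise ValueError(
            f"Surface '{surface}' is not supported in v1. Supported: docs, skills."
        )
    return surface
```

Under these assumptions, `detect_surface("skills/council/", has_skill_md=True)` yields `skills`, while a path such as `src/` raises the v1 error message.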
### Step 1: Load Personas
**Priority order:**
1. `--personas-file=<path>` → load custom personas from YAML
2. `.agents/red-team/personas/*.yaml` → load project-specific personas
3. Built-in defaults from council `red-team` preset (see [references/persona-format.md](references/persona-format.md))
**For docs surface:** Default personas: `panicked-sre`, `junior-engineer`, `first-time-consumer`
**For skills surface:** Default persona: `zero-context-agent`
If `--persona=<name>` is set, filter to only that persona.
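As a rough sketch of the priority order and the `--persona` filter, the helpers below resolve which persona files to load and then narrow the persona list. The function names are hypothetical, and an empty result from `persona_sources` is assumed to mean "fall back to the built-in defaults".

```python
from pathlib import Path
from typing import List, Optional

# Built-in default persona names per surface, as listed above.
DEFAULT_PERSONAS = {
    "docs": ["panicked-sre", "junior-engineer", "first-time-consumer"],
    "skills": ["zero-context-agent"],
}


def persona_sources(root: Path, personas_file: Optional[str] = None) -> List[Path]:
    """Return the YAML files to load; an empty list means 'use built-ins'."""
    if personas_file:
        # Priority 1: an explicit --personas-file=<path>.
        return [Path(personas_file)]
    project_dir = root / ".agents" / "red-team" / "personas"
    if not project_dir.is_dir():
        return []
    # Priority 2: project-specific personas, in a stable order.
    return sorted(project_dir.glob("*.yaml"))


def select_personas(names: List[str], only: Optional[str] = None) -> List[str]:
    """Apply the --persona=<name> filter; with no filter, keep every persona."""
    if only is None:
        return list(names)
    return [name for name in names if name == only]
```

Parsing the YAML itself is left out on purpose, since the persona file format is defined in `references/persona-format.md` rather than here.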
### Step 2: Build Context-Restricted Prompts
For each persona, construct a context-restricted agent prompt. This is the critical step that differentiates red-team from council — the agent operates under enforced knowledge constraints.
**Prompt template:**
```
You are {PERSONA_NAME}: {ROLE}.
CONTEXT: {CONTEXT_DESCRIPTION}
MANDATORY CONSTRAINTS — you MUST follow these:
- You can ONLY read files in: {ALLOWED_PATHS}
- You do NOT know: {EXCLUDED_KNOWLEDGE}
- You CANNOT: {CANNOT_LIST}
- You MUST navigate from the entry point a real {ROLE} would use
- Do NOT use Grep to search the entire codebase — only read files
you would naturally discover by following links and references
YOUR TASK: Complete the following scenarios in order.
{SCENARIO_LIST}
For EACH scenario, record:
1. Steps taken (file read, link followed, search attempted)
2. Path taken: entry_point → file1:line → file2:line → ...
3. Verdict: PASS (completed), FAIL (blocked), PARTIAL (completed with friction)
4. Friction points (even on PASS — what slowed you down?)
5. Evidence: exact file:line references
6. Severity: critical (blocks task), significant (impedes task), minor (friction)
Write your complete findings report to: .agents/red-team/probe-{PERSONA_NAME}.md
Use this format for each finding:
## RT-NNN: <title>
- **Scenario:** <which scenario>
- **Verdict:** PASS | FAIL | PARTIAL
- **Severity:** critical | significant | minor
- **Path taken:** <navigation path>
- **Finding:** <what happened>
- **Evidence:** <file:line>
- **Recommendation:** <actionable fix>
```
**Context restriction enforcement:**
The persona's `constraints.allowed_paths` controls which files the agent can read. The `constraints.excluded_knowledge` tells the agent what concepts to treat as unknown. The `constraints.cannot` lists forbidden actions.
These constraints are enforced via the agent prompt — the agent is instructed to behave as if it only has access to the allowed paths and lacks the excluded knowledge. While not technically sandboxed, this produces meaningful usability findings because the agent genuinely navigates from the entry point rather than using expert knowledge to skip ahead.
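To make the substitution concrete, here is a minimal sketch that fills an abbreviated version of the template from a persona dictionary. The `constraints.*` keys come from the text above; the top-level keys `name`, `role`, and `context`, the function name `build_prompt`, and the shortened template are assumptions for illustration only.

```python
from typing import Dict, List

# Abbreviated form of the prompt template above (recording instructions omitted).
PROMPT_HEAD = """You are {PERSONA_NAME}: {ROLE}.
CONTEXT: {CONTEXT_DESCRIPTION}
MANDATORY CONSTRAINTS (you MUST follow these):
- You can ONLY read files in: {ALLOWED_PATHS}
- You do NOT know: {EXCLUDED_KNOWLEDGE}
- You CANNOT: {CANNOT_LIST}
YOUR TASK: Complete the following scenarios in order.
{SCENARIO_LIST}"""


def build_prompt(persona: Dict, scenarios: List[str]) -> str:
    """Fill the template placeholders from a persona and its scenario list."""
    constraints = persona["constraints"]
    return PROMPT_HEAD.format(
        PERSONA_NAME=persona["name"],        # assumed key
        ROLE=persona["role"],                # assumed key
        CONTEXT_DESCRIPTION=persona["context"],  # assumed key
        ALLOWED_PATHS=", ".join(constraints["allowed_paths"]),
        EXCLUDED_KNOWLEDGE=", ".join(constraints["excluded_knowledge"]),
        CANNOT_LIST=", ".join(constraints["cannot"]),
        SCENARIO_LIST="\n".join(f"{i}. {s}" for i, s in enumerate(scenarios, 1)),
    )
```

Because the restriction lives entirely in the generated text, anything missing from the persona's constraints is simply not communicated to the agent, which is why the constraint lists deserve careful review.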
### Step 3: Load Scenarios
**Priority order:**
1. `--scenarios-file=<path>` → load custom scenarios
2. `.agents/red-team/scenarios/*.yaml` → load project-specific scenarios
3. Auto-generate from target surface
**Auto-generation rules** per surface type — see [references/scenario-format.md](references/scenario-format.md):
- **Docs:** 4-6 scenarios per persona probing discoverability, completeness, copy-paste readiness, jargon
- **Skills:** 3-5 scenarios per persona probing step executability, examples, error handling, flags
### Step 4: Execute Probes
Spawn one agent per persona. Each agent runs all scenarios for its persona sequentially.
```
Agent(
  description="Red-team probe: {persona_name}",
  prompt=<context-restricted prompt from Step 2>,
  subagent_type="general",
  run_in_background=true
)
```
**Spawn all persona agents in parallel** (they work on independent probes).
**Wait for all agents to complete.** Each writes findings to `.agents/red-team/probe-{persona_name}.md`.
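The spawn-in-parallel-then-wait pattern can be illustrated outside the agent runtime with a thread pool. This is only an analogy under stated assumptions: `spawn` is a hypothetical stand-in for the `Agent(...)` call, and it is assumed to return the path of the report it wrote.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


def run_probes(prompts: Dict[str, str], spawn: Callable[[str, str], str]) -> Dict[str, str]:
    """Run one probe per persona concurrently and block until all have finished.

    `prompts` maps persona name to its context-restricted prompt.
    `spawn(persona_name, prompt)` is a hypothetical stand-in for the Agent call.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(prompts))) as pool:
        # Submit every probe first so they all start before any result is awaited.
        futures = {name: pool.submit(spawn, name, prompt) for name, prompt in prompts.items()}
        # result() blocks, so this returns only once every probe is done.
        return {name: future.result() for name, future in futures.items()}
```

Scenarios stay sequential inside each probe; only the personas run side by side, which matches the independence claim above.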
### Step 5: Collect and Normalize Findings
Read each probe report from `.agents/red-team/probe-{persona_name}.md`.
Parse findings into canonical `schem…`
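A minimal parsing sketch for the `RT-NNN` finding format from Step 2 follows. The canonical schema is not shown here, so the dictionary keys (`id`, `title`, and the lower-cased field labels) are an assumption derived from that finding format, and `parse_findings` is a hypothetical name.

```python
import re
from typing import Dict, List, Optional

HEADING = re.compile(r"^## (RT-\d+): (.+)$")
FIELD = re.compile(r"^- \*\*(.+?):\*\* (.*)$")


def parse_findings(report: str) -> List[Dict[str, str]]:
    """Turn a probe report into one dictionary per finding."""
    findings: List[Dict[str, str]] = []
    current: Optional[Dict[str, str]] = None
    for raw_line in report.splitlines():
        line = raw_line.strip()
        heading = HEADING.match(line)
        if heading:
            # A new "## RT-NNN: <title>" heading starts a new finding.
            current = {"id": heading.group(1), "title": heading.group(2)}
            findings.append(current)
            continue
        field = FIELD.match(line)
        if field and current is not None:
            # "Path taken" becomes "path_taken", "Verdict" becomes "verdict", etc.
            key = field.group(1).lower().replace(" ", "_")
            current[key] = field.group(2)
    return findings
```

Lines that match neither pattern are ignored, so free-form notes in a report do not break parsing; a stricter version might instead flag findings that lack a verdict or severity.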