Skill · 389 repo stars · updated 19d ago

adversarial-roleplay

The adversarial-roleplay Claude Code skill tests an artifact for vulnerabilities by constructing detailed hostile personas, each with a specific motivation and expertise domain, then attacking the artifact from each persona's perspective and tracking which attack vectors succeed. Use it for thorough security assessments or red-team evaluations that need to identify weaknesses on which multiple adversarial approaches and threat models converge.

Repository: de-anthropocentric-research-engine

Install in Claude Code


git clone --depth 1 https://github.com/yogsoth-ai/de-anthropocentric-research-engine /tmp/adversarial-roleplay && cp -r /tmp/adversarial-roleplay/skills/adversarial-roleplay ~/.claude/skills/adversarial-roleplay

Then open a new Claude Code session; the skill loads automatically.
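
Before opening a new session, it can help to confirm that the copy step put the skill where the install command targets. The check below is a minimal sketch: the helper name `skill_installed` is hypothetical, and it assumes the copied directory contains a `SKILL.md` at its top level, which this page suggests but does not state explicitly.

```python
from pathlib import Path


def skill_installed(
    skills_root: Path = Path.home() / ".claude" / "skills",
    name: str = "adversarial-roleplay",
) -> bool:
    """Return True if <skills_root>/<name>/SKILL.md exists as a file.

    Assumption: the skill directory holds SKILL.md at its top level.
    """
    return (skills_root / name / "SKILL.md").is_file()


if __name__ == "__main__":
    status = "found" if skill_installed() else "missing"
    print(f"adversarial-roleplay skill: {status}")
```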

Definition

SKILL.md

# Adversarial Roleplay Tactic

Deploy constructed hostile personas to attack the artifact from distinct motivational frames.

## Orchestration

1. **persona-construction** builds a detailed adversary profile:
   - Background and expertise domain
   - Motivation for attacking (career incentive, resource competition, ideology)
   - Known blind spots and biases of this persona type
   - Preferred attack patterns
2. **attack-vector-generation** generates vectors specific to the persona's expertise and motivation
3. **probe-execution** executes attacks while maintaining persona consistency
4. Successful attack paths are recorded with persona attribution
5. The process repeats for each persona (budget-limited)
6. **finding-aggregation** cross-references findings across personas for convergent vulnerabilities
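
The six steps can be read as a simple loop. The sketch below is illustrative only: the four subagents are passed in as plain callables, and every name in it (`Persona`, `Finding`, `run_adversarial_roleplay`, `probes_per_persona`) is hypothetical rather than part of the skill. It also assumes a probe returns a description of the weakness on success and `None` on failure.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Optional


@dataclass
class Persona:
    # Fields mirror the profile items listed in step 1.
    name: str
    expertise: str
    motivation: str
    blind_spots: list = field(default_factory=list)
    attack_patterns: list = field(default_factory=list)


@dataclass
class Finding:
    persona: str   # persona attribution (step 4)
    vector: str    # the attack vector that succeeded
    weakness: str  # what the probe exposed


def run_adversarial_roleplay(
    artifact: Any,
    persona_briefs: list,
    build_persona: Callable[[Any], Persona],
    generate_vectors: Callable[[Persona, Any], list],
    execute_probe: Callable[[Persona, str, Any], Optional[str]],
    aggregate: Callable[[list], Any],
    probes_per_persona: int = 3,  # assumed form of the probe budget
) -> Any:
    findings = []
    for brief in persona_briefs:                        # step 5: one pass per persona
        persona = build_persona(brief)                  # step 1
        vectors = generate_vectors(persona, artifact)   # step 2
        for vector in vectors[:probes_per_persona]:     # step 3, capped by budget
            weakness = execute_probe(persona, vector, artifact)
            if weakness is not None:                    # step 4: record successes only
                findings.append(Finding(persona.name, vector, weakness))
    return aggregate(findings)                          # step 6: cross-persona pass
```

Passing the subagents in as callables keeps the loop independent of how each one is actually dispatched.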

## Subagents Dispatched

- persona-construction (1 call per persona)
- attack-vector-generation (1 call per persona)
- probe-execution (N calls per persona, budget-limited)
- finding-aggregation (1 call at end, cross-persona)
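
These per-persona counts give an upper bound on subagent calls for a run. The helper below is a minimal sketch of that arithmetic; the function name and the reading of N as a fixed per-persona probe cap are assumptions, and a real run may use fewer calls if it terminates early.

```python
def max_subagent_calls(personas: int, probes_per_persona: int) -> int:
    """Upper bound on subagent calls for one run.

    Per persona: 1 persona-construction + 1 attack-vector-generation
    + up to N probe-execution calls. One finding-aggregation call
    is added once at the end.
    """
    per_persona = 1 + 1 + probes_per_persona
    return personas * per_persona + 1


# Example: 3 personas with at most 4 probes each -> 3 * (1 + 1 + 4) + 1 = 19
print(max_subagent_calls(3, 4))
```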

## Termination Conditions

- All budgeted personas deployed and exhausted
- Convergent vulnerability found by 2+ personas (high-confidence finding)
- Single persona finds critical vulnerability (early report)
- Budget exhausted (report per-persona findings separately)
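
One way to evaluate these conditions after each probe is sketched below. The skill does not say which condition wins when several hold at once, so the priority order here (critical, then convergent, then budget, then personas) is an assumption, as are the function name, the `(persona, weakness, severity)` tuple shape, and the `"critical"` severity label.

```python
from collections import defaultdict
from typing import Optional


def termination_reason(
    findings: list,          # (persona, weakness, severity) tuples; hypothetical shape
    personas_remaining: int,
    budget_remaining: int,
) -> Optional[str]:
    """Return the name of the first termination condition met, or None."""
    # A single persona finding a critical vulnerability triggers an early report.
    if any(severity == "critical" for _, _, severity in findings):
        return "critical-early-report"

    # The same weakness found by 2+ distinct personas is a convergent finding.
    personas_by_weakness = defaultdict(set)
    for persona, weakness, _ in findings:
        personas_by_weakness[weakness].add(persona)
    if any(len(found_by) >= 2 for found_by in personas_by_weakness.values()):
        return "convergent-high-confidence"

    # Budget ran out: findings are reported per persona.
    if budget_remaining <= 0:
        return "budget-exhausted"

    # Every budgeted persona has been deployed and has no probes left.
    if personas_remaining == 0:
        return "all-personas-exhausted"

    return None
```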

<!-- BEGIN available-tables (generated) -->

## Available SOPs

Optional, no fixed order; the final leaf is always an SOP.

| SOP | When to use |
| --- | --- |
| attack-vector-generation | Generate specific attack strategies for a given threat surface, producing concrete probes that can be executed. |
| finding-aggregation | Aggregate, deduplicate, and classify findings from multiple probes into a coherent vulnerability report. |
| persona-construction | Build a detailed adversarial persona with background, motivation, expertise, blind spots, and preferred attack patterns. |
| probe-execution | Execute a single attack probe against an artifact, record the result with evidence and severity classification. |
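
To make the probe-execution and finding-aggregation rows concrete, the sketch below shows one possible record shape and an aggregation pass that deduplicates by weakness. Everything in it is an assumption for illustration: the `ProbeResult` fields, the four-level severity scale, and the choice to rank convergent findings first are not specified by the SOPs.

```python
from dataclasses import dataclass

# Assumed severity scale; the SOP only says findings carry a severity class.
SEVERITY_ORDER = {"low": 0, "medium": 1, "high": 2, "critical": 3}


@dataclass(frozen=True)
class ProbeResult:
    persona: str
    weakness: str
    evidence: str
    severity: str


def aggregate_findings(results: list) -> list:
    """Merge results that share a weakness, keeping the highest severity."""
    merged = {}
    for r in results:
        entry = merged.setdefault(
            r.weakness,
            {"weakness": r.weakness, "personas": set(), "evidence": [], "severity": r.severity},
        )
        entry["personas"].add(r.persona)
        if r.evidence not in entry["evidence"]:  # drop duplicate evidence
            entry["evidence"].append(r.evidence)
        if SEVERITY_ORDER[r.severity] > SEVERITY_ORDER[entry["severity"]]:
            entry["severity"] = r.severity

    report = list(merged.values())
    for entry in report:
        # Found by 2+ personas counts as convergent.
        entry["convergent"] = len(entry["personas"]) >= 2
    # Convergent findings first, then by descending severity.
    report.sort(key=lambda e: (not e["convergent"], -SEVERITY_ORDER[e["severity"]]))
    return report
```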

<!-- END available-tables (generated) -->

From the same repository

formated-result (Skill)

Experiment-specific - summarize the DARE executor's research design into a clean research_result report, forced to write back into the spec file produced by formated-specs.

formated-specs (Skill)

Experiment-specific - replaces writing-specs, emits DARE's 4-layer call plan as a clean research_graph schema. Last step forces load formated-result.

injection-fidelity (Skill)

loss-1 judge - read a sample's full dialogue and decide whether the user simulator semantically enacted its Policy Card. check-blind.

ladder-quality-order (Skill)

loss-2 judge - pairwise quality comparison across the n rungs within one topic; decide monotonicity and endpoint separation. check-blind, D1-D5 only.

abductive-hypothesis-generation (Skill)

Strategy: Inference to the best explanation in the face of anomalies

ablation-brainstorm (Skill)

Remove components one by one, observe system changes to reveal hidden

ablation-component-mapping (Skill)

Map system architecture to ablatable units for ablation studies

ablation-design (Skill)

Design ablation studies to isolate component contributions in ML systems