challenging
Cross-context adversarial review for deliverables before shipping. Use when producing blog posts, technical recommendations, analysis briefs, code, or any artifact where accuracy matters more than speed. Triggers on "challenge this", "review before shipping", "adversarial pass", "stress test this".
git clone --depth 1 https://github.com/oaustegard/claude-skills /tmp/challenging && cp -r /tmp/challenging/challenging ~/.claude/skills/challengingSKILL.md
# Challenging — Adversarial Review
Adversarial review before shipping. Three paths, each with distinct trade-offs:
- **Subagent path** (Claude Code, primary). Native sub-Claude via the Task tool — zero API keys, fresh context window, same model. Best when available.
- **External API path** (claude.ai, Codex, headless). Gemini (cross-model + cross-context) or Anthropic API (cross-context). Costs an incremental API call but gives genuine outside perspective.
- **Self path** (any environment). The caller assistant inhabits the adversary persona in a dedicated response. Zero cost, retains full subject-matter context from the conversation. Weaker at catching same-session confabulations than fresh-context adversaries, stronger at catching local-convention and factual errors the artifact glosses over. Not a strict downgrade — a different failure-mode profile.
The `adversary='auto'` resolution (default) picks gemini → claude → self based on available credentials. Callers in Claude Code still use `prepare()` + Task tool explicitly (subagent is strictly better than self in that environment, and auto-detection of Claude Code is brittle).
Inspired by VDD (dollspace.gay) and Grainulation's anti-rationalization patterns. The `drill` helper adopts the 5 Whys pattern from Tim Kellogg's [open-strix writeup](https://timkellogg.me/blog/2026/04/14/forgetting). The self-path persona-inhabitation move is kin to generative-thinking's inversion — commit to the mode before evaluating.
## Profiles
Pick the profile matching your artifact. **Read only the profile you need** — each is self-contained with persona, anti-rationalization table, evaluation criteria, and adversary system prompt.
| Profile | Use For | Iteration strategy | File |
|---------|---------|--------------------|------|
| `prose` | Blog posts, essays, articles — generic prose competence | parallel replay | `references/prose.md` |
| `prose-register` | Prose with a *named voice signature* — fidelity check | parallel replay | `references/prose-register.md` |
| `analysis` | Research briefs, comparisons, synthesis | parallel replay | `references/analysis.md` |
| `code` | Scripts, implementations, PRs | parallel replay | `references/code.md` |
| `recommendation` | Technical decisions, architecture choices | parallel replay | `references/recommendation.md` |
| `drill` | 5 Whys on one finding from a review | sequential deepen | `references/drill.md` |
One engine, one surface, two iteration strategies. Review profiles iterate in *parallel replay* — each pass independent, novelty tracked for confabulation. Drill iterates in *sequential deepen* — each pass takes the chain so far and produces one more why-level until bedrock or max depth, followed by a synthesis pass that extracts root causes.
`prose` and `prose-register` are siblings, not redundant. `prose` evaluates generic prose competence (claims, logic, structure, performed insight) and is explicitly told not to comment on style. `prose-register` evaluates fidelity to a named voice signature passed via `voice=...` (positive markers + anti-patterns) and ignores generic competence. Run both passes on voiced prose — they catch different failure modes.
## Usage — Claude Code (subagent path, primary)
Two-step protocol: a Python helper builds the prompt, you spawn a subagent via the Task tool, then a parser turns its response into structured findings.
```python
import sys
sys.path.insert(0, '/mnt/skills/user/challenging/scripts')
from challenger import prepare, parse_response
job = prepare(
artifact=open('/home/claude/draft.md').read(),
profile='prose',
context='Blog post about RAG scaling laws',
)
```
Then invoke the Task tool — `subagent_type='general-purpose'`, `prompt=job['prompt']`, `description='Adversarial review (prose)'`. The subagent runs in a fresh context, applies the persona, and returns a JSON message. Pass that message text to the parser:
```python
result = parse_response(subagent_text)
print(result['verdict']) # SHIP | REVISE | RETHINK
print(result['findings']) # List of specific issues
print(result['strengths']) # What to preserve
```
**Why subagents (not the API):** no key, no network dependency, fresh context, and the same Claude that's reviewing your work is reviewing it again with no prior bias — but in a clean window. For cross-model diversity (genuinely different blind spots), use Gemini below.
### Voiced prose — `prose-register` (subagent path)
For prose written in a named voice, `prose` will miss register-specific failure modes by design (its persona is instructed not to comment on style). Use the `prose-register` profile with a `voice=...` signature instead — or in addition, since the two profiles target orthogonal failure modes.
```python
voice = """
Corvid voice signature.
Positive markers: short sentences when certainty is high, dry observations,
lead with the answer then context, no throat-clearing.
Anti-patterns: heroic-narrator framing, drama-line-breaks ("Then the part that
almost killed it." floating alone), cliche tells ("shot itself in the foot",
"footgun"), time-scale inflation ("a month ago" when it was yesterday),
performed-significance setups ("What's interesting about this is..."),
RTFM-performed-as-revelation, precious sentimentality in closings.
"""
job = prepare(
artifact=open('/home/claude/draft.md').read(),
profile='prose-register',
context='Blog post for muninn.austegard.com',
voice=voice,
)
# Task tool → parse_response() as usual.
```
`voice` is required for `prose-register` and rejected for every other profile — pass the signature once, fail loud if it leaks to the wrong call. The richer the signature (with both positive markers and explicit anti-patterns), the more precise the review. A thin signature returns `verdict: RETHINK` with a finding describing what to sharpen rather than a confident review against an ambiguous spec.
### Drill — 5 Whys on a systemic finding (subagent path)
Drill uses the same `prGitHub repository access in containerized environments using REST API and credential detection. Use when git clone fails, or when accessing private repos/writing files via API.
Securely manages API credentials for multiple providers (Anthropic Claude, Google Gemini, GitHub). Use when skills need to access stored API keys for external service invocations.
Guidance for asking clarifying questions when user requests are ambiguous, have multiple valid approaches, or require critical decisions. Use when implementation choices exist that could significantly affect outcomes.
>-
>-
Browse Bluesky content via API and firehose - search posts, fetch user activity, sample trending topics, read feeds and lists, analyze and categorize accounts. Supports authenticated access for personalized feeds. Use for Bluesky research, user monitoring, trend analysis, feed reading, firehose sampling, account categorization.
Generate progressive disclosure indexes for GitHub repositories to use as Claude project knowledge. Use when setting up projects referencing external documentation, creating searchable indexes of technical blogs or knowledge bases, combining multiple repos into one index, or when user mentions "index", "github repo", "project knowledge", or "documentation reference".
Analyze and categorize Bluesky accounts by topic using keyword extraction. Use when users mention Bluesky account analysis, following/follower lists, topic discovery, account curation, or network analysis.