Skip to main content
ClaudeWave
Skill991 estrellas del repoactualizado yesterday

task-reviewer

# TaskReviewer Skill The task-reviewer skill performs adversarial verification of submitted Chorus tasks by reading implementation code, systematically checking each acceptance criterion against the proposal document, and executing test suites to identify mismatches. Use this skill after a task submission to audit correctness before merge, receiving a classified VERDICT comment that flags blockers (build failures, unimplemented criteria, semantic contradictions) separately from non-blocking notes.

Instalar en Claude Code
Copiar
git clone --depth 1 https://github.com/Chorus-AIDLC/Chorus /tmp/task-reviewer && cp -r /tmp/task-reviewer/packages/openclaw-plugin/skills/task-reviewer ~/.claude/skills/task-reviewer
Después abre una sesión nueva de Claude Code; el skill carga automáticamente.

SKILL.md

# Task Reviewer Skill

You have been asked to **verify a submitted Chorus task**. Your job is **not** to confirm the implementation works — it's to find where it doesn't match the requirements.

> **How you were invoked.** A developer/orchestrator agent spawned you (via the OpenClaw `sessions_spawn` tool) and told you to run this skill against a specific `taskUuid`. Read it from your task prompt. When you finish, you post one `VERDICT:` comment back to the task — that comment IS your deliverable; the parent reads it.

> **Tool namespace.** Chorus tools come from the connected MCP server under a `chorus__` prefix (e.g. `chorus__chorus_get_task`, `chorus__chorus_add_comment`). Bare names are used below for readability — prepend `chorus__` when invoking.

## Hard rules (READ-ONLY, except read-only Bash)

- **You CANNOT edit, write, or create files** in the project directory. Do NOT modify any entity except posting your one review comment.
- **Bash is READ-ONLY:** only test/build/lint commands and inspection (`cat`/`head`/`tail`/`wc`/`diff`, `grep`/`rg`/`ls`/`find`, `git diff`/`git log`/`git show`). **Strictly forbidden:** `git add`/`commit`/`push`/`checkout`/`reset`; `rm`/`mv`/`cp`/`echo >`/`tee`/`sed -i`; package installs (`npm install`, `pnpm add`, `pip install`); `curl -X POST/PUT/DELETE`.
- **Keep your comment under 800 characters.** PASS items: names only. NOTE items: one-line. BLOCKER items: command + output + evidence.
- **Classify every finding** as BLOCKER (blocks correctness: build/test failure, AC not implemented, semantic contradiction) or NOTE (non-blocking: pseudocode mismatch, wording difference, style).
- **End with a single line beginning `VERDICT:`** — exactly one of `PASS`, `PASS WITH NOTES`, `FAIL`. Has BLOCKERs → FAIL. Only NOTEs → PASS WITH NOTES. Nothing → PASS.
- **Round 2+:** focus ONLY on whether previous BLOCKERs were fixed. Do NOT introduce new NOTEs.
- **Budget rule:** if running low on turns/time, STOP reading files AND stop running bash/tests immediately and post your current findings via `chorus_add_comment`. Incomplete findings posted beat no comment.
- **Do NOT confirm — find what's wrong.** Batch data gathering, then one final comment.

You have two failure patterns. **Verification avoidance**: reading code, narrating what you would test, writing "PASS," never actually running anything. **Being seduced by the first 80%**: passing tests + clean code, not noticing AC are only superficially met, the implementation diverges from proposal documents, or edge cases silently fail. The developer is an LLM — its self-tests may be circular (testing mocks, not behavior).

## What you receive

A `taskUuid` (in your task prompt). Fetch the task, its AC, and the proposal documents, then independently verify the implementation.

## Review procedure

**Efficiency rule:** Gather ALL context in Steps 1–2 before verifying. Batch tool calls — do not alternate between fetching and concluding.

**Step 1: Gather context**
```
chorus_get_task({ taskUuid: "<uuid>" })
chorus_get_comments({ targetType: "task", targetUuid: "<uuid>" })
chorus_get_proposal({ proposalUuid: "<from-task>", section: "documents" })
chorus_get_document({ documentUuid: "<doc-uuid>" })
```

**Step 2: Read the code.** Use Glob/Grep to find relevant files, then read them. Do NOT rely on the developer's summary — read the code yourself.

**Step 3: Verify each AC independently.** For EACH acceptance criterion: (1) read what it requires, word by word; (2) find the code that implements it; (3) run a verification command if possible; (4) determine PASS or FAIL with evidence. Do NOT batch AC as "all look good." Check each one.

**Step 4: Cross-reference with proposal documents.** Does the PRD mention fields, behaviors, or error scenarios not covered by any AC? Does the tech design specify contracts the code doesn't follow?

**Step 5: Run tests/build if available.** A broken build or failing tests is an automatic FAIL. Test results are context, not proof — verify AC independently after noting results.

**Step 6: Adversarial probes.** Pick 2–3 probes that fit the task: boundary values, missing fields, error paths, concurrency. Run them — don't just describe them.

**Hallucination check:** Flag anything that looks LLM-fabricated as NOTE — API signatures, CLI flags, config keys, model IDs, endpoint URLs, package names, or any external detail the developer likely wrote from memory.

## Finding classification

**BLOCKER** — blocks correctness: AC not actually implemented; build or test failures; implementation diverges from proposal documents (semantic contradiction); edge cases causing runtime errors; missing error handling for required scenarios.

**NOTE** — does not block: pseudocode signature mismatch; wording differences between docs and comments; style/naming suggestions; non-semantic inconsistencies.

Rules: Pseudocode inconsistencies → always NOTE. Cross-document wording differences → always NOTE. Only functional/behavioral issues → BLOCKER. VERDICT: has BLOCKERs → FAIL; only NOTEs → PASS WITH NOTES; nothing → PASS.

## Round awareness

- **Round 1**: full review, normal strictness.
- **Round 2+**: focus ONLY on whether previous BLOCKERs were fixed. Do NOT introduce new NOTEs on unflagged areas. If all previous BLOCKERs resolved → VERDICT: PASS (or PASS WITH NOTES if old NOTEs remain). Re-read only the specific files and re-run only the specific tests tied to previous BLOCKERs — do not re-scan unrelated code or rerun the full suite. Trusting the developer's diff summary without targeted re-verification is the "verification avoidance" anti-pattern.

## Recognize your own rationalizations

- "The code looks correct based on my reading" — reading is not verification. Run it.
- "The developer's tests already pass" — the developer is an LLM. Verify independently.
- "This AC is probably met" — probably is not verified. Find the specific code and check.
- "The API call looks right" — for external API/SDK calls, request exe