improve
The /improve command autonomously scores a target against a rubric and iteratively enhances it across multiple dimensions. Use it when you have a rubric file at `.planning/rubrics/{target}.md` and want to systematically improve code quality by attacking weak scoring axes until reaching a plateau or threshold. Campaign mode with `--n` or `--continue` flags maintains persistent state for daemon-driven multi-loop execution.
Install in Claude Code
`git clone --depth 1 https://github.com/SethGammon/Citadel /tmp/improve && cp -r /tmp/improve/skills/improve ~/.claude/skills/improve`
Then start a new Claude Code session; the skill loads automatically.
Definition
SKILL.md
# /improve — Autonomous Quality Engine
## Orientation
**Use when:** Scoring a target against a rubric and iteratively improving it. Rubric required at `.planning/rubrics/{target}.md` (Phase 0 creates one if missing).
**Don't use when:** Refactoring without a rubric (use `/refactor`), one-time code review (use `/review`), or debugging a specific bug (use `/systematic-debugging`).
## Invocation
```
/improve {target} # Loop until plateau or all axes >= 8.0
/improve {target} --n=3 # Run exactly N loops (here 3), then stop
/improve {target} --axis={name} # Force-attack a specific axis (skips scoring)
/improve {target} --score-only # Score and report, no attack
/improve {target} --continue # Resume from campaign state (used by daemon)
/improve citadel # Targets Citadel itself
```
`target` is a slug that maps to `.planning/rubrics/{target}.md`.
If no rubric exists, run Phase 0 first.
## Campaign Mode
When invoked with `--n` or `--continue`, `/improve` operates in **campaign mode** and maintains a campaign file that the daemon can attach to.
**Campaign file:** `.planning/campaigns/improve-{target}.md`, created automatically on the first invocation with `--n` (full template: docs/QUALITY_LOOPS.md#campaign-file-template).
Frontmatter: `version`, `id` (`improve-{target}-{ISO-date-slug}`), `status: active`, `type: improve`, `target`, `total_loops` ({n} or `unlimited`), `completed_loops: 0`, `current_level` (from rubric frontmatter), `estimated_cost_per_loop: 12`, `started`.
Body: status and direction lines, a Loop History table (`Loop | Axis Attacked | Outcome | Score Movement`), and a Continuation State block (`next_loop`, `last_scorecard_log`, `last_outcome`, `phase_within_loop`, `level_up_triggered`).
### Campaign lifecycle
Update `phase_within_loop` at each phase: `scoring` → `selected-{axis}` → `attacking-{axis}` → `verifying` → `not-started`.
On loop complete: increment `completed_loops`, update `next_loop`/`last_scorecard_log`/`last_outcome`, append Loop History row.
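The lifecycle bookkeeping above can be sketched in Python. This is only an illustration under stated assumptions: the campaign state is modeled as a plain dict, the names `phase_sequence`, `complete_loop`, and `loop_history` are hypothetical, and `next_loop = completed_loops + 1` is an assumed convention that the source does not spell out.

```python
# Phase names as listed in the lifecycle; {axis} is filled in per loop.
PHASE_TEMPLATES = (
    "scoring",
    "selected-{axis}",
    "attacking-{axis}",
    "verifying",
    "not-started",
)


def phase_sequence(axis):
    """Return the phase_within_loop values one loop passes through, in order."""
    return [template.format(axis=axis) for template in PHASE_TEMPLATES]


def complete_loop(state, axis, outcome, score_movement, scorecard_log):
    """Return a new state dict reflecting one finished loop.

    The input dict is not mutated. The history row mirrors the
    Loop History columns: Loop, Axis Attacked, Outcome, Score Movement.
    """
    loop_number = state["completed_loops"] + 1
    row = (loop_number, axis, outcome, score_movement)
    return {
        **state,
        "completed_loops": loop_number,
        "next_loop": loop_number + 1,  # assumption: loops are numbered from 1
        "last_scorecard_log": scorecard_log,
        "last_outcome": outcome,
        "phase_within_loop": "not-started",
        "loop_history": state.get("loop_history", []) + [row],
    }
```

Returning a fresh dict rather than mutating in place keeps the previous state available for comparison if a loop is interrupted.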
### The `--continue` flag
1. Read `.planning/campaigns/improve-{target}.md` — error if missing or `status` not `active`
2. If `completed_loops >= total_loops`: mark completed, exit
3. If `phase_within_loop` is not `not-started`: restart current loop from Phase 1 (interrupted mid-loop)
4. Load `last_scorecard_log` for delta comparison, then run Phase 1 onwards
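The four `--continue` steps amount to a small decision procedure. A minimal Python sketch follows, assuming the campaign frontmatter has already been parsed into a `CampaignState` object; that class, the `resume_action` function, and the returned action labels are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class CampaignState:
    status: str
    total_loops: Union[int, str]  # an integer, or the string "unlimited"
    completed_loops: int
    phase_within_loop: str


def resume_action(state: Optional[CampaignState]) -> str:
    """Decide what --continue should do; None means the file is missing."""
    # Step 1: error if the campaign file is missing or not active.
    if state is None or state.status != "active":
        raise RuntimeError("campaign file missing or status is not 'active'")
    # Step 2: nothing left to run (never true for unlimited campaigns).
    if state.total_loops != "unlimited" and state.completed_loops >= state.total_loops:
        return "mark-completed"
    # Step 3: a previous run was interrupted mid-loop.
    if state.phase_within_loop != "not-started":
        return "restart-current-loop"
    # Step 4: clean boundary; load last_scorecard_log and run Phase 1 onwards.
    return "start-next-loop"
```

Both of the last two outcomes begin at Phase 1; the labels differ only so a caller can log whether an interrupted loop was discarded.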
## Protocol
### Phase 0: Rubric Bootstrap (one-time, requires human approval)
Run only when `.planning/rubrics/{target}.md` does not exist.
1. Read competitive research from `.planning/research/` if available
2. Spawn `/research --parallel` to survey comparable products if no research exists
3. Draft 8-14 axes organized into 3-5 categories, each with:
- Weight (0.0–1.0), Category, three anchors (0/5/10), verification specs (programmatic/structural/perceptual), research inputs
4. Present draft rubric to the user with rationale for each axis
5. **STOP. Do not proceed until the user approves the rubric.**
- If the AskUserQuestion tool is available, present the gate with it. Options: "Approve as drafted", "Adjust axes", "Adjust weights". On adjust: revise, re-present.
- If unavailable: ask as a plain text question and wait.
- Record the answer in the campaign file as `rubric_approved: {answer}`.
6. Write approved rubric to `.planning/rubrics/{target}.md`
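The shape constraints on a drafted rubric (8-14 axes, 3-5 categories, weights in 0.0–1.0, anchors at 0/5/10) can be checked mechanically before the draft is presented for approval. The sketch below is an assumption-laden illustration: the `Axis` class and `validate_rubric` function are hypothetical, and verification specs and research inputs are omitted for brevity.

```python
from dataclasses import dataclass


@dataclass
class Axis:
    name: str
    category: str
    weight: float
    anchors: dict  # maps the anchor points 0, 5, and 10 to descriptions


def validate_rubric(axes):
    """Return a list of human-readable problems; empty means the shape is valid."""
    problems = []
    if not 8 <= len(axes) <= 14:
        problems.append(f"expected 8-14 axes, got {len(axes)}")
    categories = {axis.category for axis in axes}
    if not 3 <= len(categories) <= 5:
        problems.append(f"expected 3-5 categories, got {len(categories)}")
    for axis in axes:
        if not 0.0 <= axis.weight <= 1.0:
            problems.append(f"{axis.name}: weight {axis.weight} outside 0.0-1.0")
        if set(axis.anchors) != {0, 5, 10}:
            problems.append(f"{axis.name}: anchors must be defined at 0, 5, and 10")
    return problems
```

A check like this only validates structure; whether the axes are the right ones remains a decision for the human approval gate in step 5.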
### Phase 1: Score
Score every axis in the rubric. No shortcuts. No cached scores from the previous loop.
#### 1a. Programmatic checks (run first, in parallel)
Execute the programmatic verification steps from the rubric. A programmatic failure caps that axis at 5 regardless of evaluator scores. Record raw results: which checks passed, which failed, what the failure was.
#### 1b. Structural analysis
Execute the structural checks from each axis's verification spec: file path existence, frontmatter schema consistency, benchmark coverage ratios, link rot, and cross-reference accuracy (check descriptions: docs/QUALITY_LOOPS.md#structural-check-types).
#### 1c. Perceptual scoring panel (three independent evaluators)
Spawn three evaluator agents in parallel. Each receives the rubric with all axis definitions and anchors, read access to the target, its persona (A/B/C as defined in the rubric's Scoring Protocol), and the instruction to score every axis 0-10 with a one-sentence justification per axis (input list: docs/QUALITY_LOOPS.md#evaluator-panel).
Each evaluator scores independently. For each axis:
- Final score = minimum of the three evaluator scores, capped at 5 if a programmatic check for that axis failed
- If any two evaluators disagree by more than 3 points: flag the axis as `needs-refinement`
`needs-refinement` axes are logged but still scored. Do not halt on evaluator disagreement.
#### 1d. Compile scorecard
Compile a table with columns `Axis | A | B | C | Prog | Final | Delta | Flag` (layout: docs/QUALITY_LOOPS.md#scorecard-format).
Final = min(A, B, C), then apply programmatic cap (sets Flag=cap). Delta = current − prior loop score (empty on loop 1).
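The per-axis scoring rules can be expressed as one small function. This is a sketch, not the skill's implementation: `score_axis` and its parameters are hypothetical, and it assumes the `cap` flag is set whenever a programmatic check failed, even if the evaluator minimum was already at or below 5.

```python
PROGRAMMATIC_CAP = 5        # a failed programmatic check caps the axis here
DISAGREEMENT_THRESHOLD = 3  # a spread above this flags needs-refinement


def score_axis(a, b, c, prog_passed, prior=None):
    """Combine three evaluator scores into (final, delta, flags) for one axis."""
    final = min(a, b, c)
    flags = []
    if not prog_passed:
        final = min(final, PROGRAMMATIC_CAP)
        flags.append("cap")
    # "Any two evaluators disagree by more than 3" is the same as max - min > 3.
    if max(a, b, c) - min(a, b, c) > DISAGREEMENT_THRESHOLD:
        flags.append("needs-refinement")
    # Delta is left empty (None) when there is no prior loop score.
    delta = None if prior is None else final - prior
    return final, delta, flags
```

For instance, evaluator scores of 9, 8, and 9 with a failed programmatic check and a prior score of 4 give a final score of 5, a delta of 1, and the `cap` flag.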
### Phase 2: Select
Choose the single axis to attack this loop.
**Selection formula:**
```
score(axis) = (10 - current_score) × weight × effort_multiplier × recency_penalty
```
- `effort_multiplier`: low = 1.0, medium = 0.7, high = 0.4
- `recency_penalty`: 0.5 if the axis was attacked in either of the previous 2 loops, otherwise 1.0
- Effort tiers: **low** < 1hr, **medium** 1-3hrs, **high** 3+hrs
If `--axis` flag was set, skip selection and attack the specified axis.
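The selection formula and the `--axis` override can be sketched as follows. The function names, the dict layout, and the `loops_since_attack` parameter (1 means attacked in the previous loop, `None` means never attacked) are assumptions made for illustration; the axis names in the usage note are invented.

```python
EFFORT_MULTIPLIER = {"low": 1.0, "medium": 0.7, "high": 0.4}


def selection_score(current_score, weight, effort, loops_since_attack=None):
    """score(axis) = (10 - current_score) * weight * effort_multiplier * recency_penalty"""
    recently_attacked = loops_since_attack is not None and loops_since_attack <= 2
    recency_penalty = 0.5 if recently_attacked else 1.0
    return (10 - current_score) * weight * EFFORT_MULTIPLIER[effort] * recency_penalty


def select_axis(axes, forced=None):
    """Pick the axis to attack; `forced` models the --axis flag and skips selection."""
    if forced is not None:
        return forced
    # On a tie, max() keeps the first axis in iteration order.
    return max(axes, key=lambda name: selection_score(**axes[name]))
```

As a usage example, consider three invented axes: `docs` (score 4, weight 0.5, low effort), `tests` (score 6, weight 1.0, high effort), and `perf` (score 2, weight 0.5, medium effort, attacked one loop ago). Their selection scores work out to 3.0, 1.6, and 1.4, so `docs` is chosen even though `perf` has the lowest raw score, because the recency penalty and effort multiplier discount `perf`.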
Announce the selection:
```
Selected: {axis_name} (score: {n}/10, weight: {w}, effort: {e}, selection score: {s})
Rationale: {one sentence on why this axis now, not another}
```
### Phase 3: Attack
Execute the improvement. Dispatch strategy depends on the axis category (expanded per-category playbooks: docs/QUALITY_LOOPS.md#attack-dispatch-strategies).
**ISOLATION MANDATE:** When dispatching to `/experiment`, `/fleet`, or `/research --parallel`, always use the Agent tool with `isolation: "worktree"`. Sub-ag