Skill · 802 repo stars · updated 3d ago

improve

The /improve command autonomously scores a target against a rubric and iteratively enhances it across multiple dimensions. Use it when you have a rubric file at `.planning/rubrics/{target}.md` and want to systematically improve code quality by attacking weak scoring axes until reaching a plateau or threshold. Campaign mode with `--n` or `--continue` flags maintains persistent state for daemon-driven multi-loop execution.

Repository: Citadel

Install in Claude Code

git clone --depth 1 https://github.com/SethGammon/Citadel /tmp/improve && cp -r /tmp/improve/skills/improve ~/.claude/skills/improve

Then start a new Claude Code session; the skill loads automatically.

Definition

SKILL.md

# /improve — Autonomous Quality Engine

## Orientation

**Use when:** Scoring a target against a rubric and iteratively improving it. Rubric required at `.planning/rubrics/{target}.md` (Phase 0 creates one if missing).

**Don't use when:** Refactoring without a rubric (use `/refactor`), one-time code review (use `/review`), or debugging a specific bug (use `/systematic-debugging`).

## Invocation

```
/improve {target}                # Loop until plateau or all axes >= 8.0
/improve {target} --n=3          # Run exactly N loops then stop
/improve {target} --axis={name}  # Force-attack a specific axis (skips scoring)
/improve {target} --score-only   # Score and report, no attack
/improve {target} --continue     # Resume from campaign state (used by daemon)
/improve citadel                 # Targets Citadel itself
```

`target` is a slug that maps to `.planning/rubrics/{target}.md`.
If no rubric exists, run Phase 0 first.

## Campaign Mode

When invoked with `--n` or `--continue`, improve operates in **campaign mode** and maintains a campaign file that the daemon can attach to.

**Campaign file:** `.planning/campaigns/improve-{target}.md`, created automatically on the first invocation with `--n` (full template: docs/QUALITY_LOOPS.md#campaign-file-template).
Frontmatter: `version`, `id` (`improve-{target}-{ISO-date-slug}`), `status: active`, `type: improve`, `target`, `total_loops` ({n} or `unlimited`), `completed_loops: 0`, `current_level` (from rubric frontmatter), `estimated_cost_per_loop: 12`, `started`.
Body: status and direction lines, a Loop History table (`Loop | Axis Attacked | Outcome | Score Movement`), and a Continuation State block (`next_loop`, `last_scorecard_log`, `last_outcome`, `phase_within_loop`, `level_up_triggered`).

### Campaign lifecycle

Update `phase_within_loop` at each phase: `scoring` → `selected-{axis}` → `attacking-{axis}` → `verifying` → `not-started`.

On loop complete: increment `completed_loops`, update `next_loop`/`last_scorecard_log`/`last_outcome`, append Loop History row.
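
The loop-complete bookkeeping above can be sketched in Python. This is a rough illustration, not the skill's implementation: it assumes the campaign file has already been parsed into a flat dict, and the `complete_loop` helper, the `loop_history` key, and the rule `next_loop = completed_loops + 1` are all hypothetical.

```python
def complete_loop(state, axis, outcome, score_movement, scorecard_log):
    """Apply the end-of-loop updates to a parsed campaign state (hypothetical shape)."""
    loop_number = state["completed_loops"] + 1
    state["completed_loops"] = loop_number
    # Assumption: next_loop is simply the loop after the one just finished.
    state["next_loop"] = loop_number + 1
    state["last_scorecard_log"] = scorecard_log
    state["last_outcome"] = outcome
    # The phase cycle ends back at not-started once the loop is done.
    state["phase_within_loop"] = "not-started"
    # One row per loop, mirroring the Loop History table columns.
    state["loop_history"].append({
        "Loop": loop_number,
        "Axis Attacked": axis,
        "Outcome": outcome,
        "Score Movement": score_movement,
    })
    return state


# Hypothetical state for a campaign that has not completed any loop yet.
campaign = {
    "completed_loops": 0,
    "next_loop": 1,
    "last_scorecard_log": None,
    "last_outcome": None,
    "phase_within_loop": "verifying",
    "loop_history": [],
}
complete_loop(campaign, "docs", "improved", "4.0 -> 6.0", "scorecard-loop-1")
```

Writing the updated state back to `.planning/campaigns/improve-{target}.md` is left out, since the file layout is defined in docs/QUALITY_LOOPS.md.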

### The `--continue` flag

1. Read `.planning/campaigns/improve-{target}.md` — error if missing or `status` not `active`
2. If `completed_loops >= total_loops`: mark completed, exit
3. If `phase_within_loop` is not `not-started`: restart current loop from Phase 1 (interrupted mid-loop)
4. Load `last_scorecard_log` for delta comparison, then run Phase 1 onwards
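
As a minimal sketch, the four `--continue` steps reduce to a small decision function. It assumes the campaign file has already been parsed into a dict (`None` if the file is missing); the `resume_action` name and its return labels are hypothetical, and `total_loops` is treated as either an integer or the string `unlimited`.

```python
from typing import Optional


def resume_action(campaign: Optional[dict]) -> str:
    """Decide what --continue should do, given parsed campaign state."""
    # Step 1: the campaign file must exist and be active.
    if campaign is None:
        raise ValueError("campaign file missing")
    if campaign.get("status") != "active":
        raise ValueError("campaign status is not active")

    # Step 2: stop once the requested number of loops is done.
    total = campaign["total_loops"]
    if total != "unlimited" and campaign["completed_loops"] >= total:
        return "mark-completed"

    # Step 3: any phase other than not-started means the last loop was interrupted.
    if campaign["phase_within_loop"] != "not-started":
        return "restart-loop-from-phase-1"

    # Step 4: clean resume; the caller loads last_scorecard_log, then runs Phase 1.
    return "run-next-loop"
```

Both non-terminal outcomes lead into Phase 1; the labels differ only so the caller can log that a loop was interrupted.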

## Protocol

### Phase 0: Rubric Bootstrap (one-time, requires human approval)

Run only when `.planning/rubrics/{target}.md` does not exist.

1. Read competitive research from `.planning/research/` if available
2. Spawn `/research --parallel` to survey comparable products if no research exists
3. Draft 8-14 axes organized into 3-5 categories, each with:
   - Weight (0.0–1.0), Category, three anchors (0/5/10), verification specs (programmatic/structural/perceptual), research inputs
4. Present draft rubric to the user with rationale for each axis
5. **STOP. Do not proceed until the user approves the rubric.**
   - If the AskUserQuestion tool is available, present the gate with it. Options: "Approve as drafted", "Adjust axes", "Adjust weights". On adjust: revise, re-present.
   - If unavailable: ask as a plain text question and wait.
   - Record the answer in the campaign file as `rubric_approved: {answer}`.
6. Write approved rubric to `.planning/rubrics/{target}.md`

### Phase 1: Score

Score every axis in the rubric. No shortcuts. No cached scores from the previous loop.

#### 1a. Programmatic checks (run first, in parallel)

Execute the programmatic verification steps from the rubric. A programmatic failure caps that axis at 5 regardless of evaluator scores. Record raw results: which checks passed, which failed, what the failure was.

#### 1b. Structural analysis

Execute the structural checks from each axis's verification spec: file path existence, frontmatter schema consistency, benchmark coverage ratios, link rot, and cross-reference accuracy (check descriptions: docs/QUALITY_LOOPS.md#structural-check-types).

#### 1c. Perceptual scoring panel (three independent evaluators)

Spawn three evaluator agents in parallel. Each receives the rubric with all axis definitions and anchors, read access to the target, its persona (A/B/C as defined in the rubric's Scoring Protocol), and the instruction to score every axis 0-10 with a one-sentence justification per axis (input list: docs/QUALITY_LOOPS.md#evaluator-panel).

Each evaluator scores independently. For each axis:
- Final score = minimum of the three evaluators, then capped at 5 if a programmatic check for that axis failed (see 1a)
- If any two evaluators disagree by > 3 points: flag the axis as `needs-refinement`

`needs-refinement` axes are logged but still scored. Do not halt on evaluator disagreement.

#### 1d. Compile scorecard

Compile a table with columns `Axis | A | B | C | Prog | Final | Delta | Flag` (layout: docs/QUALITY_LOOPS.md#scorecard-format).
Final = min(A, B, C), then apply programmatic cap (sets Flag=cap). Delta = current − prior loop score (empty on loop 1).
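
The per-axis arithmetic can be sketched as follows. This is an illustration under stated assumptions rather than the skill's own code: the `score_axis` helper and its return shape are hypothetical, the `cap` flag is recorded whenever a programmatic check failed (even if the cap does not lower the score), and multiple flags are joined with a comma.

```python
from typing import Optional

PROGRAMMATIC_CAP = 5        # a programmatic failure caps the axis at 5
DISAGREEMENT_THRESHOLD = 3  # a spread above 3 points flags needs-refinement


def score_axis(a: float, b: float, c: float, prog_passed: bool,
               prior: Optional[float] = None) -> dict:
    """Compute Final, Delta, and Flag for one scorecard row."""
    final = min(a, b, c)
    flags = []
    if not prog_passed:
        final = min(final, PROGRAMMATIC_CAP)
        flags.append("cap")
    # "Any two evaluators disagree by > 3" is equivalent to max - min > 3.
    if max(a, b, c) - min(a, b, c) > DISAGREEMENT_THRESHOLD:
        flags.append("needs-refinement")
    # Delta stays empty (None) on loop 1, when there is no prior score.
    delta = None if prior is None else final - prior
    return {"final": final, "delta": delta, "flag": ",".join(flags)}
```

For example, evaluator scores of 8, 8, and 7 with a failed programmatic check give a final score of 5, because the cap overrides the evaluator minimum of 7.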

### Phase 2: Select

Choose the single axis to attack this loop.

**Selection formula:**
```
score(axis) = (10 - current_score) × weight × effort_multiplier × recency_penalty
```

- `effort_multiplier`: low = 1.0, medium = 0.7, high = 0.4
- `recency_penalty`: 0.5 if the axis was attacked in either of the previous 2 loops, otherwise 1.0
- Effort tiers: **low** < 1hr, **medium** 1-3hrs, **high** 3+hrs

If `--axis` flag was set, skip selection and attack the specified axis.
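
A minimal Python sketch of the selection step, using the multipliers listed above. The `Axis` dataclass, the `loops_since_attack` field (1 means attacked in the immediately preceding loop, `None` means never attacked), and the tie-breaking rule (first axis in rubric order wins) are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

EFFORT_MULTIPLIER = {"low": 1.0, "medium": 0.7, "high": 0.4}


@dataclass
class Axis:
    name: str
    current_score: float
    weight: float
    effort: str                               # "low", "medium", or "high"
    loops_since_attack: Optional[int] = None  # hypothetical bookkeeping field


def selection_score(axis: Axis) -> float:
    """(10 - current_score) x weight x effort_multiplier x recency_penalty."""
    recently_attacked = (
        axis.loops_since_attack is not None and axis.loops_since_attack <= 2
    )
    recency_penalty = 0.5 if recently_attacked else 1.0
    return (
        (10 - axis.current_score)
        * axis.weight
        * EFFORT_MULTIPLIER[axis.effort]
        * recency_penalty
    )


def select_axis(axes: List[Axis], forced: Optional[str] = None) -> Axis:
    """Pick the axis to attack; `forced` models the --axis override."""
    if forced is not None:
        for axis in axes:
            if axis.name == forced:
                return axis
        raise ValueError(f"unknown axis: {forced}")
    # max() keeps the first axis on ties (an assumption about tie-breaking).
    return max(axes, key=selection_score)


# Hypothetical axes, for illustration only.
axes = [
    Axis("docs", current_score=4.0, weight=0.8, effort="low"),
    Axis("tests", current_score=2.0, weight=1.0, effort="high", loops_since_attack=1),
    Axis("perf", current_score=5.0, weight=0.9, effort="medium"),
]
chosen = select_axis(axes)
```

Here `tests` has the lowest raw score, but its high effort tier and recent attack shrink its selection score well below that of `docs`, which is chosen instead.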

Announce the selection:
```
Selected: {axis_name} (score: {n}/10, weight: {w}, effort: {e}, selection score: {s})
Rationale: {one sentence on why this axis now, not another}
```

### Phase 3: Attack

Execute the improvement. Dispatch strategy depends on the axis category (expanded per-category playbooks: docs/QUALITY_LOOPS.md#attack-dispatch-strategies).

**ISOLATION MANDATE:** When dispatching to `/experiment`, `/fleet`, or `/research --parallel`, always use the Agent tool with `isolation: "worktree"`. Sub-ag