
moai-workflow-gan-loop

The moai-workflow-gan-loop implements an iterative Builder-Evaluator feedback loop for design refinement, using four-dimension scoring (Design Quality, Originality, Completeness, Functionality) to assess output quality. Use this workflow when you need structured iteration on design artifacts with automatic evaluation, stagnation detection, and optional Sprint Contract negotiation to clarify acceptance criteria before each iteration cycle.

Repository: moai-adk

Install in Claude Code


git clone --depth 1 https://github.com/modu-ai/moai-adk /tmp/moai-workflow-gan-loop && cp -r /tmp/moai-workflow-gan-loop/.moai/backups/legacy-cleanup-2026-05-23T103929Z/.claude/skills/moai-workflow-gan-loop ~/.claude/skills/moai-workflow-gan-loop

Then start a new Claude Code session; the skill loads automatically.

Definition

SKILL.md

# moai-workflow-gan-loop

Implements the Builder-Evaluator GAN loop for iterative design quality improvement. Absorbed from agency constitution Section 11 and Section 12. Integrates Sprint Contract Protocol, 4-dimension scoring, stagnation detection, and Evaluator Leniency Prevention.

All loop parameters are read from `.moai/config/sections/design.yaml`. Do not hardcode thresholds.

---

## Quick Reference

### Loop Parameters (from design.yaml)

```
design.gan_loop:
  max_iterations: 5            # Maximum Builder-Evaluator cycles
  pass_threshold: 0.75         # Score >= this value to exit loop
  escalation_after: 3          # Escalate to user after N iterations without passing
  improvement_threshold: 0.05  # Minimum score delta per iteration
  strict_mode: false           # If true, each dimension must pass individually
  sprint_contract:
    enabled: true
    required_harness_levels: [thorough]
    optional_harness_levels: [standard]
    artifact_dir: ".moai/sprints"
    max_negotiation_rounds: 2
```

### 4-Dimension Scoring Weights

| Dimension | Weight | Description |
| --- | --- | --- |
| Design Quality | 30% | Visual consistency, brand token compliance, WCAG AA |
| Originality | 25% | Not generic, not AI-slop, unique brand expression |
| Completeness | 25% | All BRIEF sections present, copy matches contract |
| Functionality | 20% | Responsive, accessible, all interactions work |

Overall score = weighted average of all four dimensions.

Pass condition: `overall_score >= pass_threshold` AND (if `strict_mode: true`) each dimension score >= `pass_threshold`.
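
The weighted average and pass condition above can be sketched in Python as follows. This is a minimal illustration, not the skill's actual implementation: the dimension keys (`design_quality` and so on) and the function names are hypothetical, and the weights are kept as integer percentages so that rubric scores in 0.25 increments produce exact sums.

```python
# Hypothetical dimension keys; weights are the percentages from the table above.
WEIGHTS_PERCENT = {
    "design_quality": 30,
    "originality": 25,
    "completeness": 25,
    "functionality": 20,
}


def overall_score(scores: dict[str, float]) -> float:
    """Weighted average of the four dimension scores (each in 0.0..1.0)."""
    return sum(WEIGHTS_PERCENT[d] * scores[d] for d in WEIGHTS_PERCENT) / 100


def passes(scores: dict[str, float], pass_threshold: float,
           strict_mode: bool = False) -> bool:
    """Overall score must reach the threshold; in strict mode every
    individual dimension must reach it as well."""
    if overall_score(scores) < pass_threshold:
        return False
    if strict_mode:
        return all(scores[d] >= pass_threshold for d in WEIGHTS_PERCENT)
    return True


if __name__ == "__main__":
    example = {
        "design_quality": 1.00,
        "originality": 0.75,
        "completeness": 0.75,
        "functionality": 0.50,
    }
    print(overall_score(example))
    print(passes(example, 0.75), passes(example, 0.75, strict_mode=True))
```

In this example the overall score clears a 0.75 threshold, but strict mode rejects the result because Functionality alone sits below it. In practice `pass_threshold` and `strict_mode` would be read from `design.yaml` rather than passed as literals.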

---

## Implementation Guide

### GAN Loop Execution Flow

**Phase 1: Sprint Contract (when required by harness level)**

Required when `harness_level == thorough`.
Optional when `harness_level == standard` and user opts in.
Skipped when `harness_level == minimal`.
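
As a rough sketch, the harness-level rules above map onto the `sprint_contract` config keys like this. The function name and the `user_opted_in` flag are hypothetical, and treating `enabled: false` as "never generate a contract" is an assumption rather than documented behavior.

```python
def sprint_contract_needed(
    harness_level: str,
    *,
    enabled: bool = True,
    required_levels: tuple[str, ...] = ("thorough",),
    optional_levels: tuple[str, ...] = ("standard",),
    user_opted_in: bool = False,
) -> bool:
    """Decide whether Phase 1 runs for the given harness level."""
    if not enabled:
        # Assumption: a disabled sprint_contract section skips Phase 1 entirely.
        return False
    if harness_level in required_levels:
        return True
    if harness_level in optional_levels:
        return user_opted_in
    # Any other level (for example "minimal") skips the contract.
    return False
```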

Sprint Contract generation:
1. Evaluator analyzes the BRIEF document and current iteration scope.
2. Evaluator produces the Sprint Contract document:
   - `acceptance_checklist`: concrete, testable criteria for this iteration
   - `priority_dimension`: which of the 4 dimensions to focus on
   - `test_scenarios`: specific verification steps
   - `pass_conditions`: minimum score per criterion
3. Builder reviews the contract:
   - Accept: proceed with implementation
   - Request adjustment: propose alternatives (max `max_negotiation_rounds` rounds)
4. Contract is saved to `design.gan_loop.sprint_contract.artifact_dir/sprint-N.json`

Constraint: Evaluator must not score on criteria outside the Sprint Contract. Builder must not claim criteria as met without evidence.
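
One possible shape for the Sprint Contract artifact is sketched below. Only the four field names come from the list above; the field types, the extra `iteration` field, and the `save_contract` helper are assumptions made for illustration.

```python
import json
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class SprintContract:
    iteration: int                      # assumption: used only to build the file name
    acceptance_checklist: list[str]     # concrete, testable criteria
    priority_dimension: str             # one of the 4 scoring dimensions
    test_scenarios: list[str]           # specific verification steps
    pass_conditions: dict[str, float]   # minimum score per criterion


def save_contract(contract: SprintContract, artifact_dir: str) -> Path:
    """Write the contract to <artifact_dir>/sprint-N.json and return the path."""
    directory = Path(artifact_dir)
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / f"sprint-{contract.iteration}.json"
    path.write_text(json.dumps(asdict(contract), indent=2), encoding="utf-8")
    return path
```

Here `artifact_dir` would come from `design.gan_loop.sprint_contract.artifact_dir` (`.moai/sprints` in the Quick Reference config).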

**Phase 2: Builder Execution**

Builder implements based on:
- Accepted Sprint Contract (if present)
- BRIEF document
- Copy JSON from `moai-domain-copywriting`
- Design tokens from `moai-domain-brand-design` or `moai-workflow-design` (Path A handler)

Builder outputs: code files, rendered previews (if Playwright available), implementation notes.

**Phase 3: Evaluator Scoring**

Evaluator scores against the 4 dimensions using the Evaluator Leniency Prevention mechanisms:

1. **Rubric Anchoring**: Score each dimension against the rubric (0.25 increments) with explicit justification. Scores without rubric reference are invalid.
2. **Evidence-Only Verdicts**: No PASS without concrete evidence (screenshot, test output, code reference).
3. **Anti-Pattern Cross-check**: Check known anti-patterns before finalizing. Any detected anti-pattern caps the relevant dimension score at 0.50.
4. **Must-Pass Firewall**: Copy integrity, mobile viewport, and WCAG AA are must-pass criteria. Failure in any must-pass = overall FAIL regardless of other scores.

Output: `evaluation-report-N.json` in `sprint_contract.artifact_dir`.
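
The anti-pattern cap and the must-pass firewall could be applied along these lines. This is a sketch under assumptions: the criterion keys, the function names, and the idea that anti-patterns arrive as a set of flagged dimension names are all hypothetical, and the must-pass list covers only the three criteria named in this phase.

```python
ANTI_PATTERN_CAP = 0.50
# Hypothetical keys for the must-pass criteria named in Phase 3.
MUST_PASS = ("copy_integrity", "mobile_viewport", "wcag_aa")


def apply_anti_pattern_cap(scores: dict[str, float],
                           flagged_dimensions: set[str]) -> dict[str, float]:
    """Cap every dimension with a detected anti-pattern at 0.50."""
    return {
        dim: min(score, ANTI_PATTERN_CAP) if dim in flagged_dimensions else score
        for dim, score in scores.items()
    }


def verdict(overall: float, pass_threshold: float,
            must_pass_results: dict[str, bool]) -> str:
    """Return "PASS" or "FAIL"; a failed or missing must-pass check always fails."""
    if not all(must_pass_results.get(name, False) for name in MUST_PASS):
        return "FAIL"
    return "PASS" if overall >= pass_threshold else "FAIL"
```

Treating a missing must-pass result as a failure follows the Evidence-Only Verdicts rule: without evidence, there is no PASS.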

**Phase 4: Loop Decision**

```
if overall_score >= pass_threshold:
    EXIT LOOP → proceed to next phase
elif iteration >= max_iterations:
    ESCALATE → present failure report to user
elif stagnation_detected:
    ESCALATE → present stagnation options
else:
    ITERATE → pass feedback to Builder, increment N
```
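
The same decision order can be written as a small runnable function. This sketch mirrors the pseudocode branch for branch; the `Decision` enum and the `decide` signature are hypothetical names, and the strict-mode and must-pass checks are assumed to be folded into the score or handled before this point.

```python
from enum import Enum


class Decision(Enum):
    EXIT = "exit"                                # proceed to next phase
    ESCALATE_FAILURE = "escalate_failure"        # present failure report
    ESCALATE_STAGNATION = "escalate_stagnation"  # present stagnation options
    ITERATE = "iterate"                          # feed back to Builder


def decide(overall_score: float, iteration: int, pass_threshold: float,
           max_iterations: int, stagnation_detected: bool) -> Decision:
    """Evaluate the branches in the same order as the pseudocode above."""
    if overall_score >= pass_threshold:
        return Decision.EXIT
    if iteration >= max_iterations:
        return Decision.ESCALATE_FAILURE
    if stagnation_detected:
        return Decision.ESCALATE_STAGNATION
    return Decision.ITERATE
```

Because passing is checked first, an iteration that reaches the threshold on the final allowed cycle still exits normally rather than escalating.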

**Phase 5: Iteration Feedback**

If looping back:
1. Evaluator generates targeted feedback per failed criterion.
2. Builder receives the feedback and previous Sprint Contract.
3. Previously passed criteria carry forward (no regression allowed).
4. New Sprint Contract is generated for failed criteria only.
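
The carry-forward rule in steps 3 and 4 amounts to simple set bookkeeping. A minimal sketch, assuming criteria are identified by plain strings (the function names are hypothetical):

```python
def detect_regressions(previously_passed: set[str],
                       currently_passed: set[str]) -> set[str]:
    """Criteria that passed in an earlier iteration but no longer pass."""
    return previously_passed - currently_passed


def next_contract_criteria(all_criteria: list[str],
                           currently_passed: set[str]) -> list[str]:
    """Criteria for the next Sprint Contract: only those still failing,
    kept in their original order."""
    return [c for c in all_criteria if c not in currently_passed]
```

A non-empty result from `detect_regressions` would indicate a violation of the no-regression rule and should be reported back to the Builder.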

---

### Stagnation Detection

Stagnation is detected when the score improvement between consecutive iterations is below `improvement_threshold` for 2 or more iterations.

Tracking:
- After each iteration, record `{iteration: N, score: X}` in the sprint artifact.
- Calculate `delta = score[N] - score[N-1]`.
- If `delta < improvement_threshold` for the last 2 iterations, flag stagnation.
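
The tracking steps above can be sketched as a function over the recorded score history. The function name and the `window` parameter are hypothetical, and reading "the last 2 iterations" as the two most recent consecutive deltas is an assumption.

```python
def stagnation_detected(scores: list[float], improvement_threshold: float,
                        window: int = 2) -> bool:
    """Flag stagnation when each of the last `window` deltas is below the threshold.

    `scores` holds the overall score of each iteration in order, so `window`
    deltas require at least `window + 1` recorded scores.
    """
    if len(scores) < window + 1:
        return False
    deltas = [current - previous for previous, current in zip(scores, scores[1:])]
    return all(delta < improvement_threshold for delta in deltas[-window:])


if __name__ == "__main__":
    history = [0.50, 0.52, 0.53]  # deltas of roughly 0.02 and 0.01
    print(stagnation_detected(history, improvement_threshold=0.05))  # True
```

A score that drops between iterations yields a negative delta, which also counts as below the threshold.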

When stagnation is detected, escalate to user via AskUserQuestion with three options:
1. Continue with current approach (Evaluator tries a different dimension focus)
2. Adjust criteria (user provides guidance or relaxes constraints)
3. Abort loop (accept current output as-is)

The escalation trigger at `escalation_after` iterations applies independently: if `escalation_after` iterations (3 by default) pass without a PASS score, escalate regardless of stagnation state.

---

### Evaluator Leniency Prevention Mechanisms

The following 5 mechanisms prevent score inflation and must be applied on every evaluation:

**Mechanism 1: Rubric Anchoring**

Score descriptions for each dimension:
- 0.25: Major defects, fails most criteria
- 0.50: Partial compliance, notable issues remain
- 0.75: Solid compliance, minor issues only
- 1.00: Full compliance, no issues found

Always state which rubric level applies and why before assigning a numeric score.
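
Rubric anchoring can be enforced mechanically by rejecting any score that is off the rubric or lacks a justification. The sketch below assumes a hypothetical `DimensionScore` record and `validate_score` helper; only the four levels and their descriptions come from the list above.

```python
from dataclasses import dataclass

# Rubric levels and descriptions as listed above.
RUBRIC = {
    0.25: "Major defects, fails most criteria",
    0.50: "Partial compliance, notable issues remain",
    0.75: "Solid compliance, minor issues only",
    1.00: "Full compliance, no issues found",
}


@dataclass(frozen=True)
class DimensionScore:
    dimension: str
    score: float
    justification: str


def validate_score(entry: DimensionScore) -> str:
    """Return the matching rubric description, or raise ValueError if the
    score is not a rubric level or no justification is given."""
    if entry.score not in RUBRIC:
        raise ValueError(f"{entry.dimension}: {entry.score} is not a rubric level")
    if not entry.justification.strip():
        raise ValueError(f"{entry.dimension}: score has no rubric justification")
    return RUBRIC[entry.score]
```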

**Mechanism 2: Must-Pass Firewall**

The following conditions cause immediate FAIL regardless of other scores:
- Copy text differs from the original `copy.json` or BRIEF copy section
- AI slop detected: purple gradient (#8B5CF6-#6D28D9) as primary visual element with generic white cards
- Mobile viewport bro