moai-workflow-ci-loop
The `moai-workflow-ci-loop` skill monitors GitHub PR checks and automatically attempts to fix failures in mechanical categories. After a PR is created, it polls required checks every 30 seconds, classifies failures as mechanical or semantic, runs up to three safe patch iterations for mechanical issues, and escalates semantic failures like race conditions or panics directly to the user. Use this when a PR needs continuous CI validation and automatic remediation before merge.
git clone --depth 1 https://github.com/modu-ai/moai-adk /tmp/moai-workflow-ci-loop && cp -r /tmp/moai-workflow-ci-loop/.claude/skills/moai-workflow-ci-loop ~/.claude/skills/moai-workflow-ci-loopSKILL.md
# CI Loop (`moai-workflow-ci-loop`)
Unified CI watch + auto-fix loop. The orchestrator invokes this skill after `/moai sync`
Phase 4 (`gh pr create`) returns a PR number. The skill polls required checks, classifies
failures into mechanical vs semantic, attempts safe patches up to 3 iterations, and
escalates semantic failures via AskUserQuestion.
## Quick Reference
**Trigger**: `/moai sync` Phase 4 PR-create returns a PR number, or an existing PR needs
CI monitoring.
**Two phases, one skill**:
1. **Watch** — Poll `gh pr checks` every 30s, classify required vs auxiliary via
`.github/required-checks.yml` SSoT; exit on green/fail/timeout.
2. **Auto-fix** — On required-fail (exit 2), receive JSON handoff, run up to 3 patch
iterations, escalate semantic failures immediately.
**One-liner**:
```bash
MOAI_CIWATCH_GH=gh sh scripts/ci-watch/run.sh <PR_NUMBER> <BRANCH>
```
**Exit codes**:
- `0` — all required passed → ready-to-merge AskUserQuestion
- `1` — fatal error → surface remediation
- `2` — required failure → JSON handoff → auto-fix phase
- `3` — 30-min hard timeout → blocker message, return control
**HARD invariants**:
- AskUserQuestion is orchestrator-only — CLI, shell scripts, and `manager-quality` MUST NOT call it.
- Force-push is absolutely prohibited (`--force`, `-f`, `--force-with-lease` all banned).
- Max 3 auto-fix iterations; iteration 4+ triggers mandatory blocking AskUserQuestion.
- Semantic failures (race, deadlock, panic, assertion) are never auto-patched.
- Protected files: `.env*`, credentials, `.claude/settings*.json`, `.github/required-checks.yml`,
`scripts/ci-watch/run.sh`.
## Implementation Guide
### Phase 1 — Watch Loop
**Polling cadence**: 30 seconds minimum (GitHub API rate-limit). Override via
`CIWATCH_POLL_INTERVAL` env, never below 30 in production. Test mode uses
`MOAI_CIWATCH_NO_SLEEP=1` (single-tick exit).
**30-minute hard timeout**: `CIWATCH_TIMEOUT_SECONDS=1800` default. On timeout, exit 3.
Do not auto-restart.
**Required vs auxiliary**: Required checks live in `.github/required-checks.yml`
`branches.<pattern>.contexts`. Auxiliary checks listed under `auxiliary:` MUST NOT block
ready-to-merge. Hardcoding check names in scripts is prohibited.
**State file**: `.moai/state/ci-watch-active.flag` (YAML). Tracks `pr_number`, `started_at`,
`heartbeat_at`, `required_checks`, `abort_requested`. Heartbeat staleness > 90s allows
takeover. Abort: `moai pr watch --abort`.
**Background watch standardization**: For long-running PRs (5+ min), use
`gh pr checks <PR> --watch` invoked via `run_in_background: true`. Sleep + poll loops are
prohibited — they block the orchestrator's main session.
**Status report format** (stderr, state-change ticks only, no ANSI):
```
[ci-watch] PR #<N>: required 4/6 pass, 2 pending; advisory 0 fail
```
**Handoff schema on exit 2** — JSON with stable fields: `prNumber`, `branch`,
`failedChecks[]` (each entry `{name, runId, logUrl}`), `auxiliaryFailCount`,
`totalRequired`. Field stability: `name`, `runId`, `logUrl` are Wave 3 contract — do not
rename. Schema source: `internal/ciwatch/handoff.go::Handoff`.
### Phase 2 — Auto-Fix Loop
**Entry condition**: `ci-watch` exit 2 + valid JSON handoff. State file:
`.moai/state/ci-autofix-<PR>.json` (PR-scoped, 24-hour staleness threshold).
**OQ2 cadence matrix** (single source of truth for iteration behavior):
- iter 1, any mechanical sub_class → confirm + apply via AskUserQuestion (1st option =
"패치 적용 (권장)").
- iter 1, semantic/unknown → escalate (no patch attempt) via AskUserQuestion with
diagnosis report.
- iter 2-3, mechanical + sub_class=trivial → silent apply + log (no AskUserQuestion).
- iter 2-3, mechanical + sub_class=non-trivial → confirm + apply via AskUserQuestion.
- iter 2-3, semantic/unknown → escalate (no patch) via AskUserQuestion.
- iter 4+ → mandatory blocking AskUserQuestion (no timer, options: manual fix / revise
SPEC / abandon PR).
"trivial" = whitespace, gofmt/goimports, import-order (matches `classify.sh` `RX_TRIVIAL_*`).
**Patch commit rule**: Every patch = new commit. Format:
`fix(ci): auto-fix <classification> failure (iter <N>)`. After push, re-invoke
`scripts/ci-watch/run.sh` to restart the watch loop.
**Iteration 4+ escalation** (mandatory blocking, no silent timeout):
1. (권장) 직접 수동 수정 — investigate and fix manually
2. SPEC 수정 — revise the SPEC and restart implementation
3. PR 포기 — close the PR and abandon this approach
**manager-quality spawn prompt** injects: handoff JSON, classification + sub_class,
failed CI log + PR diff, mode directive (mechanical → propose unified-diff patch;
semantic/unknown → return diagnosis only, no patch). HARD: no AskUserQuestion call from
the subagent — return Markdown only.
**Audit log**: `.moai/reports/ci-autofix/<PR-NNN>-<YYYY-MM-DD>.md`. Append-only. Each
iteration records classification, sub_class, action, patch_sha, escalation_reason.
### Protected Files (never auto-modified)
- `**/.env`, `**/.env.*`
- `**/credentials*`, `**/*_key.json`, `**/*secret*`
- `.claude/settings.json`, `.claude/settings.local.json`
- `.github/required-checks.yml` (Wave 1 SSoT, read-only for Wave 2/3)
- `scripts/ci-watch/run.sh` (Wave 2 invariant)
If `manager-quality` proposes a patch touching any of these, reject and escalate.
### Go Helpers and Shell Scripts
Go: `internal/ciwatch/{classifier,handoff,state}.go` (classify required-vs-auxiliary,
handoff JSON schema, watch state file); `internal/cli/pr/watch.go`
(`EmitReadyToMergeReport`, `EmitFailureHandoff`). Shell: `scripts/ci-watch/run.sh` (main
loop, mock via `MOAI_CIWATCH_GH`); `scripts/ci-watch/lib/classify.sh` (yq + grep
fallback); `scripts/ci-autofix/log-fetch.sh` (failure log + PR diff);
`scripts/ci-autofix/classify.sh` (mechanical vs semantic).
**gh CLI compat**: requires `gh >= 2.50` for the `workflow` JSON field. On older `gh`,
`classify.sh` falls back to name-based heuristics from the `required:` list.
## Works Well With
- `manager-quality` — failure diagnosis +Claude Code upstream change tracker -> moai-adk update plan + docs sync workflow (dev-only). Tracks new CC release notes, classifies changes by impact tier, cross-references official docs, generates update plan at .moai/research/ or .moai/specs/, and synchronizes docs-site 4-locale + README. NOT distributed to user projects.
GitHub Workflow - Manage issues and review PRs with Agent Teams (dev-only). NOT distributed to user projects.
MoAI-ADK production release via Enhanced GitHub Flow (CLAUDE.local.md §18). Creates release/vX.Y.Z branch, version bump, CHANGELOG (bilingual), PR to main, merge commit (NOT squash), then scripts/release.sh for tag + GoReleaser. Hotfix support via --hotfix flag. All git operations delegated to manager-git. Quality failures escalate to expert-debug. NOT distributed to user projects (dev-only).
Run the 7-phase /moai brain ideation workflow to convert ideas into validated proposals
Identify and safely remove dead code with test verification
Scan codebase and generate architecture documentation in codemaps/
Analyze test coverage, identify gaps, and generate missing tests
Hybrid design workflow — Claude Design import (path A) or code-based brand design (path B)