codex-qa
The codex-qa skill provides isolated quality assurance testing for the omo Codex Light edition plugin by exercising it against a sandboxed CODEX_HOME environment with a local mock model provider, ensuring no interaction with the user's real Codex configuration. Use this skill whenever changes are made to the Codex plugin, installer configuration, app-server flow, or hooks, or when verifying plugin behavior, installation integrity, and TUI functionality in complete isolation.
git clone --depth 1 https://github.com/code-yeongyu/oh-my-openagent /tmp/codex-qa && cp -r /tmp/codex-qa/.agents/skills/codex-qa ~/.claude/skills/codex-qaSKILL.md
# Codex QA QA the omo Codex Light edition (`packages/omo-codex/`, shipped as lazycodex). We exercise OUR plugin in a REAL Codex while touching nothing of the user's setup: an isolated `CODEX_HOME` + a local mock model means no real API call and the real `~/.codex` is never read or written. Each helper script ships a `--self-test` that asserts its scenario against the live machine, so the scripts are both the QA tools and their own regression checks. Verified against `codex-cli 0.139.0` (node, jq, tmux, bun on macOS). Confirm with `codex --version`; check a flag with `codex <cmd> --help`. ## Golden rules (read before running anything) - **QA ONLY our plugin.** Everything that spawns codex uses an isolated `CODEX_HOME` (created by `cqa_mk_isolated_home`) and a LOCAL mock model provider (`cqa_start_mock`). Never QA against the real `~/.codex`, never hit a real model API. The bundled scripts enforce this; if you run codex by hand, `export CODEX_HOME="$(mktemp -d)/codex"; mkdir -p "$CODEX_HOME"` FIRST (a set `CODEX_HOME` must already exist or codex hard-errors). - **Prove the real home stayed clean.** Every script shasums `~/.codex/config.toml` before and after and asserts it is unchanged. If you script by hand, do the same. - **The interactive `codex` is a shell function** that injects `--profile quotio`. Bash scripts bypass it and get the real binary; never rely on the interactive alias. See [references/isolation.md](references/isolation.md). - **The first-party way to prove a hook fired is the app-server** notification stream (`hook/started` / `hook/completed`), not log scraping. See [references/app-server.md](references/app-server.md). - **The captured JSON / pane IS the evidence** — write it under `.omo/evidence/<YYYYMMDD>-<slug>/` (no evidence file == the QA did not happen). ## Setup ```bash cd <this-skill-dir> # .agents/skills/codex-qa bash scripts/lib/common.sh --self-check # confirm deps + isolation harness ``` ## Router: pick your case | You need to… | Run | Deep dive | |---|---|---| | Prove a plugin hook fires in a LIVE Codex turn (first-party) | `scripts/app-server-drive.sh --plugin` | [app-server.md](references/app-server.md) | | Prove the app-server driver itself works (no plugin, fast) | `scripts/app-server-drive.sh --self-test` | [app-server.md](references/app-server.md) | | Install the LOCAL build into an isolated home + assert it landed | `scripts/install-verify.sh --self-test` | [install-verify.md](references/install-verify.md) | | Pin ONE component's hook logic deterministically (no codex) | `scripts/hook-unit-probe.sh --self-test` | [components-hooks.md](references/components-hooks.md) | | Smoke the real TUI under tmux (boots, renders, survives) | `scripts/tui-smoke.sh --self-test` | [logging-debug.md](references/logging-debug.md) | | Watch runtime logs while QAing | (see reference; RUST_LOG / logs DB / `/debug-config`) | [logging-debug.md](references/logging-debug.md) | ## Scripts index (each is its own regression test) | Script | `--self-test` asserts | |---|---| | `scripts/lib/common.sh --self-check` | deps present; isolated `CODEX_HOME` is created inside a sandbox and auto-removed on exit; mock model serves the Responses SSE; real `~/.codex` unchanged | | `scripts/app-server-drive.sh` | `--self-test`: a bare turn completes and the mock assistant text comes back. `--plugin`: installs local omo, drives a turn, and asserts `hook/completed` for `sessionStart,userPromptSubmit` | | `scripts/install-verify.sh` | local omo installs into the isolated home; `config.toml` enables `omo@sisyphuslabs`; component bins + agent TOMLs linked in the sandbox; real `~/.codex` unchanged | | `scripts/hook-unit-probe.sh` | the `ultrawork` component injects `<ultrawork-mode>` on an `ulw` UserPromptSubmit (also a manual `--component/--event` mode) | | `scripts/tui-smoke.sh` | the real codex TUI boots in the isolated home, renders, and survives (no early exit); captures the pane | ## Match QA to your change scope - **Component / hook logic** (`packages/omo-codex/plugin/components/*`): `hook-unit-probe.sh` for the exact stdout, THEN `app-server-drive.sh --plugin` to prove the live wiring. See [components-hooks.md](references/components-hooks.md). - **Installer / config.toml** (`packages/omo-codex/src/install/*`): `install-verify.sh`. - **Anything that affects a live session** (hooks, agents, MCP wiring): `app-server-drive.sh --plugin`, and `tui-smoke.sh --plugin` if the TUI path matters. ## Capturing evidence ```bash ev=".omo/evidence/$(date +%Y%m%d)-codex-qa-<slug>"; mkdir -p "$ev" bash scripts/app-server-drive.sh --plugin > "$ev/app-server-drive.json" 2>&1 bash scripts/install-verify.sh --self-test > "$ev/install-verify.txt" 2>&1 ``` ## On `/debugging` There is no `/debugging` command in Codex. To observe a run: the app-server notification stream (above), `RUST_LOG=debug` on the app-server's stderr, the logs SQLite under `$CODEX_HOME`, the TUI's `/debug-config`, and the `codex debug …` subcommands. See [logging-debug.md](references/logging-debug.md).
Compare HEAD with the latest published npm versions and list all unpublished changes by release layer. Triggers: unpublished changes, changelog, what changed, whats new.
Read-only GitHub triage for issues AND PRs. 1 item = 1 background task (category: quick). Analyzes all open items and writes evidence-backed reports to /tmp/{datetime}/. Every claim requires a GitHub permalink as proof. NEVER takes any action on GitHub - no comments, no merges, no closes, no labels. Reports only. Triggers: 'triage', 'triage issues', 'triage PRs', 'github triage'.
Adversarial multi-agent planning skill. Self-orchestrates 5 hostile category members (unspecified-low, unspecified-high, deep, ultrabrain, artistry) via team-mode for ruthless cross-critique debate, distills only the defensible insights, then MANDATORILY hands the distilled insight bundle to the `plan` agent for executable plan formalization. Use when planning needs maximum rigor and surfacing of weak assumptions, blind spots, and over-engineering. Triggers: 'hyperplan', 'hpp', '/hyperplan', 'adversarial plan', 'hostile planning', 'cross-critique plan', '하이퍼플랜', '적대적 계획', '교차 비평'.
Easter egg command - about oh-my-opencode. Triggers: omomomo, about, easter egg.
QA opencode itself, per case: verify the CLI/terminal (opencode run, db, serve, export), prove a specific plugin hook/action/event fired via the SSE event stream, smoke-test the TUI under tmux, and investigate sessions in opencode's SQLite DB by id, title/name, or message text. Ships tested helper scripts (each with a --self-test) plus per-domain references. Use whenever someone wants to QA, smoke-test, verify, or debug opencode's CLI, HTTP server, plugin hooks/events, or TUI, or to find/inspect opencode sessions in the database. Triggers: opencode qa, qa opencode, test opencode, verify opencode hook, opencode session db, find opencode session by id/name/text, opencode tui test, opencode server health, opencode event stream.
Nuclear-grade 16-agent pre-publish release gate. Runs /get-unpublished-changes to detect all changes since last npm release, spawns up to 10 ultrabrain agents for deep per-change analysis, invokes /review-work (5 agents) for holistic review, and 1 oracle for overall release synthesis. Use before EVERY npm publish. Triggers: 'pre-publish review', 'review before publish', 'release review', 'pre-release review', 'ready to publish?', 'can I publish?', 'pre-publish', 'safe to publish', 'publishing review', 'pre-publish check'.
Publish oh-my-opencode to npm via GitHub Actions workflow. Argument: <patch|minor|major>. Triggers: publish, release, deploy, npm publish.
Remove unused code from this project with ultrawork mode, LSP-verified safety, atomic commits. Triggers: remove dead code, dead code, cleanup, remove unused.