visual-qa

The visual-qa skill provides automated verification of web and terminal user interfaces by capturing reference and actual renders, computing objective metrics (diff ratio, similarity score, overflow detection, border alignment), and synthesizing a pass/fail verdict through two parallel oracle reviews of design-system integrity and functional fidelity. Use it after building or modifying any UI, especially when verifying against mocks, detecting regressions, or checking CJK text rendering and terminal layout correctness.

Repository: oh-my-openagent

Install in Claude Code

git clone --depth 1 https://github.com/code-yeongyu/oh-my-openagent /tmp/visual-qa && cp -r /tmp/visual-qa/packages/shared-skills/skills/visual-qa ~/.claude/skills/visual-qa

Then start a new Claude Code session; the skill loads automatically.

Definition

SKILL.md

# Visual QA - Dual-Oracle Web and TUI Verification

Verify a rendered UI against intent using objective script evidence plus two parallel read-only oracle passes, then synthesize one good/bad verdict. The script numbers focus the reviewers. They are not the verdict.

## Purpose and when to use

- Use after you build or change any UI, before calling it done. Covers web/page UIs and TUI/terminal UIs.
- Use when output must match a mock, a baseline, or a stated design intent; when you suspect a regression; when CJK (Korean/Japanese/Chinese) text may clip, misalign, or wrap awkwardly; when a claimed design system might actually be a flat image; when a terminal layout may overflow or its borders may break.
- Skip when there is no rendered surface (pure backend or library logic with no visual or terminal output). For broad post-implementation review use review-work; this skill is the visual specialist.

In the commands below, `$SKILL_DIR` is this skill's own directory (the folder containing this SKILL.md). The bundled script lives at `scripts/cli.ts` inside it.

## Step 1 - Detect the surface

- Web/page UI: renders in a browser (HTML/CSS/JS, components, canvas, SVG). Evidence is screenshots.
- TUI/terminal UI: renders as text in a terminal (box-drawing, panes, status lines, REPL/TUI apps). Evidence is terminal captures.

If the change touches both, run both capture tracks and feed both into the passes.

## Step 2 - Capture objective reference evidence

### Coverage - capture every page, not a sample

A surface is rarely one screen. If the UI has multiple pages, slides, routes, tabs, modal states, viewport breakpoints, or scroll positions, enumerate the COMPLETE set first and capture every one. A 40-slide deck means 40 captures, not 5. Never sample a few representative screens and generalize: the defect you miss is always on the page you did not open.

The verdict is per page. One failing page fails the whole surface, so "most pages look fine" is not a PASS. Record the enumerated list (page count and identifiers) so the reviewer in Step 3 can confirm nothing was skipped.
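The per-page rule above can be sketched as a small aggregation step. This is an illustration only, not part of the bundled script: the `PageResult` and `SurfaceSummary` shapes and the `summarizeSurface` name are hypothetical, and treating any verdict other than PASS as failing is an assumption.

```typescript
// Hypothetical types: the skill does not define these shapes.
type PageVerdict = "PASS" | "REVISE" | "FAIL";

interface PageResult {
  id: string; // page identifier, e.g. "slide-07" or "/settings@1280x800"
  verdict: PageVerdict;
}

interface SurfaceSummary {
  pass: boolean;
  missing: string[]; // enumerated pages that have no recorded result
  failing: string[]; // pages whose verdict is anything other than PASS
}

// The surface passes only when every enumerated page has a result
// and every one of those results is PASS.
function summarizeSurface(enumerated: string[], results: PageResult[]): SurfaceSummary {
  const byId = new Map<string, PageVerdict>();
  for (const r of results) {
    byId.set(r.id, r.verdict);
  }
  const missing = enumerated.filter((id) => !byId.has(id));
  const failing = enumerated.filter((id) => byId.has(id) && byId.get(id) !== "PASS");
  return { pass: missing.length === 0 && failing.length === 0, missing, failing };
}
```

Keeping `missing` separate from `failing` lets a reviewer distinguish a page that was never captured from one that was captured and rejected.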

### Evidence must be fresh

Every gate runs on captures produced AFTER the last edit to the rendered source. If any screenshot, PDF, capture, or QA JSON is older than the source file it claims to verify, it is stale and invalid - regenerate it before trusting it. Never report a PASS from an artifact you did not just produce against the current build.
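One way to enforce the freshness rule is to compare modification times before trusting any artifact. The sketch below assumes you have already collected the timestamps (in Node or Bun, `fs.statSync(path).mtimeMs` provides them). The `Timestamped` shape and the `findStaleArtifacts` name are hypothetical, and treating an equal timestamp as stale is a deliberately conservative assumption.

```typescript
// Hypothetical shape: a file path plus its modification time in milliseconds.
interface Timestamped {
  path: string;
  mtimeMs: number; // e.g. from fs.statSync(path).mtimeMs
}

// Returns the artifacts that were NOT produced after the newest source edit.
// An artifact with the same timestamp as the newest source counts as stale.
function findStaleArtifacts(sources: Timestamped[], artifacts: Timestamped[]): string[] {
  // With no sources, Math.max() yields -Infinity, so nothing is reported stale.
  const newestSource = Math.max(...sources.map((s) => s.mtimeMs));
  return artifacts.filter((a) => a.mtimeMs <= newestSource).map((a) => a.path);
}
```

Any path returned here should be regenerated before it is used as evidence.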

### Web

1. Capture a REFERENCE image: the user's mock/target, or a known-good baseline. Save as PNG.
2. Capture the ACTUAL rendered screenshot at the same viewport size using the project's browser tooling (the playwright, agent-browser, or dev-browser skill). Save as PNG. If none is configured or available, install [agent-browser](https://github.com/vercel-labs/agent-browser) (`bun add -g agent-browser && agent-browser install`) and capture with it. See `$SKILL_DIR/references/agent-browser-setup.md` for the full setup, including how to take a fixed-viewport screenshot.
3. Run the diff and keep the JSON:

```
bun "$SKILL_DIR/scripts/cli.ts" image-diff <reference.png> <actual.png>
```

Key fields: `dimensionsMatch`, `diffRatio` (0..1), `similarityScore` (0..100), `alphaChannelIntact`, `hotspots[]` (grid regions ranked by `diffRatio`).
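As an illustration of how these fields might be turned into pointers for the reviewers, the sketch below types the report and lists the highest-diff hotspots. The field names come from the list above; their TypeScript types, any hotspot fields beyond `diffRatio`, and the `reviewerNotes` helper are assumptions. It deliberately applies no pass/fail threshold, since the numbers are reference evidence rather than the verdict.

```typescript
// Assumed shape of the image-diff JSON; only the field names are from the skill text.
interface Hotspot {
  diffRatio: number;
  [key: string]: unknown; // other hotspot fields (e.g. region coordinates) are not documented here
}

interface ImageDiffReport {
  dimensionsMatch: boolean;
  diffRatio: number; // 0..1
  similarityScore: number; // 0..100
  alphaChannelIntact: boolean;
  hotspots: Hotspot[];
}

// Builds short notes that tell a reviewer where to look first.
function reviewerNotes(report: ImageDiffReport, topN: number = 3): string[] {
  const notes: string[] = [];
  if (!report.dimensionsMatch) {
    notes.push("Dimensions differ: re-capture at the same viewport size.");
  }
  if (!report.alphaChannelIntact) {
    notes.push("Alpha channel not intact: check transparency handling.");
  }
  notes.push(
    `Overall diffRatio ${report.diffRatio.toFixed(4)}, similarityScore ${report.similarityScore.toFixed(1)}.`
  );
  // Sort a copy so the original report is left untouched.
  const ranked = [...report.hotspots].sort((a, b) => b.diffRatio - a.diffRatio).slice(0, topN);
  ranked.forEach((h, i) => {
    notes.push(`Hotspot #${i + 1}: diffRatio ${h.diffRatio.toFixed(4)}`);
  });
  return notes;
}
```

The resulting notes can be pasted into the reviewer prompts alongside the raw JSON.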

### TUI

1. Capture plain text and an ANSI-preserving copy:

```
tmux capture-pane -p > capture.txt
tmux capture-pane -e -p > capture-ansi.txt
```

2. When the TUI evidence will be attached to a PR or reviewed visually, render
the capture through the browser helper from the repository root:

```
node script/qa/web-terminal-visual-qa.mjs --title "TUI Visual QA" \
--from-file capture.txt \
--evidence-dir .omo/evidence/<slug>/tui-web-terminal
```

This produces `terminal.png`, `terminal.html`, `terminal.txt`,
`terminal-ansi.txt`, and `metadata.json`. Treat this as the standard TUI visual
artifact pattern for terminal screenshots. If the project is outside this repo,
copy the same pattern: terminal capture -> browser-rendered page -> PNG +
metadata with cleanup receipt.

3. Run the check with the REAL terminal width and keep the JSON:

```
bun "$SKILL_DIR/scripts/cli.ts" tui-check capture.txt --cols <N>
```

Key fields: `maxWidth`, `overflowLines[]`, `borderMisaligned`, `wideCharColumns[]`, `hasAnsi`.

This JSON (diff ratio, similarity score, hotspots or overflow lines, border alignment, wide-char columns, alpha) is REFERENCE evidence to aim the reviewers. It is not the verdict by itself.
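To make the TUI width fields concrete, here is a minimal sketch of how overflow might be measured when CJK characters occupy two terminal columns. It is not the bundled `tui-check` implementation. The wide-character ranges are a simplified subset of the Unicode East Asian Width data, combining characters and ANSI sequences are not handled (so it expects the plain-text capture), and the `OverflowLine` shape is hypothetical.

```typescript
// Simplified wide-character test: common CJK blocks only,
// not the full Unicode East Asian Width table.
function isWide(cp: number): boolean {
  return (
    (cp >= 0x1100 && cp <= 0x115f) || // Hangul Jamo leading consonants
    (cp >= 0x2e80 && cp <= 0xa4cf && cp !== 0x303f) || // CJK radicals, kana, ideographs, Yi
    (cp >= 0xac00 && cp <= 0xd7a3) || // Hangul syllables
    (cp >= 0xf900 && cp <= 0xfaff) || // CJK compatibility ideographs
    (cp >= 0xff01 && cp <= 0xff60) || // fullwidth forms
    (cp >= 0xffe0 && cp <= 0xffe6) // fullwidth signs
  );
}

// Counts terminal columns: 2 for wide characters, 1 for everything else.
function displayWidth(line: string): number {
  let width = 0;
  for (const ch of Array.from(line)) {
    const cp = ch.codePointAt(0) ?? 0;
    width += isWide(cp) ? 2 : 1;
  }
  return width;
}

// Hypothetical element shape; the real overflowLines[] entries may differ.
interface OverflowLine {
  line: number; // 1-based line number in the capture
  width: number; // measured display width in columns
}

// Reports the widest line and every line wider than the real terminal width.
function findOverflow(capture: string, cols: number): { maxWidth: number; overflowLines: OverflowLine[] } {
  let maxWidth = 0;
  const overflowLines: OverflowLine[] = [];
  capture.split("\n").forEach((text, i) => {
    const width = displayWidth(text);
    if (width > maxWidth) maxWidth = width;
    if (width > cols) overflowLines.push({ line: i + 1, width });
  });
  return { maxWidth, overflowLines };
}
```

Counting string length instead of display width would make a line of five Hangul syllables look like 5 columns when it occupies 10, which is why the check needs the real terminal width passed via `--cols`.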

## Step 3 - Dispatch two read-only QA subagents in parallel

This independent review is REQUIRED before any "done" claim. Do not self-review inside the main agent and call the UI verified - a self-graded pass is the failure mode this step exists to stop. Dispatch it yourself, every time, without waiting to be told. Give each reviewer the captures for every enumerated page from Step 2, not a sample, and tell it the page count so it can confirm none were skipped.

Dispatch through your harness's own subagent tool. In OpenCode: `task(subagent_type="oracle", ...)`. In Codex: `multi_agent_v1.spawn_agent({"message": "...", "agent_type": "lazycodex-gate-reviewer", "fork_context": false})` (the code blocks below are written in OpenCode `task(...)` form; translate them to that `spawn_agent` call, putting the full prompt in `message`).

Send BOTH calls in a single message so they run concurrently. Each oracle is read-only: it reviews and reports, it cannot modify files. Each returns PASS, REVISE, or FAIL with concrete, located findings. Pass A proves the surface is a real design-system implementation, not a mock-only or faked-image substitute. Pass B directly opens screenshots and inspects source/content for visual and CJK defects.

Paste evidence directly into each prompt: source code, the plain-text TUI captures, the script JSON, and the screenshot paths plus your described observations for web. The two passes differ in depth by charter, not by any model or effort
