app-control
The app-control skill provides low-level keyboard, mouse, and screenshot tools for controlling a specific named macOS application by bypassing the system Accessibility tree. Use it when working with emulators, games, OpenGL canvases, or custom-rendered applications where standard UI navigation fails; always prefer the computer-use skill for general macOS interface interaction.
git clone --depth 1 https://github.com/vellum-ai/vellum-assistant /tmp/app-control && cp -r /tmp/app-control/assistant/src/config/bundled-skills/app-control ~/.claude/skills/app-control

SKILL.md
This skill exposes the `app_control_*` proxy tools for driving a single named macOS application via raw input (keyboard, mouse, screenshot) that bypasses the system Accessibility tree. Use it only when explicitly directed to a specific app where the AX tree is unhelpful (emulators, games, OpenGL canvases, custom-rendered Electron apps). For general macOS UI navigation prefer the `computer-use` skill.

Tools in this skill are proxy tools: execution is forwarded to the connected macOS client, never handled locally by the assistant.

## Cadence

Take 2-3 actions per turn, then yield with a short narration so the user can interject. Do not chain long sequences without surfacing what you are doing.

## Always observe before acting

Call `app_control_observe` before your first input action whenever the screen state matters (e.g. you need to know what is on screen, where a UI element is, or whether the app is even running). Re-observe after actions that may have moved the window or changed visibility.

`observe` waits a short settle delay (default ~200ms) before capturing so the target app and the WindowServer can flush pending input and composite a fresh frame. If the captured screenshot looks one input behind the latest state (common with emulators or other slow-feedback apps), pass a larger `settle_ms`. For static UIs where you just want a quick snapshot, pass `settle_ms: 0` to skip the wait.

## Input choice

- Prefer `app_control_sequence` over multiple back-to-back `app_control_press` calls when sending an ordered batch of presses (e.g. menu navigation, repeated movement). Sequence runs in a single round-trip: the target app is activated once at the start and the keys are sent serially, without any window for keyboard focus to drift to another app between presses. Each step may carry its own `duration_ms` (hold) and `gap_ms` (pause after).
- Prefer `app_control_combo` over rapid sequential `app_control_press` for simultaneous inputs (e.g. cmd+shift+4). `combo` holds every key at once; sequential presses interleave key-down and key-up events.
- Use `app_control_type` for literal text into a focused field.

## Coordinate caveat

`app_control_click` and `app_control_drag` use **window-relative** coordinates. The window may move or resize between observation and click. If you are uncertain whether the window has shifted, re-observe first.

## App targeting

Use bundle IDs (e.g. `com.example.app`) when possible, as they are the most reliable identifier. Fall back to localized process names if a bundle ID is unavailable.

## Ending the session

Call `app_control_stop` when you are done. Do **not** auto-quit the controlled app: `stop` only ends the app-control session, leaving the app running.
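
The cadence, observe-first, and input-choice guidance above can be sketched as a small planner that builds one turn's worth of tool calls. This is only an illustration under stated assumptions: the tool names and the `settle_ms`, `duration_ms`, and `gap_ms` parameters come from the text, while the payload shape (`tool`, `input`, `app`, `steps`, `key`) and the timing values are hypothetical and may not match the real tool schemas.

```python
from __future__ import annotations

from typing import Any


def sequence_call(
    app: str, keys: list[str], duration_ms: int = 50, gap_ms: int = 100
) -> dict[str, Any]:
    """Batch ordered presses into one app_control_sequence call.

    The "app", "steps", and "key" field names are assumptions made for
    this sketch; duration_ms (hold) and gap_ms (pause after) are the
    per-step options described in the text.
    """
    steps = [
        {"key": key, "duration_ms": duration_ms, "gap_ms": gap_ms}
        for key in keys
    ]
    return {"tool": "app_control_sequence", "input": {"app": app, "steps": steps}}


def plan_turn(app: str, keys: list[str]) -> list[dict[str, Any]]:
    """Plan a single turn: observe, send one batched sequence, re-observe.

    Three calls keeps the turn within the 2-3 action cadence, after which
    the assistant would yield with a short narration.
    """
    return [
        # Larger settle delay, e.g. for a slow-feedback emulator (illustrative value).
        {"tool": "app_control_observe", "input": {"app": app, "settle_ms": 500}},
        sequence_call(app, keys),
        # Re-observe with the default settle delay to confirm the result.
        {"tool": "app_control_observe", "input": {"app": app}},
    ]


if __name__ == "__main__":
    for call in plan_turn("com.example.app", ["down", "down", "enter"]):
        print(call["tool"], call["input"])
```

Batching the three presses into one `app_control_sequence` call reflects the guidance that the target app is activated once and focus cannot drift between presses; a final `app_control_stop` call would follow only once the whole task is finished.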