ha-mac-control
ha-mac-control provides a standardized workflow for automating macOS desktop interactions through the Hope Agent app: windows, applications, Dock, Spaces, menus, dialogs, and clipboard operations. Load this skill when the user requests macOS automation, app launching, UI element clicking, window management, or visual element location, or whenever you use the `mac_control` command. The command needs a fresh state observation before each action because macOS UI state is volatile: apps steal focus and accessibility identifiers expire.
git clone --depth 1 https://github.com/shiwenwen/hope-agent /tmp/ha-mac-control && cp -r /tmp/ha-mac-control/skills/ha-mac-control ~/.claude/skills/ha-mac-control
SKILL.md
# Hope Agent Mac Control

`mac_control` operates the user's macOS desktop from the authorized Hope Agent app process. macOS UI state is volatile: apps steal focus, AX IDs expire, sheets attach to windows, and multiple windows often share similar titles. Use a fresh observation before every meaningful action.

## Standard Loop

Use this loop unless the user explicitly asks for a single read-only query:

```
1. mac_control(action="status")
2. mac_control(action="apps", op="frontmost" | "search" | "installed")
3. observe: snapshot / visual.observe / elements.find / windows.list / dock.list / spaces.list / menu.list / menu.popover / dialog.inspect
4. act: apps.activate/launch, dock.launch, spaces.switch, windows.*, act.*, menu.click, dialog.*
5. verify: wait, snapshot, windows.list, or dialog.inspect
```

For a concrete app workflow:

```
apps.launch bundleId=...
apps.frontmost                             # verify focus if the next step depends on menus/input
snapshot, elements.find, or windows.list   # get fresh window/element ids
act/menu/windows/clipboard/dialog          # one action burst
wait or snapshot                           # verify the expected change
```

## Targeting Rules

- Prefer `bundleId` over `appName` for mutations. Use `apps.search` / `apps.installed` when the app name is uncertain, then retry with `bundleId`.
- `appNameMatch` defaults to `exact`. Use `contains` only for read-only discovery or when the user clearly gave a partial name.
- Prefer `windowId` from the latest `windows.list` or `snapshot` for window mutations.
- `target.windowTitleMatch` defaults to `exact`. Use `contains` only after listing windows and confirming a partial title is intentional.
- Prefer `elementId` from the latest `snapshot` / `visual.observe` / `elements.find` for precise clicks and set-value actions, and pass the matching `target.snapshotId` with it. `snapshotId + elementId` lets the runtime verify the original AX fingerprint and re-resolve stale `el_N` ids instead of blindly trusting a new traversal.
- Use `elements.find` when a full snapshot is too noisy or when an action target is ambiguous. It is read-only and returns scored candidates with reasons; retry mutations with `target.elementId` from the chosen candidate plus the result `snapshotId`.
- If two windows, dialogs, text fields, or buttons match, do not guess. Use a more specific target or ask the user.
- Element mutations reject equally ranked AX candidates instead of choosing the first match. When this happens, take a fresh `snapshot` and retry with `elementId`, `target.windowTitle`, `target.role`, or more specific `target.text`.

## Actions

### Apps

- Use `apps.frontmost` to learn which app will receive menu and keyboard actions.
- Use `apps.activate bundleId=...` before operating an app that is not frontmost.
- Use `apps.search` or `apps.installed` when launch/activate by name fails.
- `apps.quit` is destructive. Verify the target app and prefer `bundleId`.

### Dock and Spaces

- Use `dock.list` before `dock.launch`; prefer `dockItemId` or `bundleId` over a loose app name.
- Use `dock.menu` to open a Dock item's context menu and inspect `menuItems`; use `dock.select_menu` with `menuItem` when possible, or `menuIndex` only when titles are unavailable. If both are present, `menuItem` is treated as the intended target.
- `dock.hide` and `dock.show` change the user's Dock autohide setting and restart Dock, so be explicit before approval.
- Use `spaces.list` before `spaces.switch` when targeting a numbered Space. `spaceIndex` is 1-based.
- `spaces.switch` accepts `direction="left"|"right"`, `spaceIndex`, or `spaceId`; pass exactly one selector. Direction and adjacent targets use Mission Control Control+Left/Right first; non-adjacent exact targets may fall back to Control+number or SkyLight/CGS. Verify with `spaces.list` or a fresh screenshot after switching.
- `spaces.move_window` moves one explicit window to `spaceIndex` / `spaceId` through SkyLight/CGS.
Resolve the window first with `windows.list windowScope="all"` and prefer `windowId`; if post-move verification warns, use `spaces.list` or a fresh screenshot to confirm.

### Windows

- Use `windows.list` before `windows.close`, `move`, `resize`, or `minimize` unless the user supplied an exact `windowId`.
- `windowScope` defaults to `frontmost`. Use `windows.list windowScope="all"` to discover background app windows before activating or focusing them.
- Prefer all-scope ids like `win_<pid>_<index>` for cross-app window mutations; they are safer than generic titles.
- For `windows.close`, avoid generic titles like `Untitled` / `未命名` when multiple similar windows exist. Use `windowId`.
- Hope Agent's own window cannot be mutated through the Accessibility worker; if the target is Hope Agent itself, explain the limitation.

### Screenshots

- Use `snapshot includeScreenshot=true` when visual context matters.
- Default screenshots capture the primary display. Use `displayId` from `snapshot.displays` when the user points at a specific monitor.
- For a focused-window image, use `snapshot includeScreenshot=true screenshotTarget="window"`. Pass `windowId` from the latest snapshot/list when several windows are possible.
- Window screenshot matching uses the current AX window state; if it fails, take a fresh snapshot and retry with a precise `windowId`.

### Elements and Text

- Use `elements.find op="find"` before clicking or typing into ambiguous UI. Useful examples: `target.role="AXButton"`, `target.text="Save"`, `target.windowTitle="Untitled"`.
- `elements.find` returns `totalMatches` plus candidate `score`, `reasons`, `element`, and `window`. Prefer high-score candidates whose reasons include the user's intended text/role/window.
- Browser/WebView snapshots may focus the dominant `AXWebArea` and re-traverse when no text input is exposed.
If a result warning mentions this fallback, use the refreshed candidates first; if it still exposes only web/canvas content, switch to `visual.observe annotate=true`, OCR, o
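
The "do not guess" rule for `elements.find` results can be sketched as a small helper. This is only an illustration, not Hope Agent's implementation: `score` and `snapshotId` come from the text above, while the `candidates` list key and the `element.elementId` path are assumed field names for the result payload, and `build_target` is a hypothetical helper.

```python
from typing import Any, Optional


def build_target(find_result: dict[str, Any]) -> Optional[dict[str, str]]:
    """Turn an elements.find result into a mutation target, or None if ambiguous.

    Assumed (hypothetical) result shape:
      {"snapshotId": "...",
       "candidates": [{"score": 0.9, "element": {"elementId": "el_3"}}, ...]}
    """
    candidates = sorted(
        find_result.get("candidates", []),
        key=lambda c: c["score"],
        reverse=True,
    )
    if not candidates:
        # Nothing matched: observe again with a different query.
        return None
    if len(candidates) > 1 and candidates[0]["score"] == candidates[1]["score"]:
        # Tie at the top: do not guess; narrow the target or ask the user.
        return None
    # Pair elementId with the snapshotId it came from, as the targeting rules ask.
    return {
        "snapshotId": find_result["snapshotId"],
        "elementId": candidates[0]["element"]["elementId"],
    }
```

A caller would pass the returned dictionary as `target` in the follow-up mutation and, on `None`, go back to the observe step with a more specific `target.role`, `target.text`, or `target.windowTitle`.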