computer-use-mcp

Name: kanishka089/computer-use-mcp
Author: kanishka089

MCP server that lets Claude operate your REAL desktop — moves the actual mouse/keyboard and reads the real screen, so it works in your own logged-in Chrome and any app.

MCP ServersOfficial Registry0 stars0 forks● PythonMITUpdated today

Install in Claude Code / Claude Desktop

Method: pip / Python · realhands

Claude Code CLI

claude mcp add computer-use-mcp -- python -m realhands

claude_desktop_config.json (Claude Desktop)

{
  "mcpServers": {
    "computer-use-mcp": {
      "command": "python",
      "args": ["-m", "realhands.server"]
    }
  }
}

1. Run the command above in your terminal (Claude Code), or paste the JSON config into claude_desktop_config.json (Claude Desktop).

2. Replace any <placeholder> values with your API keys or paths.

3. Restart Claude. The MCP server and its tools appear automatically.

💡 Install first: pip install realhands

Use cases

Dev Tools Automation Productivity

About

MCP Servers overview

<!-- mcp-name: io.github.kanishka089/realhands -->

# computer-use-mcp ("realhands")

An MCP server that lets **Claude operate your real computer** the way a human does —
moving the **actual mouse**, clicking, typing, and reading the **actual screen**.

Unlike OpenAI Operator, browser-use, or Playwright agents (which spin up a separate,
isolated, logged-out Chrome), this drives the **physical OS cursor and keyboard**. So it
works in **your own Chrome with your own logged-in sessions** — and in every other app —
because it's just a human at the keyboard, as far as any website can tell.

**Status: LIVE and battle-tested.** Registered with Claude Code as the user-scope MCP
**`realhands`** (tool `mcp__realhands__computer`) and ✓Connected since 2026-06-02.
On 2026-06-09 it drove the user's real, logged-in Chrome through a **complete Google
Play Console deployment** (app upload, release notes, submission) end-to-end.

> The server is registered as `realhands` rather than `computer-use` because the name
> "computer-use" is reserved in Claude Code.

## How it works

Claude (Desktop or Code) is the agent loop. You type a task; Claude calls the single
`computer` tool in a see → think → act cycle:

> **See** — `screenshot` returns the real screen (downscaled to ~1280px for grounding accuracy)
> → **Think** — Claude picks the next action + pixel coordinates
> → **Act** — the server moves the real mouse / types on the real keyboard
> → a fresh screenshot comes back automatically after every action, and it repeats.

Two Windows-specific details make clicks land accurately (`src/screen.py`):

- **DPI awareness** — `SetProcessDpiAwareness(2)` is set at import time so screenshot
  pixels == pyautogui cursor coordinates even under display scaling (125% / 150% / …).
- **Stateless coordinate scaling** — screenshots are downscaled (LANCZOS) to at most
  `COMPUTER_USE_MAX_DIM` on the longest side before sending; incoming click coordinates
  are scaled back up to real pixels. The scale factor is a pure function of monitor
  geometry + `MAX_DIM`, so mapping never depends on which screenshot ran last.
  Coordinates are clamped inside the target monitor so a stray click can't fly off-screen.

**Multi-monitor:** every call takes an optional `monitor` index (1 = primary, 2.. =
others, 0 = the whole virtual desktop). `action="monitors"` enumerates the setup.
Origins may be negative for screens left/above the primary — `to_real()` handles the
offset. Use the **same** monitor for a click as for the screenshot you're clicking on.

## Architecture

```
src/realhands/
  server.py   FastMCP server "computer-use"; the single `computer` tool (action enum
              modeled on Anthropic's reference computer_20250124 tool); returns
              status text + a fresh screenshot after every action
  screen.py   DPI awareness, mss capture, downscale, model-space -> real-pixel mapping
  input.py    pyautogui mouse/keyboard execution; xdotool-style key-name translation
              (Return, Page_Down, ctrl+a, super, ...); clipboard-paste fast path for
              long/Unicode/multiline typing (preserves your existing clipboard);
              activate_window via win32 AttachThreadInput
  safety.py   kill switches + lazy arm / stand-down lifecycle
  config.py   .env-driven configuration (all defaults are sensible; .env is optional)
install.py    one-shot installer: venv, deps, .env, Claude Desktop registration
```

Stack: Python 3.10/3.11 · `mcp` (FastMCP, stdio) · `pyautogui` · `mss` · `pillow` ·
`pynput` · `keyboard` · `pyperclip` · `python-dotenv` — plus `pygetwindow` and `pywin32`
for `activate_window`.

## The `computer` tool

A single tool with an `action` parameter:

| Action | What it does |
|---|---|
| `screenshot` | Capture the screen (always start a task with this) |
| `cursor_position` | Report the real mouse position |
| `monitors` | List detected monitors (for multi-screen setups) |
| `mouse_move` | Glide the cursor to `coordinate` |
| `left_click` / `right_click` / `middle_click` / `double_click` / `triple_click` | Click at `coordinate` (or current position) |
| `left_click_drag` | Drag from `text="x1,y1"` to `coordinate=[x2,y2]` |
| `left_mouse_down` / `left_mouse_up` | Press / release the left button |
| `scroll` | Scroll at `coordinate` (`scroll_direction` + `scroll_amount` notches) |
| `type` | Type `text` (clipboard-paste path for long/Unicode/multiline) |
| `key` | Press a key or chord — `"Return"`, `"ctrl+s"`, `"alt+Tab"` |
| `hold_key` | Hold keys for `duration` seconds |
| `activate_window` | Bring an app to the front by title substring (beats Windows' foreground-lock; far more reliable than clicking the taskbar) |
| `wait` | Sleep `duration` seconds, then screenshot |
| `stop` | Stand down: close the STOP overlay + release the panic hotkey (call as the final action) |

Coordinates are in the pixel space of the most recent screenshot; its size is reported
with every capture. After every non-screenshot action the tool waits ~0.4s for the UI
to settle and returns a fresh screenshot.

## Safety — it controls your REAL machine

This is **fully autonomous**: it does not ask before each action. Three independent
kill switches (`src/safety.py`):

1. **Fail-safe corner** — slam the mouse into the **top-left corner** → pyautogui raises
   `FailSafeException` and the action aborts instantly.
2. **Panic hotkey** — **Ctrl+Alt+Q** (configurable) → hard-kills the server process
   (`os._exit(1)`).
3. **STOP overlay** — an always-on-top window (top-right) showing the current action,
   with a big red **■ STOP AGENT** button that also hard-kills the process.

**Lazy arm / stand-down:** the overlay and the global panic hotkey are armed lazily on
the **first action** of a task, not at server startup — idle sessions show nothing and
grab no hotkeys. They stand down again when the agent calls `action="stop"`, or
automatically after `COMPUTER_USE_IDLE_STOP` seconds (default 30) of inactivity. The
lightweight stdio process stays connected so the next task is instant. Everything
re-arms automatically on the next action.

Pacing also helps you stay in control: every action is followed by a configurable pause
(`COMPUTER_USE_PAUSE`) and the cursor glides rather than teleports
(`COMPUTER_USE_MOVE_DURATION`), so you can watch and interrupt.

**Don't leave it unsupervised on anything that can spend money, send messages, or
delete data.**

## Install

Requires **Python 3.10 or 3.11** (3.13+ untested; avoid the 3.14 beta).

### From PyPI

```powershell
pip install realhands
```

This installs the `realhands` console script and the importable `realhands`
package. Run the server with either `realhands` or `python -m realhands.server`.

### From source (with Claude Desktop registration)

```powershell
git clone https://github.com/kanishka089/computer-use-mcp
cd computer-use-mcp
py -3.10 install.py
```

This creates `.venv/`, installs the package + deps (editable), copies `.env.example` to
`.env` if missing, and registers the server in Claude Desktop's config (backing up any
existing config). **Restart Claude Desktop**, then look for the `computer-use` tool.

### Claude Code

Register it as a **user-scope** stdio server named `realhands` (pointing at the Python
that has the package installed):

```powershell
claude mcp add realhands --scope user -- python -m realhands.server
```

The tool then appears as `mcp__realhands__computer` in every project.

## Use

Just ask. For example:

> *Take a screenshot, open Chrome, go to YouTube, and search for "lofi".*

Watch your real cursor move and your logged-in Chrome respond. Real-world proof: it has
autonomously completed a full Google Play Console release flow in the user's own
signed-in Chrome session.

## Configuration (`.env`, optional — defaults are fine)

| Var | Default | Meaning |
|-----|---------|---------|
| `COMPUTER_USE_MAX_DIM` | `1280` | Longest screenshot side sent to Claude (sweet spot for accuracy + token cost) |
| `COMPUTER_USE_MONITOR` | `1` | Default monitor (1 = primary, 2.. = others, 0 = all screens); overridable per call |
| `COMPUTER_USE_IMAGE_FORMAT` | `png` | `png` (crisp text) or `jpeg` (cheaper tokens) |
| `COMPUTER_USE_PAUSE` | `0.15` | Delay after each pyautogui action (interruptibility) |
| `COMPUTER_USE_PANIC_HOTKEY` | `ctrl+alt+q` | Global hard-stop hotkey |
| `COMPUTER_USE_OVERLAY` | `1` | Show the STOP overlay window |
| `COMPUTER_USE_MOVE_DURATION` | `0.4` | Cursor glide time (human-like movement) |
| `COMPUTER_USE_IDLE_STOP` | `30` | Auto stand-down after this many idle seconds (0 = never) |

## Known gotchas

- **MCP connection drops when the agent idles between turns.** The stdio connection to
  `realhands` can silently die while Claude is thinking/waiting between turns. Fix: issue
  a `screenshot` action — it silently reconnects. Importantly, an action that "failed"
  with *Connection closed* **often still executed** on the real machine — take a
  screenshot and check the actual screen state before retrying, or you may double-click /
  double-submit.
- **Click coordinates must match the screenshot's monitor.** If you screenshot
  `monitor=2` and then click without passing `monitor=2`, the click lands on the primary.
- **`activate_window` beats the taskbar.** Windows' foreground-lock makes taskbar clicks
  unreliable (the icon just flashes). `activate_window` uses `AttachThreadInput` +
  z-order toggling + a minimize/restore fallback, so prefer it for app switching.
- **Don't run with Python 3.13/3.14.** Tested on 3.10/3.11 only; the installer warns.
- **Typing long/Unicode text uses the clipboard.** Your clipboard is saved and restored,
  but anything watching the clipboard will see the pasted text momentarily.

## Self-test

```powershell
python -m realhands.screen
```

Captures the screen, prints real vs. sent dimensions and the scale factor, writes
`test_capture.png`, and runs a coordinate round-trip check (center + both corners).

Topics

anthropicautomationclaudecomputer-usedesktop-automationmcpmodel-context-protocolrpa

Frequently asked

What people ask about computer-use-mcp

What is kanishka089/computer-use-mcp?

kanishka089/computer-use-mcp is mcp servers for the Claude AI ecosystem. MCP server that lets Claude operate your REAL desktop — moves the actual mouse/keyboard and reads the real screen, so it works in your own logged-in Chrome and any app. It has 0 GitHub stars and was last updated today.

How do I install computer-use-mcp?

You can install computer-use-mcp by cloning the repository (https://github.com/kanishka089/computer-use-mcp) or following the README instructions on GitHub. ClaudeWave also provides quick install blocks on this page.

Is kanishka089/computer-use-mcp safe to use?

kanishka089/computer-use-mcp has not been audited yet by our security agent. Review the original repository on GitHub before using it in production.

Who maintains kanishka089/computer-use-mcp?

kanishka089/computer-use-mcp is maintained by kanishka089. The last recorded GitHub activity is from today, with 0 open issues.

Are there alternatives to computer-use-mcp?

Yes. On ClaudeWave you can browse similar mcp servers at /categories/mcp, sorted by popularity or recent activity.

1-click deploy

Deploy computer-use-mcp to your cloud

Ship this repo to production in minutes. Each platform spins up its own environment with editable env vars.

Vercel Railway Render

Embeddable badge

Maintain this repo? Add a badge to your README

Drop the badge into your GitHub README to show it's tracked on ClaudeWave. Each badge links back to this page and reflects the live Trust Score.

Markdown (README)

[![Featured on ClaudeWave](https://claudewave.com/api/badge/kanishka089-computer-use-mcp)](https://claudewave.com/repo/kanishka089-computer-use-mcp)

HTML

<a href="https://claudewave.com/repo/kanishka089-computer-use-mcp"><img src="https://claudewave.com/api/badge/kanishka089-computer-use-mcp" alt="Featured on ClaudeWave: kanishka089/computer-use-mcp" width="320" height="64" /></a>

More MCP Servers

computer-use-mcp alternatives

n8n-io

n8n

today

Fair-code workflow automation platform with native AI capabilities. Combine visual building with custom code, self-host or cloud, 400+ integrations.

193k58.6kTypeScript

MCP ServersaiapisInstall

open-webui

yesterday

User-friendly AI Interface (Supports Ollama, OpenAI API, ...)

142.1k20.4kPython

MCP ServersaillmInstall

google-gemini

gemini-cli

today

An open-source AI agent that brings the power of Gemini directly into your terminal.

105.4k14.1kTypeScript

MCP Serversaiai-agentsInstall

netdata

today

The fastest path to AI-powered full stack observability, even for lean teams.

79.3k6.5kC

MCP ServersaialertingInstall

D4Vinci

Scrapling

today

🕷️ An adaptive Web Scraping framework that handles everything from a single request to a full-scale crawl!

64.6k6.3kPython

MCP Serversaiai-scrapingInstall

sansan0

TrendRadar

5d ago

⭐AI-driven public opinion & trend monitor with multi-platform aggregation, RSS, and smart alerts.🎯 告别信息过载，你的 AI 舆情监控助手与热点筛选工具！聚合多平台热点 + RSS 订阅，支持关键词精准筛选。AI 智能筛选新闻 + AI 翻译 + AI 分析简报直推手机，也支持接入 MCP 架构，赋能 AI 自然语言对话分析、情感洞察与趋势预测等。支持 Docker ，数据本地/云端自持。集成微信/飞书/钉钉/Telegram/邮件/ntfy/bark/slack 等渠道智能推送。

59.6k24.7kPython

MCP ServersaibarkInstall