computer-use-mcp

Name: kanishka089/computer-use-mcp
Author: kanishka089

MCP server that lets Claude operate your REAL desktop — moves the actual mouse/keyboard and reads the real screen, so it works in your own logged-in Chrome and any app.

MCP ServersRegistry oficial0 estrellas0 forks● PythonMITActualizado today

Install in Claude Code / Claude Desktop

Method: pip / Python · realhands

Claude Code CLI

claude mcp add computer-use-mcp -- python -m realhands

claude_desktop_config.json (Claude Desktop)

{
  "mcpServers": {
    "computer-use-mcp": {
      "command": "python",
      "args": ["-m", "realhands.server"]
    }
  }
}

1. Run the command above in your terminal (Claude Code), or paste the JSON config into claude_desktop_config.json (Claude Desktop).

2. Replace any <placeholder> values with your API keys or paths.

3. Restart Claude. The MCP server and its tools appear automatically.

💡 Install first: pip install realhands

Casos de uso

Dev Tools Automation Productivity

Sobre el repo

Resumen de MCP Servers

<!-- mcp-name: io.github.kanishka089/realhands -->

# computer-use-mcp ("realhands")

An MCP server that lets **Claude operate your real computer** the way a human does —
moving the **actual mouse**, clicking, typing, and reading the **actual screen**.

Unlike OpenAI Operator, browser-use, or Playwright agents (which spin up a separate,
isolated, logged-out Chrome), this drives the **physical OS cursor and keyboard**. So it
works in **your own Chrome with your own logged-in sessions** — and in every other app —
because it's just a human at the keyboard, as far as any website can tell.

**Status: LIVE and battle-tested.** Registered with Claude Code as the user-scope MCP
**`realhands`** (tool `mcp__realhands__computer`) and ✓Connected since 2026-06-02.
On 2026-06-09 it drove the user's real, logged-in Chrome through a **complete Google
Play Console deployment** (app upload, release notes, submission) end-to-end.

> The server is registered as `realhands` rather than `computer-use` because the name
> "computer-use" is reserved in Claude Code.

## How it works

Claude (Desktop or Code) is the agent loop. You type a task; Claude calls the single
`computer` tool in a see → think → act cycle:

> **See** — `screenshot` returns the real screen (downscaled to ~1280px for grounding accuracy)
> → **Think** — Claude picks the next action + pixel coordinates
> → **Act** — the server moves the real mouse / types on the real keyboard
> → a fresh screenshot comes back automatically after every action, and it repeats.

Two Windows-specific details make clicks land accurately (`src/screen.py`):

- **DPI awareness** — `SetProcessDpiAwareness(2)` is set at import time so screenshot
  pixels == pyautogui cursor coordinates even under display scaling (125% / 150% / …).
- **Stateless coordinate scaling** — screenshots are downscaled (LANCZOS) to at most
  `COMPUTER_USE_MAX_DIM` on the longest side before sending; incoming click coordinates
  are scaled back up to real pixels. The scale factor is a pure function of monitor
  geometry + `MAX_DIM`, so mapping never depends on which screenshot ran last.
  Coordinates are clamped inside the target monitor so a stray click can't fly off-screen.

**Multi-monitor:** every call takes an optional `monitor` index (1 = primary, 2.. =
others, 0 = the whole virtual desktop). `action="monitors"` enumerates the setup.
Origins may be negative for screens left/above the primary — `to_real()` handles the
offset. Use the **same** monitor for a click as for the screenshot you're clicking on.

## Architecture

```
src/realhands/
  server.py   FastMCP server "computer-use"; the single `computer` tool (action enum
              modeled on Anthropic's reference computer_20250124 tool); returns
              status text + a fresh screenshot after every action
  screen.py   DPI awareness, mss capture, downscale, model-space -> real-pixel mapping
  input.py    pyautogui mouse/keyboard execution; xdotool-style key-name translation
              (Return, Page_Down, ctrl+a, super, ...); clipboard-paste fast path for
              long/Unicode/multiline typing (preserves your existing clipboard);
              activate_window via win32 AttachThreadInput
  safety.py   kill switches + lazy arm / stand-down lifecycle
  config.py   .env-driven configuration (all defaults are sensible; .env is optional)
install.py    one-shot installer: venv, deps, .env, Claude Desktop registration
```

Stack: Python 3.10/3.11 · `mcp` (FastMCP, stdio) · `pyautogui` · `mss` · `pillow` ·
`pynput` · `keyboard` · `pyperclip` · `python-dotenv` — plus `pygetwindow` and `pywin32`
for `activate_window`.

## The `computer` tool

A single tool with an `action` parameter:

| Action | What it does |
|---|---|
| `screenshot` | Capture the screen (always start a task with this) |
| `cursor_position` | Report the real mouse position |
| `monitors` | List detected monitors (for multi-screen setups) |
| `mouse_move` | Glide the cursor to `coordinate` |
| `left_click` / `right_click` / `middle_click` / `double_click` / `triple_click` | Click at `coordinate` (or current position) |
| `left_click_drag` | Drag from `text="x1,y1"` to `coordinate=[x2,y2]` |
| `left_mouse_down` / `left_mouse_up` | Press / release the left button |
| `scroll` | Scroll at `coordinate` (`scroll_direction` + `scroll_amount` notches) |
| `type` | Type `text` (clipboard-paste path for long/Unicode/multiline) |
| `key` | Press a key or chord — `"Return"`, `"ctrl+s"`, `"alt+Tab"` |
| `hold_key` | Hold keys for `duration` seconds |
| `activate_window` | Bring an app to the front by title substring (beats Windows' foreground-lock; far more reliable than clicking the taskbar) |
| `wait` | Sleep `duration` seconds, then screenshot |
| `stop` | Stand down: close the STOP overlay + release the panic hotkey (call as the final action) |

Coordinates are in the pixel space of the most recent screenshot; its size is reported
with every capture. After every non-screenshot action the tool waits ~0.4s for the UI
to settle and returns a fresh screenshot.

## Safety — it controls your REAL machine

This is **fully autonomous**: it does not ask before each action. Three independent
kill switches (`src/safety.py`):

1. **Fail-safe corner** — slam the mouse into the **top-left corner** → pyautogui raises
   `FailSafeException` and the action aborts instantly.
2. **Panic hotkey** — **Ctrl+Alt+Q** (configurable) → hard-kills the server process
   (`os._exit(1)`).
3. **STOP overlay** — an always-on-top window (top-right) showing the current action,
   with a big red **■ STOP AGENT** button that also hard-kills the process.

**Lazy arm / stand-down:** the overlay and the global panic hotkey are armed lazily on
the **first action** of a task, not at server startup — idle sessions show nothing and
grab no hotkeys. They stand down again when the agent calls `action="stop"`, or
automatically after `COMPUTER_USE_IDLE_STOP` seconds (default 30) of inactivity. The
lightweight stdio process stays connected so the next task is instant. Everything
re-arms automatically on the next action.

Pacing also helps you stay in control: every action is followed by a configurable pause
(`COMPUTER_USE_PAUSE`) and the cursor glides rather than teleports
(`COMPUTER_USE_MOVE_DURATION`), so you can watch and interrupt.

**Don't leave it unsupervised on anything that can spend money, send messages, or
delete data.**

## Install

Requires **Python 3.10 or 3.11** (3.13+ untested; avoid the 3.14 beta).

### From PyPI

```powershell
pip install realhands
```

This installs the `realhands` console script and the importable `realhands`
package. Run the server with either `realhands` or `python -m realhands.server`.

### From source (with Claude Desktop registration)

```powershell
git clone https://github.com/kanishka089/computer-use-mcp
cd computer-use-mcp
py -3.10 install.py
```

This creates `.venv/`, installs the package + deps (editable), copies `.env.example` to
`.env` if missing, and registers the server in Claude Desktop's config (backing up any
existing config). **Restart Claude Desktop**, then look for the `computer-use` tool.

### Claude Code

Register it as a **user-scope** stdio server named `realhands` (pointing at the Python
that has the package installed):

```powershell
claude mcp add realhands --scope user -- python -m realhands.server
```

The tool then appears as `mcp__realhands__computer` in every project.

## Use

Just ask. For example:

> *Take a screenshot, open Chrome, go to YouTube, and search for "lofi".*

Watch your real cursor move and your logged-in Chrome respond. Real-world proof: it has
autonomously completed a full Google Play Console release flow in the user's own
signed-in Chrome session.

## Configuration (`.env`, optional — defaults are fine)

| Var | Default | Meaning |
|-----|---------|---------|
| `COMPUTER_USE_MAX_DIM` | `1280` | Longest screenshot side sent to Claude (sweet spot for accuracy + token cost) |
| `COMPUTER_USE_MONITOR` | `1` | Default monitor (1 = primary, 2.. = others, 0 = all screens); overridable per call |
| `COMPUTER_USE_IMAGE_FORMAT` | `png` | `png` (crisp text) or `jpeg` (cheaper tokens) |
| `COMPUTER_USE_PAUSE` | `0.15` | Delay after each pyautogui action (interruptibility) |
| `COMPUTER_USE_PANIC_HOTKEY` | `ctrl+alt+q` | Global hard-stop hotkey |
| `COMPUTER_USE_OVERLAY` | `1` | Show the STOP overlay window |
| `COMPUTER_USE_MOVE_DURATION` | `0.4` | Cursor glide time (human-like movement) |
| `COMPUTER_USE_IDLE_STOP` | `30` | Auto stand-down after this many idle seconds (0 = never) |

## Known gotchas

- **MCP connection drops when the agent idles between turns.** The stdio connection to
  `realhands` can silently die while Claude is thinking/waiting between turns. Fix: issue
  a `screenshot` action — it silently reconnects. Importantly, an action that "failed"
  with *Connection closed* **often still executed** on the real machine — take a
  screenshot and check the actual screen state before retrying, or you may double-click /
  double-submit.
- **Click coordinates must match the screenshot's monitor.** If you screenshot
  `monitor=2` and then click without passing `monitor=2`, the click lands on the primary.
- **`activate_window` beats the taskbar.** Windows' foreground-lock makes taskbar clicks
  unreliable (the icon just flashes). `activate_window` uses `AttachThreadInput` +
  z-order toggling + a minimize/restore fallback, so prefer it for app switching.
- **Don't run with Python 3.13/3.14.** Tested on 3.10/3.11 only; the installer warns.
- **Typing long/Unicode text uses the clipboard.** Your clipboard is saved and restored,
  but anything watching the clipboard will see the pasted text momentarily.

## Self-test

```powershell
python -m realhands.screen
```

Captures the screen, prints real vs. sent dimensions and the scale factor, writes
`test_capture.png`, and runs a coordinate round-trip check (center + both corners).

Topics

anthropicautomationclaudecomputer-usedesktop-automationmcpmodel-context-protocolrpa

Preguntas frecuentes

Lo que la gente pregunta sobre computer-use-mcp

¿Qué es kanishka089/computer-use-mcp?

kanishka089/computer-use-mcp es mcp servers para el ecosistema de Claude AI. MCP server that lets Claude operate your REAL desktop — moves the actual mouse/keyboard and reads the real screen, so it works in your own logged-in Chrome and any app. Tiene 0 estrellas en GitHub y se actualizó por última vez today.

¿Cómo se instala computer-use-mcp?

Puedes instalar computer-use-mcp clonando el repositorio (https://github.com/kanishka089/computer-use-mcp) o siguiendo las instrucciones del README en GitHub. ClaudeWave también te ofrece bloques de instalación rápida en esta misma página.

¿Es seguro usar kanishka089/computer-use-mcp?

kanishka089/computer-use-mcp aún no ha sido auditado por nuestro agente de seguridad. Revisa el repositorio original en GitHub antes de usarlo en producción.

¿Quién mantiene kanishka089/computer-use-mcp?

kanishka089/computer-use-mcp es mantenido por kanishka089. La última actividad registrada en GitHub es de today, con 0 issues abiertos.

¿Hay alternativas a computer-use-mcp?

Sí. En ClaudeWave puedes explorar mcp servers similares en /categories/mcp, ordenados por popularidad o actividad reciente.

Deploy en 1 click

Despliega computer-use-mcp en tu cloud

Lleva este repo a producción en minutos. Cada plataforma genera su propio entorno con variables de entorno editables.

Vercel Railway Render

Badge embebible

¿Mantienes este repo? Añade un badge a tu README

Pega el badge en tu README de GitHub para mostrar que está auditado por ClaudeWave. Cada badge enlaza de vuelta a esta página y muestra el Trust Score actual.

Markdown (README)

[![Featured on ClaudeWave](https://claudewave.com/api/badge/kanishka089-computer-use-mcp)](https://claudewave.com/repo/kanishka089-computer-use-mcp)

HTML

<a href="https://claudewave.com/repo/kanishka089-computer-use-mcp"><img src="https://claudewave.com/api/badge/kanishka089-computer-use-mcp" alt="Featured on ClaudeWave: kanishka089/computer-use-mcp" width="320" height="64" /></a>

Relacionados

Más MCP Servers

Alternativas a computer-use-mcp

n8n-io

n8n

today

Fair-code workflow automation platform with native AI capabilities. Combine visual building with custom code, self-host or cloud, 400+ integrations.

193k58.6kTypeScript

MCP ServersaiapisInstall

open-webui

yesterday

User-friendly AI Interface (Supports Ollama, OpenAI API, ...)

142.1k20.4kPython

MCP ServersaillmInstall

google-gemini

gemini-cli

today

An open-source AI agent that brings the power of Gemini directly into your terminal.

105.4k14.1kTypeScript

MCP Serversaiai-agentsInstall

netdata

today

The fastest path to AI-powered full stack observability, even for lean teams.

79.3k6.5kC

MCP ServersaialertingInstall

D4Vinci

Scrapling

today

🕷️ An adaptive Web Scraping framework that handles everything from a single request to a full-scale crawl!

64.6k6.3kPython

MCP Serversaiai-scrapingInstall

sansan0

TrendRadar

5d ago

⭐AI-driven public opinion & trend monitor with multi-platform aggregation, RSS, and smart alerts.🎯 告别信息过载，你的 AI 舆情监控助手与热点筛选工具！聚合多平台热点 + RSS 订阅，支持关键词精准筛选。AI 智能筛选新闻 + AI 翻译 + AI 分析简报直推手机，也支持接入 MCP 架构，赋能 AI 自然语言对话分析、情感洞察与趋势预测等。支持 Docker ，数据本地/云端自持。集成微信/飞书/钉钉/Telegram/邮件/ntfy/bark/slack 等渠道智能推送。

59.6k24.7kPython

MCP ServersaibarkInstall