Skip to main content
ClaudeWave
Skill856 estrellas del repoactualizado 11d ago

vc:agent-browser

agent-browser is a context-efficient browser automation CLI designed for AI agents, using a snapshot-and-reference paradigm to reduce token usage by 93% compared to Playwright MCP. Use it for extended autonomous browser sessions, self-verifying workflows, video recording, cloud browser testing via Browserbase, and multi-tab handling where token efficiency and long-running stability matter more than quick one-off interactions.

Instalar en Claude Code
Copiar
git clone --depth 1 https://github.com/withkynam/vibecode-pro-max-kit /tmp/vc-agent-browser && cp -r /tmp/vc-agent-browser/.claude/skills/vc-agent-browser ~/.claude/skills/vc-agent-browser
Después abre una sesión nueva de Claude Code; el skill carga automáticamente.

SKILL.md

# agent-browser Skill

Browser automation CLI designed for AI agents. Uses "snapshot + refs" paradigm for 93% less context than Playwright MCP.

## Quick Start

```bash
# Install globally
npm install -g agent-browser

# Download Chromium (one-time)
agent-browser install

# Linux: include system deps
agent-browser install --with-deps

# Verify
agent-browser --version
```

## Core Workflow

The 4-step pattern for all browser automation:

```bash
# 1. Navigate
agent-browser open https://example.com

# 2. Snapshot (get interactive elements with refs)
agent-browser snapshot -i
# Output: button "Sign In" @e1, textbox "Email" @e2, ...

# 3. Interact using refs
agent-browser fill @e2 "user@example.com"
agent-browser click @e1

# 4. Re-snapshot after page changes
agent-browser snapshot -i
```

## Project-Specific Setup

For project-specific connection patterns, logged-in session reuse through `chrome-debug`, and when to use `agent-browser` vs `chrome-devtools` vs direct non-browser verification, see the project's browser-automation testing notes in the consuming repo, if present. This skill file stays a generic tool reference only and should not redefine the project's broader testing policy.

---

## When to Use (vs chrome-devtools)

| Use agent-browser | Use chrome-devtools |
|-------------------|---------------------|
| Long autonomous AI sessions | Quick one-off screenshots |
| Context-constrained workflows | Custom Puppeteer scripts needed |
| Video recording for debugging | WebSocket full frame debugging |
| Cloud browsers (Browserbase) | Existing workflow integration |
| Multi-tab handling | Need Sharp auto-compression |
| Self-verifying build loops | Session with auth injection |

**Token efficiency:** ~280 chars/snapshot vs 8K+ for Playwright MCP.

## Command Reference

### Navigation
```bash
agent-browser open <url>       # Navigate to URL
agent-browser back             # Go back
agent-browser forward          # Go forward
agent-browser reload           # Reload page
agent-browser close            # Close browser
```

### Analysis (Snapshot)
```bash
agent-browser snapshot         # Full accessibility tree
agent-browser snapshot -i      # Interactive elements only (recommended)
agent-browser snapshot -c      # Compact output
agent-browser snapshot -d 3    # Limit depth
agent-browser snapshot -s "nav" # Scope to CSS selector
```

### Interactions (use @refs from snapshot)
```bash
agent-browser click @e1        # Click element
agent-browser dblclick @e1     # Double-click
agent-browser fill @e2 "text"  # Clear and fill input
agent-browser type @e2 "text"  # Type without clearing
agent-browser press Enter      # Press key
agent-browser hover @e1        # Hover over element
agent-browser check @e3        # Check checkbox
agent-browser uncheck @e3      # Uncheck checkbox
agent-browser select @e4 "opt" # Select dropdown option
agent-browser scroll @e1       # Scroll element into view
agent-browser scroll down 500  # Scroll page by pixels
agent-browser drag @e1 @e2     # Drag from e1 to e2
agent-browser upload @e5 file.pdf  # Upload file
```

### Information Retrieval
```bash
agent-browser get text @e1     # Get text content
agent-browser get html @e1     # Get HTML
agent-browser get value @e2    # Get input value
agent-browser get attr @e1 href  # Get attribute
agent-browser get title        # Page title
agent-browser get url          # Current URL
agent-browser get count "li"   # Count elements
agent-browser get box @e1      # Bounding box
```

### State Checks
```bash
agent-browser is visible @e1   # Check visibility
agent-browser is enabled @e1   # Check if enabled
agent-browser is checked @e3   # Check if checked
```

### Media
```bash
agent-browser screenshot           # Capture viewport
agent-browser screenshot --full    # Full page
agent-browser screenshot -o ss.png # Save to file
agent-browser pdf -o page.pdf      # Export PDF
agent-browser record start         # Start video recording
agent-browser record stop          # Stop and save video
agent-browser record restart       # Restart recording
```

### Wait Conditions
```bash
agent-browser wait @e1                    # Wait for element
agent-browser wait --text "Success"       # Wait for text to appear
agent-browser wait --url "/dashboard"     # Wait for URL pattern
agent-browser wait --load                 # Wait for page load
agent-browser wait --idle                 # Wait for network idle
agent-browser wait --fn "() => window.ready"  # Wait for JS condition
```

### Browser Configuration
```bash
agent-browser viewport 1920 1080   # Set viewport size
agent-browser device "iPhone 14"   # Emulate device
agent-browser geolocation 40.7 -74.0  # Set geolocation
agent-browser offline true         # Enable offline mode
agent-browser headers '{"X-Custom":"val"}'  # Set headers
agent-browser credentials user pass  # HTTP auth
agent-browser color-scheme dark    # Set color scheme
```

### Storage Management
```bash
agent-browser cookies              # List cookies
agent-browser cookies set name=val # Set cookie
agent-browser cookies clear        # Clear cookies
agent-browser storage local        # Get localStorage
agent-browser storage session      # Get sessionStorage
agent-browser state save auth.json # Save browser state
agent-browser state load auth.json # Load browser state
```

### Network Control
```bash
agent-browser network route "**/*.jpg" --abort    # Block requests
agent-browser network route "**/api/*" --body '{"data":[]}'  # Mock response
agent-browser network unroute "**/*.jpg"          # Remove specific route
agent-browser network requests                    # List intercepted requests
```

### Semantic Finding
```bash
agent-browser find role button           # Find by ARIA role
agent-browser find text "Submit"         # Find by text content
agent-browser find label "Email"         # Find by label
agent-browser find placeholder "Search"  # Find by placeholder
agent-browser find testid "login-btn"    # Find by data-testid
agen
code-reviewerSubagent

Comprehensive code review with scout-based edge case detection. Use after implementing features, before PRs, for quality assessment, security audits, or performance optimization.

code-simplifierSubagent

Simplifies and refines code for clarity, consistency, and maintainability while preserving all functionality. Focuses on recently modified code unless instructed otherwise.

debuggerSubagent

Use this agent when you need to investigate issues, analyze system behavior, diagnose performance problems, examine database structures, collect and analyze logs from servers or CI/CD pipelines, run tests for debugging purposes, or optimize system performance. This includes troubleshooting errors, identifying bottlenecks, analyzing failed deployments, investigating test failures, and creating diagnostic reports. Examples:\n\n<example>\nContext: The user needs to investigate why an API endpoint is returning 500 errors.\nuser: "The /api/users endpoint is throwing 500 errors"\nassistant: "I''ll use the debugger agent to investigate this issue"\n<commentary>\nSince this involves investigating an issue, use the Task tool to launch the debugger agent.\n</commentary>\n</example>\n\n<example>\nContext: The user wants to analyze why the CI/CD pipeline is failing.\nuser: "The GitHub Actions workflow keeps failing on the test step"\nassistant: "Let me use the debugger agent to analyze the CI/CD pipeline logs and identify the issue"\n<commentary>\nThis requires analyzing CI/CD logs and test failures, so use the debugger agent.\n</commentary>\n</example>\n\n<example>\nContext: The user notices performance degradation in the application.\nuser: "The application response times have increased by 300% since yesterday"\nassistant: "I''ll launch the debugger agent to analyze system behavior and identify performance bottlenecks"\n<commentary>\nPerformance analysis and bottleneck identification requires the debugger agent.\n</commentary>\n</example>

execute-agentSubagent

EXECUTE MODE - Implementing EXACTLY what was planned. Full tool access. Can only be invoked after explicit user confirmation. Use after plan is approved.

fast-mode-agentSubagent

FAST MODE - Execute compressed RIPER-5 workflow (RESEARCH + INNOVATE + PLAN) in one session, then pause for EXECUTE confirmation. Use when you want quick end-to-end solution.

git-managerSubagent

Stage, commit, and push code changes with conventional commits. Use when user says "commit", "push", or finishes a feature/fix.

innovate-agentSubagent

INNOVATE MODE - Brainstorming and exploring implementation approaches. Discusses possibilities without making decisions. Use after research is complete.

plan-agentSubagent

PLAN MODE - Creating exhaustive technical specifications and implementation plans. Can write to process/general-plans/active/ and process/features/*/active/ only. Use after approach is decided.