ClaudeWave
Skill · 145 repo stars · updated yesterday

Agent Browser Automation

Fast Rust-based headless browser automation CLI with Node.js fallback for AI agents, featuring navigation, clicking, typing, snapshots, and structured commands optimized for agent workflows.

Install in Claude Code
git clone --depth 1 https://github.com/PramodDutta/qaskills /tmp/agent-browser-automation && cp -r /tmp/agent-browser-automation/seed-skills/agent-browser ~/.claude/skills/agent-browser-automation
Then open a new Claude Code session; the skill loads automatically.

SKILL.md

# Agent Browser Automation Skill

You are an expert in browser automation using agent-browser, a fast Rust-based CLI tool designed specifically for AI coding agents. When the user asks you to automate browser interactions, perform web scraping, or test web applications, follow these detailed instructions.

## Core Principles

1. **Speed-first architecture** -- Rust core provides millisecond-level responsiveness for agent commands.
2. **Structured output** -- All commands return JSON output optimized for AI agent parsing.
3. **Fallback resilience** -- Automatically falls back to Node.js implementation when Rust binary is unavailable.
4. **Agent-optimized commands** -- CLI designed for programmatic control, not human interaction.
5. **Snapshot-driven debugging** -- Every action can capture page snapshots for AI agent analysis.

## Installation

```bash
# Install globally via npm
npm install -g agent-browser

# Or use npx without installation
npx agent-browser --version

# Verify installation
agent-browser --help
```
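Since the CLI can run either from a global install or through `npx`, a script may want to pick one at startup. The following is a minimal sketch, not part of the tool itself: `resolve_cli` is a hypothetical helper name, and the only commands it relies on are the two invocation styles shown above.

```bash
# Print the command prefix to use for a CLI: the bare name when it is on
# PATH, otherwise "npx <name>". resolve_cli is a hypothetical helper.
resolve_cli() {
  if command -v "$1" >/dev/null 2>&1; then
    printf '%s\n' "$1"
  else
    printf 'npx %s\n' "$1"
  fi
}
```

A script could store the result with `AB=$(resolve_cli agent-browser)` and then call `$AB --help`, leaving `$AB` unquoted so that `npx agent-browser` splits into two words.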

## Project Structure for Agent-Driven Testing

```
tests/
  browser-automation/
    scenarios/
      login-flow.json
      checkout-flow.json
      search-navigation.json
    snapshots/
      login-success.png
      cart-state.png
    scripts/
      run-scenario.sh
      batch-test.sh
  config/
    browser-config.json
    viewport-sizes.json
```

## Core Commands

### Navigation Commands

```bash
# Navigate to URL
agent-browser navigate --url "https://example.com"

# Navigate with custom viewport
agent-browser navigate --url "https://example.com" --viewport "1920x1080"

# Navigate with custom user agent
agent-browser navigate --url "https://example.com" \
  --user-agent "Mozilla/5.0 (iPhone; CPU iPhone OS 14_0 like Mac OS X)"

# Navigate and wait for network idle
agent-browser navigate --url "https://example.com" --wait-until "networkidle"
```

**Output:**
```json
{
  "success": true,
  "url": "https://example.com",
  "title": "Example Domain",
  "loadTime": 234,
  "timestamp": "2024-02-12T10:30:00Z"
}
```
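An agent script usually needs to branch on the `success` field and read a value such as `title` from this output. The sketch below does that with plain `grep` and `sed`, assuming the output is shaped like the sample above (flat string fields, no escaped quotes). Both function names are hypothetical, and a real JSON parser such as `jq` would be more robust where it is available.

```bash
# Return 0 when the JSON on stdin contains "success": true.
# Text matching only; assumes the flat shape shown in the sample output.
ab_succeeded() {
  grep -Eq '"success"[[:space:]]*:[[:space:]]*true'
}

# Print the value of a string field read from stdin, e.g. ab_string_field title.
ab_string_field() {
  sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}
```

For example, the output of a `navigate` call could be captured in a variable and piped through `ab_succeeded` before the next step runs.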

### Clicking Elements

```bash
# Click by selector
agent-browser click --selector "button#submit"

# Click by text content
agent-browser click --text "Sign In"

# Click with coordinate offset
agent-browser click --selector ".dropdown" --offset "10,20"

# Wait for element before clicking
agent-browser click --selector "button.load-more" --wait 5000
```

**Output:**
```json
{
  "success": true,
  "element": "button#submit",
  "action": "click",
  "timestamp": "2024-02-12T10:30:01Z"
}
```
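When a selector is brittle, the two click styles above can be combined: try the selector first and fall back to the visible text. This is a sketch under two assumptions that the text above does not confirm: that the CLI exits with a non-zero status when a click fails, and that `AGENT_BROWSER` (a hypothetical environment variable introduced here) may be used to swap the binary out.

```bash
# click_with_fallback SELECTOR TEXT
# Try the CSS selector first, then fall back to clicking by visible text.
# Assumes a failed click exits non-zero; AGENT_BROWSER is a hypothetical override.
click_with_fallback() {
  "${AGENT_BROWSER:-agent-browser}" click --selector "$1" --wait 5000 ||
    "${AGENT_BROWSER:-agent-browser}" click --text "$2"
}
```

A call such as `click_with_fallback "button#submit" "Sign In"` only attempts the text click when the selector click fails.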

### Typing Input

```bash
# Type into input field
agent-browser type --selector "input#email" --text "user@example.com"

# Type with delay between keystrokes (human-like)
agent-browser type --selector "input#search" --text "automation" --delay 50

# Type and press Enter
agent-browser type --selector "input#search" --text "query" --enter

# Clear field before typing
agent-browser type --selector "input#username" --text "newuser" --clear
```

**Output:**
```json
{
  "success": true,
  "element": "input#email",
  "action": "type",
  "length": 16,
  "timestamp": "2024-02-12T10:30:02Z"
}
```
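Forms usually have several fields, so the `type` command tends to be called in a loop. A possible wrapper is sketched below; `fill_form` and the `AGENT_BROWSER` override are hypothetical names, and the flags are the ones listed above, taken as documented here.

```bash
# fill_form SELECTOR TEXT [SELECTOR TEXT]...
# Types into each field in order, clearing it first so re-runs are repeatable.
# AGENT_BROWSER is a hypothetical override for the binary name.
fill_form() {
  if [ $(( $# % 2 )) -ne 0 ]; then
    echo "fill_form: expected selector/text pairs" >&2
    return 2
  fi
  while [ "$#" -gt 0 ]; do
    "${AGENT_BROWSER:-agent-browser}" type --selector "$1" --text "$2" --clear || return 1
    shift 2
  done
}
```

Passing selectors and values as separate arguments avoids having to split on a delimiter, which would break for selectors such as `input[name=email]`.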

### Taking Snapshots

```bash
# Full page screenshot
agent-browser snapshot --output "screenshot.png"

# Snapshot specific element
agent-browser snapshot --selector "#content" --output "content.png"

# Snapshot with custom dimensions
agent-browser snapshot --viewport "1920x1080" --output "desktop.png"

# Get page HTML
agent-browser snapshot --format "html" --output "page.html"

# Get page as PDF
agent-browser snapshot --format "pdf" --output "document.pdf"
```

**Output:**
```json
{
  "success": true,
  "format": "png",
  "path": "screenshot.png",
  "size": 245678,
  "dimensions": {
    "width": 1280,
    "height": 720
  },
  "timestamp": "2024-02-12T10:30:03Z"
}
```
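To compare layouts, the same page can be captured at several viewport sizes in one pass. The sketch below loops over sizes given on the command line. `snapshot_viewports` and `AGENT_BROWSER` are hypothetical names, and the `--viewport` and `--output` flags are used as listed above.

```bash
# snapshot_viewports PREFIX SIZE [SIZE]...
# Writes PREFIX-SIZE.png for each viewport size such as 1920x1080.
# AGENT_BROWSER is a hypothetical override for the binary name.
snapshot_viewports() {
  local prefix="$1"
  shift
  local size
  for size in "$@"; do
    "${AGENT_BROWSER:-agent-browser}" snapshot --viewport "$size" \
      --output "${prefix}-${size}.png" || return 1
  done
}
```

For instance, `snapshot_viewports home 1920x1080 375x667` would request `home-1920x1080.png` and `home-375x667.png`.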

### Extracting Data

```bash
# Extract text from element
agent-browser extract --selector "h1.title" --attribute "text"

# Extract attribute value
agent-browser extract --selector "a.link" --attribute "href"

# Extract multiple elements
agent-browser extract --selector "li.item" --all

# Extract as JSON
agent-browser extract --selector ".product-card" --json \
  --fields "title:.title,price:.price,image:img@src"
```

**Output:**
```json
{
  "success": true,
  "selector": "h1.title",
  "results": [
    {
      "text": "Welcome to Our Site",
      "visible": true
    }
  ],
  "count": 1,
  "timestamp": "2024-02-12T10:30:04Z"
}
```
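An extraction that matches nothing can still report `"success": true`, so a scraping script may want to check `count` as well. Here is a small sketch that assumes the output shape shown above; both function names are hypothetical, and the parsing is plain text matching rather than real JSON parsing.

```bash
# Print the integer value of the "count" field from JSON on stdin.
match_count() {
  sed -n 's/.*"count"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Return 0 only when at least one element matched; a missing count is treated as 0.
require_matches() {
  local n
  n=$(match_count)
  [ "${n:-0}" -gt 0 ]
}
```

The output of an `extract` call could be piped into `require_matches` to stop a scenario early when a selector no longer matches.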

### Waiting for Elements

```bash
# Wait for element to appear
agent-browser wait --selector ".loading-complete" --timeout 10000

# Wait for element to disappear
agent-browser wait --selector ".spinner" --hidden --timeout 5000

# Wait for text to appear
agent-browser wait --text "Success" --timeout 3000

# Wait for custom condition
agent-browser wait --script "document.readyState === 'complete'"
```

**Output:**
```json
{
  "success": true,
  "condition": "element_visible",
  "selector": ".loading-complete",
  "waited": 1234,
  "timestamp": "2024-02-12T10:30:05Z"
}
```
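A common pattern is to chain two of these waits: first for a spinner to disappear, then for the real content to appear. The sketch below assumes that a timed-out wait exits with a non-zero status, which the text above does not state. `wait_for_load` and `AGENT_BROWSER` are hypothetical names.

```bash
# wait_for_load SPINNER_SELECTOR CONTENT_SELECTOR [TIMEOUT_MS]
# Waits for the spinner to hide, then for the content to show.
# Stops at the first failed wait; AGENT_BROWSER is a hypothetical override.
wait_for_load() {
  local timeout_ms="${3:-5000}"
  "${AGENT_BROWSER:-agent-browser}" wait --selector "$1" --hidden --timeout "$timeout_ms" &&
    "${AGENT_BROWSER:-agent-browser}" wait --selector "$2" --timeout "$timeout_ms"
}
```

The timeout applies to each wait separately, so the worst case is roughly twice the value passed in.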

## Scenario-Based Automation

### Login Flow Example

```bash
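#!/bin/bash
# login-test.sh

set -e  # Exit on any error

# Read the password from the environment; abort if it is not set
: "${PASSWORD:?PASSWORD must be set}"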

# Navigate to login page
agent-browser navigate --url "https://app.example.com/login"

# Fill email
agent-browser type --selector "input#email" --text "user@example.com"

# Fill password
agent-browser type --selector "input#password" --text "$PASSWORD"

# Click submit
agent-browser click --selector "button[type='submit']"

# Wait for dashboard
agent-browser wait --selector ".dashboard" --timeout 5000

# Take success snapshot
agent-browser snapshot --output "login-success.png"

# Extract user info
agent-browser extract --selector ".user-name" --attribute "text"
```
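In line with the snapshot-driven debugging principle, a scenario can capture the page whenever a step fails. The sketch below wraps any command in a named step. `run_step`, the `failure-<step>.png` naming, and the `AGENT_BROWSER` override are all hypothetical, and the sketch assumes that a failing command exits with a non-zero status.

```bash
# run_step NAME COMMAND [ARGS...]
# Runs the command; on failure, logs the step name and tries to save a
# snapshot named failure-NAME.png, then returns 1.
# AGENT_BROWSER is a hypothetical override for the binary name.
run_step() {
  local step="$1"
  shift
  if "$@"; then
    return 0
  fi
  echo "step failed: $step" >&2
  "${AGENT_BROWSER:-agent-browser}" snapshot --output "failure-${step}.png" \
    >/dev/null 2>&1 || true
  return 1
}
```

A scenario could then write `run_step submit agent-browser click --selector "button[type='submit']" || exit 1`, which leaves a snapshot of the page at the point where the flow broke.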

### E2E Shopping Flow

```bash
#!/bin/bash
# checkout-flow.sh

set -e  # Exit on any error

echo "Starting checkout flow test..."

# 1. Navigate to product page
agent-browser navigate --url "https://shop.example.com/products/widget-123"
agent-browser wait --selector ".product-details" --timeout 5000

#
```