Skill · 150 repo stars · updated today

BrowserBash Browser Automation

BrowserBash is a vendor-independent CLI that turns plain-English objectives or Markdown tests into real browser actions, so you do not write selectors or imperative code. Use it to run end-to-end tests locally or on cloud grids such as LambdaTest, BrowserStack, or Browserbase. Results stream as NDJSON with CI-compatible exit codes, and the LLM backend can be a local Ollama model or a cloud LLM.

Install in Claude Code
git clone --depth 1 https://github.com/PramodDutta/qaskills /tmp/browserbash-browser-automation && cp -r /tmp/browserbash-browser-automation/seed-skills/browserbash ~/.claude/skills/browserbash-browser-automation
Then start a new Claude Code session; the skill loads automatically.

SKILL.md

# BrowserBash Browser Automation

You are an expert in BrowserBash, the vendor-independent natural-language browser automation CLI. When the user asks you to automate a browser, write or run end-to-end tests in plain English, wire BrowserBash into CI, run tests on a cloud grid (LambdaTest, BrowserStack, Browserbase) or a local browser, or consume BrowserBash results from another agent or pipeline, follow these instructions.

> Accuracy note for this skill: BrowserBash is a young, actively developed open-source project (Apache-2.0, repo `PramodDutta/browserbash`, npm package `browserbash-cli`, site `browserbash.com`). The commands, flags, and model identifiers below reflect the published README and site at the time of writing. Model names and exact flag spellings can drift between versions — when an exact string matters, confirm against `browserbash --help`, `browserbash <command> --help`, and the official docs before relying on it. Do not invent flags, providers, or model ids that are not listed here or shown by `--help`.

## What BrowserBash is (and is not)

BrowserBash turns a plain-English objective — for example, "log in and add the first product to the cart" — into actions an AI agent performs in a **real browser**, then returns structured results. You do not write selectors, locators, or imperative click/type code; you describe intent.

It is **not** a managed cloud browser service, and it is a distinct project from the similarly named "Browserbase" and "Browserless". (BrowserBash can *use* Browserbase as one of several browser providers, but they are different products — do not conflate them.)

Two layers are independently swappable, which is the core mental model:

- **Engine** — who interprets the English and decides the steps.
  - `stagehand` (default): the open-source Stagehand framework (by Browserbase, MIT). Supports Anthropic/OpenAI/Google models; runs against local Chromium, a CDP endpoint, or Browserbase.
  - `builtin`: an in-repo Anthropic tool-use loop driving Playwright directly. Used automatically for grids Stagehand cannot attach to (LambdaTest, BrowserStack).
- **Provider** — where the browser actually runs: `local`, `cdp`, `browserbase`, `lambdatest`, `browserstack`.

Under the hood every provider returns a Playwright `Browser`/`Page`, so BrowserBash is built **on top of Playwright** rather than replacing it — it sits a layer above, replacing the hand-written selector/test code with natural language.

### Why it matters for QA

- **Selector-free E2E**: tests survive UI refactors because there are no CSS/`data-test` selectors to maintain.
- **Committable, reviewable tests**: `*_test.md` files live in the repo and read like a test case.
- **Run anywhere**: same objective runs on a developer laptop (local Chrome), in Docker over CDP, or across a real-device/browser cloud grid for cross-browser coverage.
- **First-class CI/agent output**: `--agent` emits NDJSON and the process exit code *is* the verdict, so no log scraping.
- **Cost control**: defaults to free local models (Ollama), so smoke tests can cost $0.
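
As a rough illustration of the exit-code-as-verdict point, a CI step could wrap the run in a small shell function. This is a sketch, not part of BrowserBash: `run_gate` and the `results.ndjson` file name are hypothetical, and it assumes the conventional mapping in which exit code 0 means the objective passed. It captures stdout to a file first and prints it afterwards, so it trades live streaming for a simpler, portable script.

```bash
# Hypothetical CI wrapper (not shipped with BrowserBash).
# Usage: run_gate <log-file> <command> [args...]
run_gate() {
  log_file="$1"
  shift
  exit_code=0
  # Keep the command's stdout (the NDJSON stream) as a build artifact.
  "$@" > "$log_file" || exit_code=$?
  # Echo the captured output into the CI log for human readers.
  cat "$log_file"
  if [ "$exit_code" -eq 0 ]; then
    echo "verdict: pass" >&2
  else
    echo "verdict: fail (exit $exit_code)" >&2
  fi
  # Propagate the original exit code so the CI job fails when the run fails.
  return "$exit_code"
}

# Intended use (not executed here):
#   run_gate results.ndjson browserbash run "log in and add the first product to the cart" --agent
```

Because the function returns the wrapped command's exit code unchanged, the CI system needs no log parsing to decide whether the job passed.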

## Setup and installation

The published CLI installs from npm:

```bash
npm install -g browserbash-cli
browserbash --version
```

To work from source (e.g. to add a custom provider), clone the repo and link it:

```bash
# in a clone of github.com/PramodDutta/browserbash
npm install
npm run build
npm link        # exposes the `browserbash` command
```

Requirements: **Node >= 18**, and **Google Chrome stable** for the default `local` provider. (`ffmpeg` is bundled and used for session video when you record.)

Scaffold a project workspace:

```bash
browserbash init        # creates ./.browserbash/ (tests, variables, config)
```

## Choosing an LLM backend (free-first)

BrowserBash defaults to model `auto`, resolved in this order:

1. **Ollama running locally** → uses your local model (free, open-source, no API key). This is the recommended default.
2. `ANTHROPIC_API_KEY` set → an Anthropic Claude model.
3. `OPENAI_API_KEY` set → an OpenAI model.
4. Otherwise: errors with setup guidance.
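
The resolution order above can be sketched as a small shell function. This only mirrors the documented order for illustration and is not BrowserBash's actual code: `resolve_backend` is a hypothetical name, and `OLLAMA_UP` is a made-up stand-in for "a local Ollama server is reachable" (how BrowserBash really detects Ollama is not stated here).

```bash
# Illustrative only: mirrors the documented `auto` order, not the real implementation.
# OLLAMA_UP is a hypothetical flag: "1" means a local Ollama server is reachable.
resolve_backend() {
  if [ "${OLLAMA_UP:-0}" = "1" ]; then
    echo "ollama"          # 1. local model, no API key needed
  elif [ -n "${ANTHROPIC_API_KEY:-}" ]; then
    echo "anthropic"       # 2. Anthropic key present
  elif [ -n "${OPENAI_API_KEY:-}" ]; then
    echo "openai"          # 3. OpenAI key present
  else
    # 4. nothing usable: fail with guidance
    echo "no backend: start Ollama or set ANTHROPIC_API_KEY / OPENAI_API_KEY" >&2
    return 1
  fi
}
```

The practical consequence is that a running Ollama instance takes priority even when cloud API keys are exported, so pass `--model` explicitly when a specific cloud model is required.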

The fully free / open-source stack (default engine + local Chromium + Ollama, zero cloud cost, no keys):

```bash
ollama pull qwen3                 # or any tool-capable local model
browserbash run "Open https://example.com and store the heading as 'h1'"
```

Practical tip from the project: very small local models (8B parameters or fewer) are flaky on multi-step objectives. A Qwen3 / Llama-3.3-70B-class model interprets multi-step flows far more reliably. Pick the smallest model that completes your flows consistently across repeated runs.

Cloud LLM options (set the corresponding env var, then optionally pass `--model`):

```bash
# Anthropic (Stagehand or builtin engine)
export ANTHROPIC_API_KEY=sk-ant-...
browserbash run "..."

# OpenRouter — hundreds of models behind one key (Stagehand engine)
export OPENROUTER_API_KEY=sk-or-...
browserbash run "..." --model openrouter/anthropic/claude-sonnet-4-6
```

Important pairing rule: the cloud-grid providers (`lambdatest`, `browserstack`) auto-switch to the **builtin** engine, which speaks the Anthropic API. Pair those runs with `ANTHROPIC_API_KEY` (or an `ANTHROPIC_BASE_URL` gateway such as a LiteLLM proxy). Local Ollama-only setups will not drive those grids — set an Anthropic-compatible backend for grid runs.
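
A pre-flight check along these lines can catch a mismatched setup before a grid session is started. It is a hypothetical helper, not a BrowserBash command: `check_grid_backend` is an invented name, and treating either `ANTHROPIC_API_KEY` or `ANTHROPIC_BASE_URL` alone as sufficient is an assumption (a gateway may still require its own key).

```bash
# Hypothetical pre-flight check (not part of BrowserBash).
# Usage: check_grid_backend <provider>
check_grid_backend() {
  provider="$1"
  case "$provider" in
    lambdatest|browserstack)
      # These grids switch to the builtin engine, which speaks the Anthropic API.
      # Assumption: either variable being set counts as an Anthropic-compatible backend.
      if [ -z "${ANTHROPIC_API_KEY:-}" ] && [ -z "${ANTHROPIC_BASE_URL:-}" ]; then
        echo "provider '$provider' needs ANTHROPIC_API_KEY or ANTHROPIC_BASE_URL" >&2
        return 1
      fi
      ;;
  esac
  # local, cdp, and browserbase are not subject to this pairing rule.
  return 0
}
```

Calling `check_grid_backend lambdatest` at the top of a CI script fails fast with a readable message instead of failing partway through a grid run.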

Configuration precedence is **flags > env vars > `~/.browserbash/config.json` defaults**. Inspect and set defaults with:

```bash
browserbash config show
browserbash config set defaultProvider lambdatest
browserbash providers          # list available providers
browserbash whoami             # show resolved identity/credentials
```
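
The precedence rule (flags, then env vars, then config defaults) amounts to "first non-empty value wins". A minimal sketch of that logic, with a hypothetical `resolve_setting` helper that takes the three candidate values as plain arguments (the actual environment variable names BrowserBash reads are not listed here, so none are assumed):

```bash
# Illustrative helper (hypothetical, not part of BrowserBash).
# Usage: resolve_setting <flag-value> <env-value> <config-default>
resolve_setting() {
  if [ -n "$1" ]; then
    echo "$1"   # a command-line flag wins over everything
  elif [ -n "$2" ]; then
    echo "$2"   # otherwise fall back to the environment
  else
    echo "$3"   # otherwise use the config-file default
  fi
}

# Examples (not executed here):
#   resolve_setting "cdp" "browserstack" "lambdatest"   # prints cdp
#   resolve_setting ""    ""             "lambdatest"   # prints lambdatest
```

In practice this means a default stored with `browserbash config set` applies only when neither a flag nor an environment variable overrides it for that run.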

## Core usage — one-shot objectives

`browserbash run "<objective>" [flags]` runs a single plain-English objective. Two patterns to know:

- **Acting**: "log in and add the first product to the cart" — the agent figures out the clicks and typing.
- **Extracting**: append `... store <value> as 'name'` and the value comes back in th