ClaudeWave

acrawl — LLM-powered web crawler. Describe what you want in plain English, get structured data back. Single Rust binary, 25 providers, MCP server built-in.

MCP Servers · Official registry · 5 stars · 0 forks · Rust · MIT · Updated today
ClaudeWave Trust Score
87/100
Trusted
Passed
  • Open-source license (MIT)
  • Actively maintained (<30d)
  • Clear description
  • Topics declared
Last scanned: 6/11/2026
Install in Claude Code / Claude Desktop
Method: Manual · AgenticCrawler
Claude Code CLI
git clone https://github.com/Mingye-Lu/AgenticCrawler
claude_desktop_config.json (Claude Desktop)
{
  "mcpServers": {
    "agenticcrawler": {
      "command": "acrawl",
      "args": ["mcp"]
    }
  }
}
1. Run the command above in your terminal (Claude Code), or paste the JSON config into claude_desktop_config.json (Claude Desktop).
2. Replace any <placeholder> values with your API keys or paths.
3. Restart Claude. The MCP server and its tools appear automatically.
💡 Install the binary first: use the install script under Quick Start, or build from source at https://github.com/Mingye-Lu/AgenticCrawler.

<p align="center">
<pre align="center">
  █████╗  ██████╗██████╗  █████╗ ██╗    ██╗██╗     
 ██╔══██╗██╔════╝██╔══██╗██╔══██╗██║    ██║██║     
 ███████║██║     ██████╔╝███████║██║ █╗ ██║██║     
 ██╔══██║██║     ██╔══██╗██╔══██║██║███╗██║██║     
 ██║  ██║╚██████╗██║  ██║██║  ██║╚███╔███╔╝███████╗
 ╚═╝  ╚═╝ ╚═════╝╚═╝  ╚═╝╚═╝  ╚═╝ ╚══╝╚══╝ ╚══════╝
</pre>
</p>

<p align="center">
  <strong>LLM-powered web crawler.</strong> Describe what you want in plain English — get structured data back.
</p>

<p align="center">
  <a href="https://github.com/Mingye-Lu/AgenticCrawler/actions/workflows/ci.yml"><img src="https://github.com/Mingye-Lu/AgenticCrawler/actions/workflows/ci.yml/badge.svg" alt="CI"></a>
  <a href="LICENSE"><img src="https://img.shields.io/badge/license-MIT-blue.svg" alt="License: MIT"></a>
  <a href="https://www.rust-lang.org/"><img src="https://img.shields.io/badge/rust-2021_edition-orange.svg" alt="Rust"></a>
</p>

<p align="center">
  Single binary. No Python runtime. 29 tools. 25 LLM providers. MCP server built-in.
</p>

---

## Why acrawl?

Most web scraping still means writing code: XPath selectors, pagination logic, retry handling, anti-bot workarounds. LLMs can read pages like humans do, but wiring one up to a browser is a project in itself.

acrawl is that wiring, packaged as a single Rust binary. You describe a goal; the agent figures out which pages to visit, what to click, what to extract, and when it's done.

- **No code required.** Describe the goal in English. The agent plans and executes.
- **One binary, zero runtimes.** `cargo build --release` produces a self-contained executable. No Python, no Node runtime — just Rust and a Chromium download for browser automation.
- **Smart fetching.** Static pages are fetched over plain HTTP, which is fast. When JavaScript or interaction is needed, acrawl detects JS framework markers (`__next_data__`, `__nuxt`, `__vue`, `ng-app`, React roots), auth redirects, and short `<noscript>` bodies, then transparently escalates to a headless browser.
- **29 tools, not a chatbot.** The agent has real tools — navigate, click, fill forms, run JS, take screenshots, switch device emulation, manage tabs, run deterministic scripts — plus a fork/join layer to spawn parallel sub-agents across multiple browser tabs.
- **25 LLM providers.** Anthropic, OpenAI, Google Gemini, DeepSeek, AWS Bedrock, Azure OpenAI, Vertex AI, GitHub Copilot, Groq, Mistral, xAI, Cohere, Alibaba DashScope, OpenRouter, and more. Or bring your own via any OpenAI-compatible endpoint.
- **MCP client.** Extend the agent with custom tools via [Model Context Protocol](https://modelcontextprotocol.io) servers (stdio, SSE, HTTP, WebSocket).
- **MCP server.** `acrawl mcp` exposes 25 browser tools plus an autonomous `run_goal` agent to any MCP-compatible client — Claude Code, Cursor, Windsurf, VS Code, Zed, JetBrains, TRAE, Gemini CLI, and more. Install with `acrawl mcp install`.
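
The smart-fetching escalation described in the list above can be pictured with a small sketch. This is not acrawl's actual implementation: the function name `needs_browser`, the `Escalation` enum, the `data-reactroot` stand-in for "React roots", the login keywords, and the 200-byte `<noscript>` threshold are all assumptions made for illustration. Only the marker strings come from the feature list.

```rust
/// Reason a plain HTTP response might warrant a headless-browser retry.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, PartialEq)]
enum Escalation {
    JsFramework(&'static str),
    AuthRedirect,
    ShortNoscript,
}

// Markers from the feature list, matched against lowercased HTML.
// `data-reactroot` is an assumed stand-in for "React roots".
const JS_MARKERS: [&str; 5] = ["__next_data__", "__nuxt", "__vue", "ng-app", "data-reactroot"];

// Assumed threshold: a <noscript> body shorter than this is treated as a
// bare "please enable JavaScript" notice.
const NOSCRIPT_MAX_LEN: usize = 200;

/// Returns `Some(reason)` when the response looks like it needs a real browser.
fn needs_browser(status: u16, location: Option<&str>, html: &str) -> Option<Escalation> {
    // 1. Auth redirect: a 3xx whose Location header points at a login page.
    if (300..400).contains(&status) {
        if let Some(loc) = location {
            let loc = loc.to_ascii_lowercase();
            if ["login", "signin", "auth"].iter().any(|k| loc.contains(k)) {
                return Some(Escalation::AuthRedirect);
            }
        }
    }

    // ASCII lowercasing keeps byte offsets identical to the original string.
    let lower = html.to_ascii_lowercase();

    // 2. JS framework markers anywhere in the document.
    if let Some(marker) = JS_MARKERS.iter().copied().find(|m| lower.contains(m)) {
        return Some(Escalation::JsFramework(marker));
    }

    // 3. A short <noscript> body.
    if let Some(start) = lower.find("<noscript") {
        if let Some(tag_end) = lower[start..].find('>') {
            let body_start = start + tag_end + 1;
            if let Some(len) = lower[body_start..].find("</noscript>") {
                if len < NOSCRIPT_MAX_LEN {
                    return Some(Escalation::ShortNoscript);
                }
            }
        }
    }

    None
}
```

A caller would fetch the page over HTTP first, pass the status, `Location` header, and body to `needs_browser`, and only launch the headless browser when it returns `Some`. Ordering the cheap header check before the body scans keeps the common static-page path fast.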

### How does it compare?

#### vs. AI web agents and scraping tools

| | acrawl | browser-use | Stagehand | Skyvern | Firecrawl | Playwright MCP | Scrapy | Playwright scripts |
|---|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| No code needed | Yes | No | No | Partial | No | No | No | No |
| Single binary | Yes | No | No | No | No | No | No | No |
| JS rendering | Yes | Yes | Yes | Yes | Yes | Yes | No | Yes |
| LLM-powered navigation | Yes | Yes | Yes | Yes | Limited | No | No | No |
| No Python / Node needed | Yes | No | No | No | No | No | No | No |
| Form filling / interaction | Yes | Yes | Yes | Yes | No | Yes | No | Yes |
| Sub-agent parallelism | Yes | No | No | Partial | Partial | No | Partial | No |
| 25 LLM providers | Yes | Via LiteLLM | Partial | Partial | N/A | N/A | N/A | N/A |
| MCP client (use tools) | Yes | No | No | No | No | No | No | No |
| MCP server (expose as tools) | Yes | No | No | No | Yes | Yes | No | No |
| Stealth browser built-in | Yes | Cloud only | Via Browserbase | Cloud only | No | No | No | No |
| Open source | Yes | Yes (MIT) | Yes (MIT) | Yes (Apache) | Engine only | Yes (MIT) | Yes (BSD) | Yes (Apache) |

Notes:
- **browser-use** (85k+ GitHub stars): Python + Playwright, DOM + screenshots, supports GPT/Claude/Gemini/Ollama via LiteLLM, 89.1% WebVoyager. No single binary — requires Python and `pip install`. Every action calls an LLM: 2-5s/step, ~$0.02-0.30/task. Cloud tier adds stealth; self-hosted is bare Playwright.
- **Stagehand** (Browserbase, 21k+ stars): TypeScript + CDP (v3), mixes deterministic Playwright with AI primitives (`act()`, `extract()`, `observe()`). Action caching reuses successful clicks without re-calling the LLM. Requires Node and, for production, Browserbase cloud hosting.
- **Skyvern** (21k+ stars, Apache 2.0): vision-first (screenshot-only, no DOM), handles legacy portals and government forms that DOM tools struggle with. No-code cloud UI available. Each step costs vision-model tokens — ~$0.10-0.50/task. 85.85% WebVoyager.
- **Firecrawl** (82k+ stars): managed scraping API. Returns LLM-ready Markdown, JSON extraction, site-wide crawl. Not an agentic tool — minimal multi-step interaction. Ships an official MCP server. Per-page pricing from $19/month.
- **Playwright MCP** (Microsoft, 29k+ stars): MCP server that exposes browser control via the accessibility tree. Sub-100ms actions, zero vision tokens. Drives an LLM client's browser rather than having its own reasoning — no autonomous goal navigation. Used in GitHub Copilot Agent.

#### vs. native LLM provider browsing

Most AI providers offer some form of browsing, but it is designed for **conversational information retrieval**, not programmatic web automation. Key constraints:

| | acrawl | ChatGPT Agent | Claude Computer Use | Claude in Chrome | Gemini Deep Research | Copilot / Edge |
|---|:---:|:---:|:---:|:---:|:---:|:---:|
| Real JS-rendered browser | Yes | Yes (sandboxed cloud VM) | Indirect (dev provides env) | Yes (your Chrome) | No (search API only) | Limited (Bing retrieval) |
| Click / fill forms | Yes | Yes (requires user confirmation) | Yes | Yes | No | Limited |
| Programmable / scriptable | Yes | No | Yes (API beta) | No | No | No |
| Sub-agent parallelism | Yes | No | No | No | No | No |
| MCP server (expose as tools) | Yes | No | No | No | No | No |
| Returns structured data | Yes | No (text summaries) | No (screenshots) | No | No | No |
| Stealth / anti-bot | Yes | No | No | No | No | No |
| No vendor lock-in | Yes (25 providers) | OpenAI only | Anthropic only | Anthropic only | Google only | OpenAI / Bing only |
| Runs without paid subscription | Yes (OSS; LLM key needed) | No (Plus/Pro/Business) | No (API cost) | No (Max plan) | Partial | Yes (free tier) |

Notes:
- **ChatGPT Agent** (OpenAI, July 2025): runs in a sandboxed cloud virtual machine with its own Chromium instance. Can browse, click, and fill forms but pauses for user confirmation on sensitive actions (purchases, logins). Uses two modes: a fast text browser for research queries and a visual browser for interaction. Cannot run code in the browser, install extensions, or access your local file system. Susceptible to prompt injection. Available to Plus/Pro/Business subscribers.
- **ChatGPT Atlas** (OpenAI, October 2025): a full Chromium browser with ChatGPT integrated as a sidebar + agent. Agent mode drives the same sandboxed cloud VM as ChatGPT Agent; core limitations are identical.
- **Claude Computer Use** (Anthropic API, beta since October 2024): screenshot + mouse/keyboard API for any desktop application, not just browsers. Vision-only — no DOM access. Developers must provide and manage the entire computing environment (typically a Docker container with Xvfb + Firefox). Not a ready-to-use binary. Requires significant infrastructure to operate in production.
- **Claude in Chrome** (Anthropic Chrome extension, beta November 2025+): lets Claude operate within your existing Chrome session using your real cookies and logins. Available to Max plan subscribers. Not an open API — no programmatic control. Good for interactive personal tasks; not suitable for batch automation.
- **Gemini / Deep Research** (Google): browsing is grounded via Google Search API calls, not a live browser session. Deep Research synthesizes across many searches but cannot interact with pages (click, fill forms, navigate dynamically). Project Mariner (experimental computer use) is a separate, limited research preview.
- **Copilot / Edge** (Microsoft): Edge's Copilot Mode uses Bing retrieval with some ability to navigate pages. Real-world tests show high latency (6+ minutes for multi-page comparison tasks) and frequent interruptions for user confirmation. Not a developer API.

## Quick Start

### Install

**Linux / macOS (x64 / ARM64):**
```bash
curl -fsSL https://raw.githubusercontent.com/Mingye-Lu/AgenticCrawler/main/install.sh | bash
```

**Windows (x64, PowerShell):**
```powershell
irm https://raw.githubusercontent.com/Mingye-Lu/AgenticCrawler/main/install.ps1 | iex
```

This downloads the latest binary, verifies its SHA256 checksum, and sets up CloakBrowser for stealth browser automation. Requires Node.js 20+ for browser features.

acrawl checks for updates on startup and shows a notification when a new version is available.
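
As a rough illustration of what such a startup check involves, the sketch below compares two version strings and produces a notice only when the remote one is newer. It is an assumption-laden sketch, not acrawl's code: the function names, the `major.minor.patch` format, and the message wording are hypothetical, and fetching the latest version is left out.

```rust
/// Parse "1.2.3" or "v1.2.3" into (major, minor, patch).
/// Returns None for anything that is not exactly three numeric parts.
fn parse_version(s: &str) -> Option<(u64, u64, u64)> {
    let mut parts = s.trim().trim_start_matches('v').split('.');
    let major = parts.next()?.parse().ok()?;
    let minor = parts.next()?.parse().ok()?;
    let patch = parts.next()?.parse().ok()?;
    if parts.next().is_some() {
        return None;
    }
    Some((major, minor, patch))
}

/// Returns a notification string only when `latest` is strictly newer than `current`.
/// Malformed versions produce no notice rather than an error.
fn update_notice(current: &str, latest: &str) -> Option<String> {
    let cur = parse_version(current)?;
    let new = parse_version(latest)?;
    // Tuples compare lexicographically, so 1.10.0 > 1.9.3 as intended.
    (new > cur).then(|| format!("acrawl {latest} is available (you have {current})"))
}
```

Comparing parsed numeric tuples rather than raw strings avoids the classic mistake where `"1.10.0"` sorts before `"1.9.3"`.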

<details>
<summary>Build from source</summary>

```bash
git clone https://github.com/Mingye-Lu/AgenticCrawler.git
cd AgenticCrawler
cargo build --release

# Install CloakBrowser (required for browser automation — binary auto-downloads on first use)
npm install
```

</details>

### Browser Extension (optional)

The acrawl Bridge extension lets acrawl control your real browser (with your sessions, cookies, and existing extensions) instead of a headless CloakBrowser instance. Download `acrawl-extension.zip` from the [latest release](https://github.com/Mingye-Lu/AgenticCrawler/releases/latest), unzip it, then load it into your browser:

| Browser | Extensions page | Developer mode toggle |
|---------|----------------|----------------------|
| Chrome | `c

Topics: ai-agent, autonomous-agents, browser-automation, claude, cli, cloakbrowser, developer-tools, headless-browser, llm, mcp, mcp-client, mcp-server, model-context-protocol, openai, playwright, rust, web-crawler, web-scraping

What people ask about AgenticCrawler

What is Mingye-Lu/AgenticCrawler?

Mingye-Lu/AgenticCrawler is an MCP server for the Claude AI ecosystem. acrawl is an LLM-powered web crawler: describe what you want in plain English, get structured data back. Single Rust binary, 25 providers, MCP server built-in. It has 5 stars on GitHub and was last updated today.

How do I install AgenticCrawler?

You can install AgenticCrawler by cloning the repository (https://github.com/Mingye-Lu/AgenticCrawler) or by following the instructions in the README on GitHub. ClaudeWave also provides quick-install blocks on this page.

Is Mingye-Lu/AgenticCrawler safe to use?

Our security agent has analyzed Mingye-Lu/AgenticCrawler and assigned it a Trust Score of 87/100 (tier: Trusted). Review the full breakdown of passed checks and flags on this page.

Who maintains Mingye-Lu/AgenticCrawler?

Mingye-Lu/AgenticCrawler is maintained by Mingye-Lu. The latest activity recorded on GitHub is from today, with 3 open issues.

Are there alternatives to AgenticCrawler?

Yes. On ClaudeWave you can browse similar MCP servers at /categories/mcp, sorted by popularity or recent activity.


Do you maintain this repo? Add a badge to your README

Paste the badge into your GitHub README to show that it has been audited by ClaudeWave. Each badge links back to this page and shows the current Trust Score.

Featured on ClaudeWave: Mingye-Lu/AgenticCrawler
[![Featured on ClaudeWave](https://claudewave.com/api/badge/mingye-lu-agenticcrawler)](https://claudewave.com/repo/mingye-lu-agenticcrawler)
<a href="https://claudewave.com/repo/mingye-lu-agenticcrawler"><img src="https://claudewave.com/api/badge/mingye-lu-agenticcrawler" alt="Featured on ClaudeWave: Mingye-Lu/AgenticCrawler" width="320" height="64" /></a>
