
swarm-local-e2e

This skill provides a structured guide for running end-to-end tests of the agent swarm system locally using a real API server and Docker containers. Use it when you need to verify features through complete workflow testing, or proactively after implementing changes to the API, task lifecycle, session logging, Docker configuration, or UI components. The skill walks through prerequisites, port configuration, database cleanup, API server startup, Docker image building, container orchestration, task creation, log verification, dashboard access, and cleanup procedures.

Install in Claude Code
git clone --depth 1 https://github.com/desplega-ai/agent-swarm /tmp/swarm-local-e2e && cp -r /tmp/swarm-local-e2e/.claude/skills/swarm-local-e2e ~/.claude/skills/swarm-local-e2e
Then start a new Claude Code session; the skill loads automatically.

SKILL.md

# Local E2E Testing Guide

Run full end-to-end tests of the agent swarm locally with a real API server and Docker containers.

## When to Use This Skill

This skill should be invoked in two modes:

1. **User-requested QA**: The user asks you to run E2E tests, verify a feature, or QA a specific flow. Follow the steps below targeting what they asked for.

2. **Automated change verification**: After implementing changes that touch the API, runner, polling, task lifecycle, session logs, Docker entrypoint, or worker/lead behavior, use this skill proactively to verify that the changes work end-to-end. Determine what's testable based on the diff:
   - **Task lifecycle changes** (poll, runner, store-progress): Create assigned + pool tasks, verify they complete and have correct logs
   - **Session log changes**: Run two sequential tasks on the same agent, verify log isolation (unique sessionIds, no cross-contamination)
   - **Docker / entrypoint changes**: Build image, start containers, verify boot logs and registration
   - **UI changes**: Start the dashboard, use agent-browser/qa-use to verify rendering
   - **API endpoint changes**: Call the endpoint directly and verify the response

You do not need to run every step — pick the subset relevant to the changes being tested.

## Prerequisites

- OrbStack or Docker Desktop running (`open -a OrbStack` if needed)
- `.env` with `API_KEY` and `PORT` configured
- `.env.docker-lead` with lead config (`AGENT_ID`, `CLAUDE_CODE_OAUTH_TOKEN`, `MCP_BASE_URL`)
- `.env.docker` with worker config (`AGENT_ID`, `CLAUDE_CODE_OAUTH_TOKEN` or `OPENROUTER_API_KEY`, `MCP_BASE_URL`)

## Step 1: Determine Your Port

Check `.env` for the configured port — do **not** assume 3013:

```bash
grep ^PORT= .env
```

Use this value as `$PORT` throughout. In worktrees, each worktree may have a different port. Always verify and use the value from `.env`.

Also verify the Docker env files match:
```bash
grep MCP_BASE_URL .env.docker-lead .env.docker
# Both should point to http://host.docker.internal:$PORT
```

If they don't match, update them before starting containers.
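The same check can be scripted. The following is a minimal sketch, not part of the repository: `parseEnv` and `mcpUrlMatchesPort` are hypothetical helper names, and the parser handles only simple `KEY=VALUE` lines (no multiline values, inline comments, or variable expansion).

```javascript
// Parse KEY=VALUE lines from dotenv-style text.
// Lines that are blank, commented out, or malformed are skipped.
function parseEnv(text) {
  const vars = {};
  for (const line of text.split(/\r?\n/)) {
    const match = line.match(/^\s*(?:export\s+)?([A-Za-z_][A-Za-z0-9_]*)\s*=\s*(.*)$/);
    if (!match) continue;
    // Strip one pair of matching surrounding quotes, if present.
    vars[match[1]] = match[2].trim().replace(/^(['"])(.*)\1$/, '$2');
  }
  return vars;
}

// True when MCP_BASE_URL points at host.docker.internal on the given port.
function mcpUrlMatchesPort(dockerEnvText, port) {
  const raw = parseEnv(dockerEnvText).MCP_BASE_URL;
  if (!raw) return false;
  try {
    const url = new URL(raw);
    return url.hostname === 'host.docker.internal' && url.port === String(port);
  } catch {
    return false; // value is not a parseable URL
  }
}
```

To use it, read `.env`, `.env.docker-lead`, and `.env.docker` as text (for example with `readFileSync` from `node:fs`), take `parseEnv(envText).PORT`, and pass each Docker env file's text to `mcpUrlMatchesPort` along with that port.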

## Step 2: Clean DB + Start API Server

```bash
# Kill any existing API process on your port
lsof -ti :$PORT | xargs kill 2>/dev/null

# Clean DB for fresh state
rm -f agent-swarm-db.sqlite agent-swarm-db.sqlite-wal agent-swarm-db.sqlite-shm

# Start API server
bun run start:http &
# Wait ~3s for startup, confirm "MCP HTTP server running on http://localhost:$PORT/mcp"
```

## Step 3: Build Docker Image

```bash
bun run docker:build:worker
```

This builds `agent-swarm-worker:latest` from the current code. **Rebuild after every code change.**

## Step 4: Start Lead Container

Use a **unique container name** to avoid conflicts with other worktrees (for example, include the branch or feature name):

```bash
docker run --rm -d \
  --name e2e-lead-$(git branch --show-current | tr '/' '-') \
  --env-file .env.docker-lead \
  -e AGENT_ROLE=lead \
  -e MAX_CONCURRENT_TASKS=1 \
  -p 3201:3000 \
  agent-swarm-worker:latest
```

Wait ~15s, then verify:
```bash
docker logs e2e-lead-$(git branch --show-current | tr '/' '-') 2>&1 | tail -5
# Should see: "[lead] Polling for triggers (0/1 active)..."
```

If port 3201 is taken by another worktree, pick a different host port (e.g. `-p 3211:3000`).

## Step 5: Start Worker Container

```bash
docker run --rm -d \
  --name e2e-worker-$(git branch --show-current | tr '/' '-') \
  --env-file .env.docker \
  -e MAX_CONCURRENT_TASKS=1 \
  -p 3203:3000 \
  agent-swarm-worker:latest
```

Wait ~15s, then verify:
```bash
docker logs e2e-worker-$(git branch --show-current | tr '/' '-') 2>&1 | tail -5
# Should see: "[worker] Polling for triggers (0/1 active)..."
```

## Step 6: Verify Registration

Use `context-mode execute` instead of calling curl directly, because of hook restrictions:

```javascript
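// $API_KEY and $PORT are placeholders: substitute the values from .env,
// since JavaScript strings do not expand shell variables.
const headers = { 'Authorization': 'Bearer $API_KEY', 'Content-Type': 'application/json' };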
const agents = await (await fetch('http://localhost:$PORT/api/agents', { headers })).json();
for (const a of agents.agents) {
  console.log(`${a.name} | isLead: ${a.isLead} | status: ${a.status} | id: ${a.id}`);
}
```

Should show both lead and worker registered as `idle`. Save the agent IDs for task creation.
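The registration check can also be expressed as a small function. This is a sketch that assumes only the fields printed in the snippet above (`name`, `isLead`, `status`, `id`); `checkRegistration` is a hypothetical helper, not something the repository provides.

```javascript
// Summarize registration from the `agents` array returned by /api/agents.
// Reports a problem when no lead or no worker is present, or when any agent is not idle.
function checkRegistration(agents) {
  const leads = agents.filter((a) => a.isLead);
  const workers = agents.filter((a) => !a.isLead);
  const problems = [];
  if (leads.length === 0) problems.push('no lead registered');
  if (workers.length === 0) problems.push('no worker registered');
  for (const a of agents) {
    if (a.status !== 'idle') problems.push(`${a.name} is ${a.status}, expected idle`);
  }
  return {
    ok: problems.length === 0,
    problems,
    leadId: leads[0]?.id, // undefined when no lead registered
    workerIds: workers.map((a) => a.id),
  };
}
```

Calling `checkRegistration(agents.agents)` on the response above returns the IDs needed for task creation along with a list of anything that looks wrong.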

## Step 7: Create Tasks

### Assigned task (picked up by lead)

```javascript
const t = await (await fetch('http://localhost:$PORT/api/tasks', {
  method: 'POST', headers,
  body: JSON.stringify({ task: 'Say hello. Call store-progress with status completed.', agentId: LEAD_ID })
})).json();
console.log('Task:', t.id, '| status:', t.status);
```

**Important**: Use `agentId` (not `assignedTo`) to assign tasks. A wrong parameter name silently creates an unassigned task.

### Pool task (auto-claimed by worker)

```javascript
const t = await (await fetch('http://localhost:$PORT/api/tasks', {
  method: 'POST', headers,
  body: JSON.stringify({ task: 'Say hello. Call store-progress with status completed.' })
})).json();
console.log('Pool task:', t.id, '| status:', t.status);
```

Workers auto-claim unassigned tasks at poll time. Leads do **not** auto-claim pool tasks.
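Because a misspelled assignment parameter fails silently, it can help to build the request body through a guard. The sketch below is an assumption-laden illustration: `buildTaskBody` is a hypothetical helper, and its allow-list contains only the two fields this guide uses. The real API may accept more fields, so extend the list as needed.

```javascript
// Fields used in this guide for POST /api/tasks. This allow-list is an assumption,
// not the API's schema; add other fields here if your tasks need them.
const ALLOWED_KEYS = new Set(['task', 'agentId']);

// Build the JSON body, rejecting unknown keys so a typo such as `assignedTo`
// throws instead of quietly producing a pool task.
function buildTaskBody(fields) {
  for (const key of Object.keys(fields)) {
    if (!ALLOWED_KEYS.has(key)) throw new Error(`Unexpected task field: ${key}`);
  }
  if (typeof fields.task !== 'string' || fields.task.trim() === '') {
    throw new Error('task must be a non-empty string');
  }
  return JSON.stringify(fields);
}
```

Pass the result as `body` in the `fetch` calls above: include `agentId` for an assigned task and omit it for a pool task.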

## Step 8: Monitor Progress

```bash
# Follow lead logs, starting from the last 20 lines (use your container name).
# Each command blocks while following, so run them in separate terminals.
docker logs -f --tail 20 e2e-lead-$(git branch --show-current | tr '/' '-') 2>&1

# Follow worker logs
docker logs -f --tail 20 e2e-worker-$(git branch --show-current | tr '/' '-') 2>&1
```

Poll task status:
```javascript
const t = await (await fetch('http://localhost:$PORT/api/tasks/<task-id>', { headers })).json();
console.log(t.status);  // pending → in_progress → completed/failed
```
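Rather than re-running the status check by hand, the polling can be wrapped in a loop. This is a minimal sketch: `waitForTerminal` is a hypothetical helper, the interval and timeout defaults are arbitrary, and the terminal statuses are taken from the status flow in the comment above.

```javascript
// Statuses after which a task no longer changes, per the flow noted above.
const TERMINAL = new Set(['completed', 'failed']);

// Poll until the task reaches a terminal status or the timeout elapses.
// `getStatus` is a caller-supplied async function that returns the current status string.
async function waitForTerminal(getStatus, { intervalMs = 5000, timeoutMs = 300000 } = {}) {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const status = await getStatus();
    if (TERMINAL.has(status)) return status;
    // Stop if the next poll would land past the deadline.
    if (Date.now() + intervalMs > deadline) {
      throw new Error(`Timed out waiting for task; last status: ${status}`);
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

For `getStatus`, pass a function that performs the `fetch` shown above and returns `t.status`. Injecting it keeps the loop independent of the URL and headers.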

## Step 9: Verify Session Logs

```javascript
const logs = await (await fetch('http://localhost:$PORT/api/tasks/<task-id>/session-logs', { headers })).json();
console.log('Log count:', logs.logs.length);
// Should be > 0 for completed tasks
```

For **log isolation** verification (multiple sequential tasks from the same agent):
```javascript
const [l1, l2] = await Promise.all([
  fetch('http://localhost:$PORT/api/tasks/<t