Skip to main content
ClaudeWave
Skill10.5k estrellas del repoactualizado 14d ago

hive.error-recovery

hive.error-recovery provides a structured decision tree for handling tool call failures by classifying them as transient or structural, then determining whether to retry, fix input, escalate, or record failure to the database. Use this skill when building resilient agent workflows that must distinguish between recoverable errors and permanent failures, and when tasks need persistent tracking rather than silent drops.

Instalar en Claude Code
Copiar
git clone --depth 1 https://github.com/aden-hive/hive /tmp/hive.error-recovery && cp -r /tmp/hive.error-recovery/core/framework/skills/_default_skills/error-recovery ~/.claude/skills/hive.error-recovery
Después abre una sesión nueva de Claude Code; el skill carga automáticamente.

SKILL.md

## Operational Protocol: Error Recovery

When a tool call fails:

1. **Diagnose** — classify the failure as *transient* (network blip, rate limit, timeout) or *structural* (wrong selector, missing auth, invalid schema, permission denied).

2. **Decide:**
   - Transient → retry once.
   - Structural + fixable → fix the input and retry.
   - Structural + unfixable → record the failure and move to the next item.
   - Blocking all progress → escalate.

3. **Adapt** — if the same tool has failed {{max_retries_per_tool}}+ times in a row, stop using it and find an alternative approach.

**Never silently drop a failed item.** If the item is a task in the colony queue, write the failure to the DB instead of an in-memory buffer:

```bash
sqlite3 "$DB_PATH" "UPDATE tasks SET status='failed', last_error='<one-sentence reason>', completed_at=datetime('now'), updated_at=datetime('now') WHERE id='<task-id>' AND worker_id='<your-worker-id>';"
```

The `tasks.retry_count` column and the stale-claim reclaimer handle auto-retry for crashes; your job is the within-run decision tree above. See `hive.colony-progress-tracker` for the full queue protocol.