hive.error-recovery
hive.error-recovery provides a structured decision tree for handling tool call failures by classifying them as transient or structural, then determining whether to retry, fix input, escalate, or record failure to the database. Use this skill when building resilient agent workflows that must distinguish between recoverable errors and permanent failures, and when tasks need persistent tracking rather than silent drops.
git clone --depth 1 https://github.com/aden-hive/hive /tmp/hive.error-recovery && cp -r /tmp/hive.error-recovery/core/framework/skills/_default_skills/error-recovery ~/.claude/skills/hive.error-recovery
SKILL.md
## Operational Protocol: Error Recovery
When a tool call fails:
1. **Diagnose** — classify the failure as *transient* (network blip, rate limit, timeout) or *structural* (wrong selector, missing auth, invalid schema, permission denied).
2. **Decide:**
   - Transient → retry once.
   - Structural + fixable → fix the input and retry.
   - Structural + unfixable → record the failure and move to the next item.
   - Blocking all progress → escalate.
3. **Adapt** — if the same tool has failed {{max_retries_per_tool}}+ times in a row, stop using it and find an alternative approach.
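The three steps above can be sketched as a small pure function. This is a minimal illustration, not part of the skill: `Kind`, `Action`, `Failure`, and `decide` are hypothetical names, and the order in which the rules are checked (escalation first, then the "Adapt" cutoff, then the per-kind rules) is an assumption, since the protocol does not state a precedence. What happens after a transient retry also fails is likewise assumed here to be "record and move on".

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    TRANSIENT = "transient"    # network blip, rate limit, timeout
    STRUCTURAL = "structural"  # wrong selector, missing auth, invalid schema, permission denied


class Action(Enum):
    RETRY_ONCE = "retry once"
    FIX_AND_RETRY = "fix the input and retry"
    RECORD_AND_SKIP = "record the failure and move to the next item"
    ESCALATE = "escalate"
    SWITCH_APPROACH = "stop using this tool and find an alternative"


@dataclass
class Failure:
    kind: Kind
    fixable: bool = False              # only meaningful for structural failures
    blocks_all_progress: bool = False
    already_retried: bool = False      # True once the single transient retry was used
    consecutive_failures: int = 1      # failures of this same tool in a row


def decide(failure: Failure, max_retries_per_tool: int) -> Action:
    """Map a diagnosed failure to the next step (hypothetical helper)."""
    # Assumed precedence: a failure that blocks everything is escalated first.
    if failure.blocks_all_progress:
        return Action.ESCALATE
    # "Adapt": the tool has failed max_retries_per_tool or more times in a row.
    if failure.consecutive_failures >= max_retries_per_tool:
        return Action.SWITCH_APPROACH
    if failure.kind is Kind.TRANSIENT:
        # "Retry once": a repeat transient failure is recorded, not retried again
        # (assumption; the protocol does not spell this case out).
        return Action.RECORD_AND_SKIP if failure.already_retried else Action.RETRY_ONCE
    if failure.fixable:
        return Action.FIX_AND_RETRY
    return Action.RECORD_AND_SKIP
```

`max_retries_per_tool` is left as a required argument because its value comes from the skill's own configuration. Keeping the decision separate from the code that performs retries or writes makes each branch easy to check in isolation.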
**Never silently drop a failed item.** If the item is a task in the colony queue, write the failure to the DB instead of an in-memory buffer:
```bash
sqlite3 "$DB_PATH" "UPDATE tasks SET status='failed', last_error='<one-sentence reason>', completed_at=datetime('now'), updated_at=datetime('now') WHERE id='<task-id>' AND worker_id='<your-worker-id>';"
```
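When the failure reason comes from a tool's error message, it may contain quotes that break the inline SQL string above. The following is a sketch of the same `UPDATE` issued through Python's standard `sqlite3` module with bound parameters. `record_failure` is a hypothetical helper, and the demo table is a throwaway stand-in containing only the columns named in the statement above; the real schema belongs to the colony queue.

```python
import sqlite3


def record_failure(conn: sqlite3.Connection, task_id: str, worker_id: str, reason: str) -> bool:
    """Mark a task as failed; return True only if exactly one row was updated."""
    cur = conn.execute(
        "UPDATE tasks SET status='failed', last_error=?, "
        "completed_at=datetime('now'), updated_at=datetime('now') "
        "WHERE id=? AND worker_id=?",
        (reason, task_id, worker_id),  # bound parameters, so quotes in `reason` are safe
    )
    conn.commit()
    # Zero rows means the id/worker_id pair did not match, so nothing was recorded.
    return cur.rowcount == 1


# Demo against an in-memory stand-in table (assumed columns, placeholder values).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tasks (id TEXT PRIMARY KEY, worker_id TEXT, status TEXT, "
    "last_error TEXT, completed_at TEXT, updated_at TEXT)"
)
conn.execute("INSERT INTO tasks (id, worker_id, status) VALUES ('t1', 'w1', 'claimed')")
ok = record_failure(conn, "t1", "w1", "Selector '#submit' not found on the page.")
```

Checking the row count matters because the `WHERE` clause also filters on `worker_id`: an update that matches no row would otherwise look like a successful write while the failure goes unrecorded.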
The `tasks.retry_count` column and the stale-claim reclaimer handle auto-retry for crashes; your job is the within-run decision tree above. See `hive.colony-progress-tracker` for the full queue protocol.
SOP for debugging browser automation failures on complex websites. Use when browser tools fail on specific sites like LinkedIn, Twitter/X, SPAs, or sites with Shadow DOM.
Claim tasks, record step progress, and verify SOP gates in the colony SQLite queue. Applies when your spawn message includes a db_path field.
Proactively extract critical values from tool results into working notes before automatic context pruning destroys them.
Maintain a free-form scratchpad of decisions, extracted values, and open questions so context pruning doesn't lose anything you still need.
Periodically self-assess output quality to catch degradation before the judge does.
Author a new Agent Skill for a Hive agent that conforms to the Agent Skills specification (SKILL.md with YAML frontmatter, optional scripts/references/assets directories). Use when the user asks to create, scaffold, add, or package a new skill for a Hive agent.