Skip to main content
ClaudeWave
Skill510 repo starsupdated today

Heartbeat

Heartbeat proactively monitors scheduled skill health by reading cron state and recent logs to surface failures, stuck processes, API degradation, chronic underperformance, and stalled pull requests in priority order. Use it to detect system degradation before users report issues, with optional focus on specific areas via variable input.

Install in Claude Code
Copy
git clone --depth 1 https://github.com/aaronjmars/aeon /tmp/heartbeat && cp -r /tmp/heartbeat/skills/heartbeat ~/.claude/skills/heartbeat
Then start a new Claude Code session; the skill loads automatically.

SKILL.md

> **${var}** — Area to focus on. If empty, runs all checks.

If `${var}` is set, focus checks on that specific area.

Read memory/MEMORY.md and the last 2 days of memory/logs/ for context.

## Checks (in priority order)

### P0 — Failed & stuck skills (check first)

Read `memory/cron-state.json`. This file tracks every scheduled skill's state and quality metrics:
```json
{
  "skill-name": {
    "last_dispatch": "2026-04-06T12:00:00Z",
    "last_status": "dispatched|success|failed",
    "last_success": "2026-04-06T12:05:00Z",
    "last_failed": "2026-04-05T12:03:00Z",
    "total_runs": 10,
    "total_successes": 8,
    "total_failures": 2,
    "consecutive_failures": 0,
    "success_rate": 0.80,
    "last_quality_score": 4,
    "last_error": "error signature text"
  }
}
```

Flag these conditions:
- **Failed skills**: any entry with `last_status: "failed"`. Report the skill name and when it failed.
- **Stuck skills**: any entry with `last_status: "dispatched"` where `last_dispatch` is **>45 minutes ago**. The skill was dispatched but never reported back — likely hung or crashed before the state update step ran.
- **API degradation**: any skill with `consecutive_failures >= 3`. This likely indicates an external API is down or rate-limiting. Report the skill, failure count, and `last_error`. If multiple skills share similar error signatures, flag the shared dependency.
- **Chronic failures**: any skill with `success_rate < 0.5` (and `total_runs >= 5`). The skill is failing more than it succeeds.
- **Self-check**: if heartbeat's own entry shows `last_success` is **>36 hours ago** (or missing), note that heartbeat itself may be unreliable.

### P1 — Stalled PRs & urgent issues

- [ ] Any open PRs stalled > 24h? (use `gh pr list`)
- [ ] Any GitHub issues labeled urgent? (use `gh issue list`)

### P2 — Flagged memory items

- [ ] Anything flagged in memory/MEMORY.md that needs follow-up?

### P3 — Missing scheduled skills

Read `aeon.yml` for enabled skills with schedules. Cross-reference with `memory/cron-state.json`:
- If an enabled skill has **no entry at all** in the state file, it has never been dispatched by the scheduler.
- If a skill's `last_success` is **>2x its schedule interval** old (e.g., a daily skill hasn't succeeded in >48h), flag it.

Do NOT use `gh run list` for this — the state file is authoritative.

## Dedup & notification

Before sending any notification, grep memory/logs/ for the same item. If it appears in the last 48h of logs, skip it. Never notify about the same item twice.

Batch all findings into a **single notification**, grouped by priority tier:
```
🔴 FAILED: skill-a (failed 2h ago), skill-b (stuck 1h ago)
🟡 STALLED: PR #42 open 3 days
🔵 MEMORY: follow-up on X flagged 2 days ago
```

## Public status page

After the priority checks (even when everything is green — this step **always** runs), regenerate `docs/status.md` so the public GitHub Pages site reflects current fleet health.

### Data sources
- `memory/cron-state.json` — per-skill run state (authoritative)
- `memory/issues/INDEX.md` — open issue table
- `aeon.yml` — enabled skill list with schedules
- Latest `articles/token-report-*.md` (most recent by filename date) — optional; powers the Token Pulse section. Skipped silently when no file exists.

### Overall status
Compute one of three overall states from the same signals used above:
- `🔴 DEGRADED` — any P0 flag fired (failed skill, stuck skill, consecutive_failures ≥ 3, chronic failures with success_rate < 0.5, heartbeat self-check >36h stale)
- `🟡 WATCH` — any P1/P2/P3 flag fired (stalled PRs, urgent issues, flagged memory items, skills >2x their schedule interval old) or any open issue with severity `critical` or `high`
- `🟢 OK` — no flags at all

### Format

Write `docs/status.md` with Jekyll frontmatter so it renders as a gallery page:

```markdown
---
layout: default
title: "Status"
permalink: /status/
---

# Agent Status

**Overall:** 🟢 OK
**Updated:** 2026-04-24 19:06 UTC
**Open issues:** 0
**Next scheduled run:** heartbeat at 20:00 UTC

Auto-generated by the `heartbeat` skill on every run (3× daily at 08:00 / 14:00 / 20:00 UTC). If the Updated timestamp is more than ~8h stale, the agent is not running.

## Token pulse

| Token | Price | 24h | Liquidity | Volume (24h) | FDV |
|-------|-------|-----|-----------|--------------|-----|
| AEON | $0.0000032626 | -11.16% | $223.4K | $41.3K | $326.3K |

_Source: `articles/token-report-2026-04-28.md` · verdict: SLIDING_

## Skill health (last 7 days)

| Skill | Last run | Status | Success rate | Consecutive failures |
|-------|----------|--------|-------------:|---------------------:|
| token-report | 2026-04-24 12:30 UTC | ✅ success | 100% | 0 |
| fetch-tweets | 2026-04-24 06:53 UTC | ✅ success | 95% | 0 |
| …           | …                    | …         | …    | … |

## Open issues

_(if INDEX.md has any open rows, render them here; otherwise: "No open issues.")_

| ID | Title | Severity | Category | Detected |
|----|-------|----------|----------|----------|
| ISS-001 | … | medium | rate-limit | 2026-04-22 |

---
*Fork this repo and your copy inherits this page automatically — [how it works](/memory/).*
```

### Rules
- Include **all** enabled skills from `aeon.yml` (not only those with recent runs). For skills with no entry in cron-state.json, show `—` for timestamp and `not yet run` in status.
- Sort the skill table by last-run timestamp descending (most recent first); skills that have never run sink to the bottom.
- Format timestamps as `YYYY-MM-DD HH:MM UTC` (strip seconds and the `Z`).
- Success rate shows `total_successes / total_runs × 100` rounded to whole percent; display `—` when `total_runs == 0`.
- Status column icons: `✅ success`, `❌ failed`, `⏳ dispatched` (if last_dispatch within 45min), `🕸 stuck` (if last_dispatch > 45min and last_status still dispatched), `—` (never run).
- For the `Next scheduled run:` line, pick the enabled skill with the soonest upcoming cron time relative to n