Fleet Control
Fleet Control monitors a fleet of Aeon instances registered in a JSON registry, performing health checks, status summaries, and skill dispatch operations. It verifies GitHub authentication and rate limits before each run, retrieves repository metadata and recent workflow runs for each instance, tracks state changes across runs, and outputs decision-ready verdicts with per-instance action recommendations. Use this skill to maintain visibility into managed Aeon deployments and remotely trigger skills on healthy or degraded instances.
git clone --depth 1 https://github.com/aaronjmars/aeon /tmp/fleet-control && cp -r /tmp/fleet-control/skills/fleet-control ~/.claude/skills/fleet-controlSKILL.md
<!-- autoresearch: variation B — sharper output: verdict line + delta vs prior + per-instance action column + state-change-gated notify -->
> **${var}** — Command. Empty (or unrecognized) → Health Check (default). `status` → full Status Mode. `dispatch <instance|*> <skill> [var=<value>]` → trigger a skill on one child or all healthy/degraded children.
Today is ${today}. Operate the fleet of Aeon instances registered in `memory/instances.json`. Output is **decision-ready**: every run leads with a verdict, then a delta vs prior check, then per-instance lines that name the next concrete action.
## Pre-flight (every mode)
1. **Verify gh auth** — `gh auth status` must succeed. If not, log `FLEET_NO_AUTH` to `memory/logs/${today}.md` and notify `Fleet Control: gh auth missing — check GITHUB_TOKEN secret.` Stop.
2. **Check rate limit** — `REMAINING=$(gh api rate_limit --jq '.resources.core.remaining')`. If `REMAINING < 50`, log `FLEET_RATE_LIMITED:remaining=${REMAINING}` and notify a one-line warning, then stop.
3. **Load the registry** — read `memory/instances.json`. If the file is missing, write `{"instances": []}` to bootstrap. If `.instances` is absent or `[]`:
- Log `FLEET_EMPTY: no managed instances` to `memory/logs/${today}.md`.
- **Stop. Do NOT notify.**
4. **Load prior state** — read `memory/state/fleet-control-state.json` (create the directory and file with `{"instances": {}, "last_full_summary_date": ""}` if missing). Shape:
```json
{
"instances": {
"<name>": { "health": "<status>", "last_checked": "<ISO>", "consecutive_unreachable": 0 }
},
"last_full_summary_date": "YYYY-MM-DD"
}
```
5. **Parse var → mode**:
- empty / unrecognized → **Health Check Mode**
- exactly `status` → **Status Mode**
- starts with `dispatch ` → **Dispatch Mode**
---
## Health Check Mode (default)
For each registered instance, skip rows with `archived: true` from per-instance work (count them separately). Run the three calls per instance in parallel using `&` + `wait` and write each to `/tmp/fleet/${SAFE}.{repo,runs,cron}.json`:
a. **Repo metadata**:
```bash
gh api "repos/${REPO}" \
--jq '{full_name, pushed_at, archived, default_branch, open_issues_count}' \
> "/tmp/fleet/${SAFE}.repo.json" 2>"/tmp/fleet/${SAFE}.repo.err" &
```
b. **Workflow runs in last 24h** (precise window, not "last 5"):
```bash
SINCE=$(date -u -d '24 hours ago' +%Y-%m-%dT%H:%M:%SZ)
gh api "repos/${REPO}/actions/runs?created=>${SINCE}&per_page=100&exclude_pull_requests=true" \
--jq '{total_count, runs:[.workflow_runs[]|{name,status,conclusion,created_at,html_url}]}' \
> "/tmp/fleet/${SAFE}.runs.json" 2>"/tmp/fleet/${SAFE}.runs.err" &
```
c. **Cron-state from child**:
```bash
gh api "repos/${REPO}/contents/memory/cron-state.json" --jq '.content' 2>"/tmp/fleet/${SAFE}.cron.err" \
| base64 -d > "/tmp/fleet/${SAFE}.cron.json" &
```
`wait` after launching all three for an instance (or batch across all instances if you trust your parallelism — keep ≤16 concurrent calls to stay under rate limit).
**Classify each instance** with precise thresholds:
- **unreachable** — repo metadata call returned non-zero (404/403/etc.)
- **archived** — repo metadata returns `archived: true`
- **pending_secrets** — `runs.total_count == 0` for the 24h window AND repo `pushed_at` ≥ 7 days old (newly-spawned instances under 7 days stay unclassified-but-tracked)
- **stale** — `runs.total_count == 0` AND `pushed_at` > 7 days old AND not `archived`
- **degraded** — ≥1 cron-state skill with `consecutive_failures ≥ 3` OR (24h failure_count / total_count) ≥ 0.5 with total_count ≥ 2
- **warning** — 24h failure_count ≥ 1 but ratio < 0.5
- **healthy** — has runs in last 24h, all conclusions `success` or `in_progress`/`queued`, no degraded cron-state skills
For each instance compute a **next_action** (one short imperative phrase):
- `pending_secrets` → `add ANTHROPIC_API_KEY at https://github.com/${REPO}/settings/secrets/actions`
- `degraded` → `investigate <skill_name> (<consecutive_failures>× in a row, last_error: <signature, ≤60 chars>)`
- `warning` → `monitor — <N>/<Total> runs failed in 24h`
- `stale` → `confirm intent: no runs in 24h, last push <relative_date>; archive or re-enable`
- `unreachable` → `verify access: <reason from repo.err>`
- `healthy` → `none`
- `archived` → `none (archived)`
**Compute delta** vs prior state (per-instance `prior.health` vs `current.health`):
- **NEW** — instance not in prior state
- **DEGRADED** — was healthy/warning, now degraded/unreachable/stale/pending_secrets
- **RECOVERED** — was degraded/unreachable/stale/pending_secrets, now healthy/warning
- **DROPPED** — was in prior state, no longer in registry
- (no change → no delta line)
**Update the registry** — write back `health`, `last_checked` (ISO UTC), and `next_action` per instance to `memory/instances.json`. Preserve all other fields (`purpose`, `parent`, `created`, `skills_enabled`, etc.).
**Update the state file** — write the current per-instance health snapshot to `memory/state/fleet-control-state.json`. Update `last_full_summary_date` to today **only when this run notifies**. Increment `consecutive_unreachable` for unreachable instances; reset to 0 otherwise.
**Log** to `memory/logs/${today}.md`:
```
## fleet-control (health check)
- Verdict: [FLEET_OK | NEEDS_ATTENTION:N]
- Sizes: total=N, healthy=N, warning=N, degraded=N, stale=N, pending=N, unreachable=N, archived=N
- Deltas: [list NEW/DEGRADED/RECOVERED/DROPPED, or "none"]
- Sources: gh=ok, rate_remaining=N
```
**Notification gate** — send the notification if **any** of:
- `len(deltas) > 0`
- today != prior `last_full_summary_date` (first check of UTC day → daily rollup)
- any current instance is `degraded` or `unreachable`
Otherwise skip notify (silent no-op when nothing changed mid-day — operator isn't trained to ignore).
**Notification body** (when sent):
```
*Fleet Control — ${today}*
Verdict: <Mention/keyword sweep on social platforms for [REPLACE: KEYWORDS] — trends, sentiment, top posts
5 concrete real-life actions, leverage-scored against open loops with specificity and anti-fluff gates
Curated AI-agent tweets, clustered into narratives with insight summaries
Tracker of AI agent substitution signals — which roles, companies, and industries show real headcount displacement. Named roles + real deployments only.
Competitive-intelligence digest on the AI agent framework space — momentum, releases, breaking changes across a curated watchlist
Cross-domain market pulse from AIXBT's free grounding endpoint — crypto, macro, tradfi, geopolitics. Refreshes taxonomy references (clusters, chains) as a bonus.
Pre-batch API provider health check — detects credit exhaustion or auth failure for every configured provider key before the scheduled batch runs, giving the operator a window to act before skills degrade
List a wallet's live ERC-20 token approvals on Base and flag unlimited / risky spender grants. Keyless via Base RPC (eth_getLogs + eth_call) — no explorer key needed.