task-orchestrator
Autonomous multi-agent task orchestration with dependency analysis, parallel tmux/Codex execution, and self-healing heartbeat monitoring. Use for large projects with multiple issues/tasks that need coordinated parallel execution.
git clone --depth 1 https://github.com/jdrhyne/agent-skills /tmp/task-orchestrator && cp -r /tmp/task-orchestrator/skills/task-orchestrator ~/.claude/skills/task-orchestrator
# Task Orchestrator
Autonomous orchestration of multi-agent builds using tmux + Codex with self-healing monitoring.
**Load the senior-engineering skill alongside this one for engineering principles.**
## Safety Boundaries
- Do not launch parallel workers for tasks with overlapping write scope until the dependency is resolved.
- Do not push branches, merge work, or self-heal by guessing when human review is required.
- Do not store secrets in manifests, logs, prompts, or tmux pane captures.
- Do not continue retrying a failing task indefinitely; stop and surface the blocker after bounded retries.
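One way to reduce the risk of secrets landing in pane captures is to pass captured output through a filter before writing it to disk. The sketch below is only an illustration: `redact_secrets` is a hypothetical helper, and its patterns (GitHub-style token prefixes, bearer headers, `NAME=value` assignments) are assumptions that will miss other credential formats.

```bash
#!/usr/bin/env bash
# redact_secrets: filter stdin, masking strings that look like credentials.
# The patterns are illustrative assumptions, not an exhaustive list.
redact_secrets() {
  sed -E \
    -e 's/gh[pousr]_[A-Za-z0-9]{20,}/[REDACTED]/g' \
    -e 's/(Authorization: *Bearer +)[^[:space:]]+/\1[REDACTED]/g' \
    -e 's/([A-Za-z_]*(TOKEN|SECRET|PASSWORD|API_KEY)[A-Za-z_]*=)[^[:space:]]+/\1[REDACTED]/g'
}
```

A capture could then be written as `tmux -S "$SOCKET" capture-pane -p -t "$session" -S -100 | redact_secrets > "$WORKDIR/logs/$task_id-error.log"`. Treat this as a backstop only; keeping secrets out of the panes in the first place remains the actual boundary.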
## Core Concepts
### 1. Task Manifest
A JSON file defining all tasks, their dependencies, files touched, and status.
```json
{
  "project": "project-name",
  "repo": "owner/repo",
  "workdir": "/path/to/worktrees",
  "created": "2026-01-17T00:00:00Z",
  "model": "gpt-5.2-codex",
  "modelTier": "high",
  "phases": [
    {
      "name": "Phase 1: Critical",
      "tasks": [
        {
          "id": "t1",
          "issue": 1,
          "title": "Fix X",
          "files": ["src/foo.js"],
          "dependsOn": [],
          "status": "pending",
          "worktree": null,
          "tmuxSession": null,
          "startedAt": null,
          "lastProgress": null,
          "completedAt": null,
          "prNumber": null
        }
      ]
    }
  ]
}
```
### 2. Dependency Rules
- **Same file = sequential**: Tasks touching the same file must run one after another, or be merged into a single task
- **Different files = parallel**: Tasks with no files in common can run simultaneously
- **Explicit depends = wait**: A task starts only after every task in its `dependsOn` array is done
- **Phase gates**: The next phase starts only when every task in the current phase is complete
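The first three rules can be sketched as two small shell predicates. This is a minimal illustration, not the orchestrator's actual logic: `files_overlap` and `deps_satisfied` are hypothetical helper names, and both assume the lists are passed as space-separated strings whose entries contain no spaces or glob characters.

```bash
#!/usr/bin/env bash
# files_overlap "A_FILES" "B_FILES": succeed when any path is in both lists.
# Assumption: entries are space-separated and contain no spaces or globs.
files_overlap() {
  local a b
  for a in $1; do
    for b in $2; do
      [ "$a" = "$b" ] && return 0
    done
  done
  return 1
}

# deps_satisfied "DONE_IDS" "DEPENDS_ON": succeed when every id listed in
# DEPENDS_ON also appears in DONE_IDS (an empty DEPENDS_ON always passes).
deps_satisfied() {
  local dep id found
  for dep in $2; do
    found=1
    for id in $1; do
      [ "$id" = "$dep" ] && { found=0; break; }
    done
    [ "$found" -eq 0 ] || return 1
  done
  return 0
}
```

Under these assumptions, a task would be launched only when `deps_satisfied` succeeds and `files_overlap` fails against every task that is currently running.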
### 3. Execution Model
- Each task gets its own **git worktree** (isolated branch)
- Each task runs in its own **tmux session**
- Use **Codex with --yolo** for autonomous execution
- Model: **GPT-5.2-codex high** (configurable)
---
## Setup Commands
### Initialize Orchestration
```bash
# 1. Create working directory
WORKDIR="${TMPDIR:-/tmp}/orchestrator-$(date +%s)"
mkdir -p "$WORKDIR"
# 2. Clone repo for worktrees
git clone https://github.com/OWNER/REPO.git "$WORKDIR/repo"
cd "$WORKDIR/repo"
# 3. Choose the tmux socket path (tmux creates the socket on first use)
SOCKET="$WORKDIR/orchestrator.sock"
# 4. Initialize manifest
cat > "$WORKDIR/manifest.json" << 'EOF'
{
"project": "PROJECT_NAME",
"repo": "OWNER/REPO",
"workdir": "WORKDIR_PATH",
"socket": "SOCKET_PATH",
"created": "TIMESTAMP",
"model": "gpt-5.2-codex",
"modelTier": "high",
"phases": []
}
EOF
```
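Because the heredoc above uses a quoted delimiter (`'EOF'`), the placeholders are written literally and still have to be filled in. A possible variant that fills them from shell variables is sketched below; `write_manifest` is a hypothetical helper, and it assumes the values contain no double quotes or backslashes, since nothing is JSON-escaped.

```bash
#!/usr/bin/env bash
# write_manifest WORKDIR PROJECT REPO: write manifest.json with real values.
# Assumption: arguments contain no double quotes or backslashes (no JSON
# escaping is performed).
write_manifest() {
  local workdir="$1" project="$2" repo="$3"
  cat > "$workdir/manifest.json" << EOF
{
  "project": "$project",
  "repo": "$repo",
  "workdir": "$workdir",
  "socket": "$workdir/orchestrator.sock",
  "created": "$(date -u +%Y-%m-%dT%H:%M:%SZ)",
  "model": "gpt-5.2-codex",
  "modelTier": "high",
  "phases": []
}
EOF
}
```

For example, `write_manifest "$WORKDIR" "PROJECT_NAME" "OWNER/REPO"` would produce the same structure as the template, with the working directory, socket path, and a UTC timestamp filled in.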
### Analyze GitHub Issues for Dependencies
```bash
# Fetch open issues (gh returns only 30 by default; raise --limit as needed)
gh issue list --repo OWNER/REPO --state open --limit 200 --json number,title,body,labels > issues.json
# Group by files mentioned in issue body
# Tasks touching same files should serialize
```
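The grouping step is left as comments above. One rough way to find the files an issue mentions is a regular-expression pass over its body, as sketched here. `extract_paths` is a hypothetical helper, and the heuristic (a token with at least one slash that ends in a file extension) is an assumption: it will also pick up things like URL fragments, so the result needs review before it drives scheduling.

```bash
#!/usr/bin/env bash
# extract_paths: read text on stdin and print each distinct path-like token.
# Heuristic (assumption): a path has at least one "/" and a file extension.
extract_paths() {
  grep -oE '[A-Za-z0-9_.-]+(/[A-Za-z0-9_.-]+)+\.[A-Za-z0-9]+' | sort -u
}
```

Running each issue body through `extract_paths` gives a per-issue file list; issues whose lists share an entry are candidates for serializing.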
### Create Worktrees
```bash
# For each task, create isolated worktree
cd "$WORKDIR/repo"
git worktree add -b fix/issue-N "$WORKDIR/task-tN" main
```
### Launch Tmux Sessions
```bash
SOCKET="$WORKDIR/orchestrator.sock"
# Create session for task
tmux -S "$SOCKET" new-session -d -s "task-tN"
# Launch Codex (uses gpt-5.2-codex with reasoning_effort=high from ~/.codex/config.toml)
# Note: Model config is in ~/.codex/config.toml, not CLI flag
tmux -S "$SOCKET" send-keys -t "task-tN" \
"cd $WORKDIR/task-tN && codex --yolo 'Fix issue #N: DESCRIPTION. Run tests, commit with good message, push to origin.'" Enter
```
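The launch command above nests a single-quoted prompt inside a double-quoted string, which breaks as soon as the task description contains an apostrophe. A possible workaround, assuming the shell inside the tmux pane is bash or otherwise understands bash-style quoting, is to build the command with `printf %q`. `build_codex_cmd` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# build_codex_cmd DIR PROMPT: print a command line for `tmux send-keys` in
# which DIR and PROMPT are shell-quoted, so quotes or `$` in the prompt are
# passed to codex literally.
# Assumption: the receiving shell accepts bash's %q quoting.
build_codex_cmd() {
  local dir="$1" prompt="$2"
  printf 'cd %q && codex --yolo %q' "$dir" "$prompt"
}
```

It would be used as `tmux -S "$SOCKET" send-keys -t "task-tN" "$(build_codex_cmd "$WORKDIR/task-tN" "Fix issue #N: DESCRIPTION.")" Enter`, leaving the rest of the launch sequence unchanged.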
---
## Monitoring & Self-Healing
### Progress Check Script
```bash
#!/bin/bash
# check_progress.sh - Run via heartbeat
WORKDIR="${1:?usage: check_progress.sh WORKDIR}"
SOCKET="$WORKDIR/orchestrator.sock"
MANIFEST="$WORKDIR/manifest.json"   # not read by this script
STALL_THRESHOLD_MINS=20             # not used by this script

check_session() {
  local session="$1"
  local task_id="$2"
  # Capture recent output (declared separately so `local` does not mask
  # the exit status of tmux)
  local output
  output=$(tmux -S "$SOCKET" capture-pane -p -t "$session" -S -50 2>/dev/null)
  # Check for completion indicators; "❯ *$" also matches an idle prompt
  # whose trailing space was trimmed by capture-pane
  if echo "$output" | grep -qE "(All tests passed|Successfully pushed|❯ *$)"; then
    echo "DONE:$task_id"
    return 0
  fi
  # Check for errors
  if echo "$output" | grep -qiE "(error:|failed:|FATAL|panic)"; then
    echo "ERROR:$task_id"
    return 1
  fi
  # Check for stall (prompt waiting for input)
  if echo "$output" | grep -qE "(\? |Continue\?|y/n|Press any key)"; then
    echo "STUCK:$task_id:waiting_for_input"
    return 2
  fi
  echo "RUNNING:$task_id"
  return 0
}

# Check all active sessions (the session name doubles as the task id)
for session in $(tmux -S "$SOCKET" list-sessions -F "#{session_name}" 2>/dev/null); do
  check_session "$session" "$session"
done
```
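The script prints one status line per session (`DONE:`, `ERROR:`, `STUCK:`, `RUNNING:`). To decide whether a phase gate can open, those lines could be rolled up into a single state, as in the sketch below. `phase_state` and its four output labels are hypothetical, and the sketch assumes every session on the socket belongs to the current phase.

```bash
#!/usr/bin/env bash
# phase_state: read check_progress.sh status lines on stdin and print one of
#   EMPTY        no status lines were read
#   NEEDS_HELP   at least one ERROR: or STUCK: line
#   COMPLETE     every line is DONE:<id>
#   IN_PROGRESS  anything else
phase_state() {
  local line total=0 done_count=0 trouble=0
  while IFS= read -r line; do
    [ -n "$line" ] || continue
    total=$((total + 1))
    case "$line" in
      DONE:*) done_count=$((done_count + 1)) ;;
      ERROR:*|STUCK:*) trouble=$((trouble + 1)) ;;
    esac
  done
  if [ "$total" -eq 0 ]; then
    echo EMPTY
  elif [ "$trouble" -gt 0 ]; then
    echo NEEDS_HELP
  elif [ "$done_count" -eq "$total" ]; then
    echo COMPLETE
  else
    echo IN_PROGRESS
  fi
}
```

For example, `./check_progress.sh "$WORKDIR" | phase_state` would print `COMPLETE` only when every session reports `DONE`, which is the condition for advancing to the next phase.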
### Self-Healing Actions
When a task is stuck, the orchestrator should:
1. **Waiting for input** → Send appropriate response
```bash
tmux -S "$SOCKET" send-keys -t "$session" "y" Enter
```
2. **Error/failure** → Capture logs, analyze, retry with fixes
```bash
# Capture error context
mkdir -p "$WORKDIR/logs"
ERROR_LOG="$WORKDIR/logs/$task_id-error.log"
tmux -S "$SOCKET" capture-pane -p -t "$session" -S -100 > "$ERROR_LOG"
# Kill and restart with error context
tmux -S "$SOCKET" kill-session -t "$session"
tmux -S "$SOCKET" new-session -d -s "$session"
# Point Codex at the log file rather than inlining its text, so quotes in
# the captured output cannot break the command line. The model comes from
# ~/.codex/config.toml, as in the initial launch.
tmux -S "$SOCKET" send-keys -t "$session" \
"cd $WORKDIR/$task_id && codex --yolo 'Previous attempt failed. Read the last 20 lines of $ERROR_LOG, fix the issue and retry.'" Enter
```
3. **No progress for 20+ mins** → Nudge or restart
```bash
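# Check git log for recent commits (epoch seconds are comparable; "%ar" is not)
cd "$WORKDIR/$task_id"
LAST_COMMIT_EPOCH=$(git log -1 --format="%ct" 2>/dev/null || echo 0)
IDLE_MINS=$(( ($(date +%s) - ${LAST_COMMIT_EPOCH:-0}) / 60 ))
# If no commits in threshold, restart (a fresh worktree only has the base
# branch's last commit, so also compare against the task's start time)
if [ "$IDLE_MINS" -ge 20 ]; then
  echo "STALLED:$task_id:${IDLE_MINS}m"
fi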
```
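The Safety Boundaries call for bounded retries, but the recovery steps above do not count attempts. A minimal sketch of a per-task counter follows. `should_retry` is a hypothetical helper, the `.attempts` file layout is an assumption, and the default limit of 3 is an arbitrary illustrative value.

```bash
#!/usr/bin/env bash
# should_retry STATE_DIR TASK_ID [MAX]: increment the task's attempt counter
# and succeed only while the new count is within MAX (default 3, an
# illustrative value). The count is kept in STATE_DIR/TASK_ID.attempts.
should_retry() {
  local dir="$1" task="$2" max="${3:-3}" file count
  mkdir -p "$dir"
  file="$dir/$task.attempts"
  count=$(cat "$file" 2>/dev/null || echo 0)
  count=$((count + 1))
  echo "$count" > "$file"
  [ "$count" -le "$max" ]
}
```

The restart in step 2 could then be guarded with `if should_retry "$WORKDIR/state" "$task_id"; then ... else echo "BLOCKED:$task_id"; fi`, so that a task which keeps failing is surfaced instead of being restarted indefinitely.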
### Heartbeat Cron Setup
```bash
# Add to cron (every 15 minutes)
cron action:add job:{
"label": "orchestrator-heartbeat",
"schedule": "*/15 * * * *",
"prompt": "Check orchestration progress at WORKDIR. Read manifest, check all tmux sessions, self-heal any stuck tasks, advance to next phase if current is complete. Do not ping the human for routine recoveries; after bounded retries, or when human review is required, stop and surface the blocker."
}
```
---
## Workflow: Full Orchestration Run
### Step 1: Analyze & Plan
```bash
# 1. Fetch issues
gh issue list --rep