Skip to main content
ClaudeWave
Skill235 repo starsupdated 3d ago

task-orchestrator

Autonomous multi-agent task orchestration with dependency analysis, parallel tmux/Codex execution, and self-healing heartbeat monitoring. Use for large projects with multiple issues/tasks that need coordinated parallel execution.

Install in Claude Code
Copy
git clone --depth 1 https://github.com/jdrhyne/agent-skills /tmp/task-orchestrator && cp -r /tmp/task-orchestrator/skills/task-orchestrator ~/.claude/skills/task-orchestrator
Then start a new Claude Code session; the skill loads automatically.

SKILL.md

# Task Orchestrator

Autonomous orchestration of multi-agent builds using tmux + Codex with self-healing monitoring.

**Load the senior-engineering skill alongside this one for engineering principles.**

## Safety Boundaries

- Do not launch parallel workers for tasks with overlapping write scope until the dependency is resolved.
- Do not push branches, merge work, or self-heal by guessing when human review is required.
- Do not store secrets in manifests, logs, prompts, or tmux pane captures.
- Do not continue retrying a failing task indefinitely; stop and surface the blocker after bounded retries.

## Core Concepts

### 1. Task Manifest
A JSON file defining all tasks, their dependencies, files touched, and status.

```json
{
  "project": "project-name",
  "repo": "owner/repo",
  "workdir": "/path/to/worktrees",
  "created": "2026-01-17T00:00:00Z",
  "model": "gpt-5.2-codex",
  "modelTier": "high",
  "phases": [
    {
      "name": "Phase 1: Critical",
      "tasks": [
        {
          "id": "t1",
          "issue": 1,
          "title": "Fix X",
          "files": ["src/foo.js"],
          "dependsOn": [],
          "status": "pending",
          "worktree": null,
          "tmuxSession": null,
          "startedAt": null,
          "lastProgress": null,
          "completedAt": null,
          "prNumber": null
        }
      ]
    }
  ]
}
```

### 2. Dependency Rules
- **Same file = sequential** — Tasks touching the same file must run in order or merge
- **Different files = parallel** — Independent tasks can run simultaneously
- **Explicit depends = wait** — `dependsOn` array enforces ordering
- **Phase gates** — Next phase waits for current phase completion

### 3. Execution Model
- Each task gets its own **git worktree** (isolated branch)
- Each task runs in its own **tmux session**
- Use **Codex with --yolo** for autonomous execution
- Model: **GPT-5.2-codex high** (configurable)

---

## Setup Commands

### Initialize Orchestration

```bash
# 1. Create working directory
WORKDIR="${TMPDIR:-/tmp}/orchestrator-$(date +%s)"
mkdir -p "$WORKDIR"

# 2. Clone repo for worktrees
git clone https://github.com/OWNER/REPO.git "$WORKDIR/repo"
cd "$WORKDIR/repo"

# 3. Create tmux socket
SOCKET="$WORKDIR/orchestrator.sock"

# 4. Initialize manifest
cat > "$WORKDIR/manifest.json" << 'EOF'
{
  "project": "PROJECT_NAME",
  "repo": "OWNER/REPO",
  "workdir": "WORKDIR_PATH",
  "socket": "SOCKET_PATH",
  "created": "TIMESTAMP",
  "model": "gpt-5.2-codex",
  "modelTier": "high",
  "phases": []
}
EOF
```

### Analyze GitHub Issues for Dependencies

```bash
# Fetch all open issues
gh issue list --repo OWNER/REPO --state open --json number,title,body,labels > issues.json

# Group by files mentioned in issue body
# Tasks touching same files should serialize
```

### Create Worktrees

```bash
# For each task, create isolated worktree
cd "$WORKDIR/repo"
git worktree add -b fix/issue-N "$WORKDIR/task-tN" main
```

### Launch Tmux Sessions

```bash
SOCKET="$WORKDIR/orchestrator.sock"

# Create session for task
tmux -S "$SOCKET" new-session -d -s "task-tN"

# Launch Codex (uses gpt-5.2-codex with reasoning_effort=high from ~/.codex/config.toml)
# Note: Model config is in ~/.codex/config.toml, not CLI flag
tmux -S "$SOCKET" send-keys -t "task-tN" \
  "cd $WORKDIR/task-tN && codex --yolo 'Fix issue #N: DESCRIPTION. Run tests, commit with good message, push to origin.'" Enter
```

---

## Monitoring & Self-Healing

### Progress Check Script

```bash
#!/bin/bash
# check_progress.sh - Run via heartbeat

WORKDIR="$1"
SOCKET="$WORKDIR/orchestrator.sock"
MANIFEST="$WORKDIR/manifest.json"
STALL_THRESHOLD_MINS=20

check_session() {
  local session="$1"
  local task_id="$2"
  
  # Capture recent output
  local output=$(tmux -S "$SOCKET" capture-pane -p -t "$session" -S -50 2>/dev/null)
  
  # Check for completion indicators
  if echo "$output" | grep -qE "(All tests passed|Successfully pushed|❯ $)"; then
    echo "DONE:$task_id"
    return 0
  fi
  
  # Check for errors
  if echo "$output" | grep -qiE "(error:|failed:|FATAL|panic)"; then
    echo "ERROR:$task_id"
    return 1
  fi
  
  # Check for stall (prompt waiting for input)
  if echo "$output" | grep -qE "(\? |Continue\?|y/n|Press any key)"; then
    echo "STUCK:$task_id:waiting_for_input"
    return 2
  fi
  
  echo "RUNNING:$task_id"
  return 0
}

# Check all active sessions
for session in $(tmux -S "$SOCKET" list-sessions -F "#{session_name}" 2>/dev/null); do
  check_session "$session" "$session"
done
```

### Self-Healing Actions

When a task is stuck, the orchestrator should:

1. **Waiting for input** → Send appropriate response
   ```bash
   tmux -S "$SOCKET" send-keys -t "$session" "y" Enter
   ```

2. **Error/failure** → Capture logs, analyze, retry with fixes
   ```bash
   # Capture error context
   tmux -S "$SOCKET" capture-pane -p -t "$session" -S -100 > "$WORKDIR/logs/$task_id-error.log"
   
   # Kill and restart with error context
   tmux -S "$SOCKET" kill-session -t "$session"
   tmux -S "$SOCKET" new-session -d -s "$session"
   tmux -S "$SOCKET" send-keys -t "$session" \
     "cd $WORKDIR/$task_id && codex --model gpt-5.2-codex-high --yolo 'Previous attempt failed with: $(cat error.log | tail -20). Fix the issue and retry.'" Enter
   ```

3. **No progress for 20+ mins** → Nudge or restart
   ```bash
   # Check git log for recent commits
   cd "$WORKDIR/$task_id"
   LAST_COMMIT=$(git log -1 --format="%ar" 2>/dev/null)
   
   # If no commits in threshold, restart
   ```

### Heartbeat Cron Setup

```bash
# Add to cron (every 15 minutes)
cron action:add job:{
  "label": "orchestrator-heartbeat",
  "schedule": "*/15 * * * *",
  "prompt": "Check orchestration progress at WORKDIR. Read manifest, check all tmux sessions, self-heal any stuck tasks, advance to next phase if current is complete. Do NOT ping human - fix issues yourself."
}
```

---

## Workflow: Full Orchestration Run

### Step 1: Analyze & Plan

```bash
# 1. Fetch issues
gh issue list --rep
auto-updaterSkill

Automatically update OpenClaw and selected skills once daily. Runs via cron, checks for updates, applies them, and messages the user with a summary of what changed.

clawdbot-release-checkSkill

Check for new OpenClaw releases and notify once per new version.

clawddocsSkill

OpenClaw documentation expert with decision tree navigation, search scripts, doc fetching, version tracking, and config snippets for all OpenClaw features

gallery-scraperSkill

Bulk download images from login-protected gallery websites using an attached browser session. Use when asked to scrape, download, or save images from authenticated gallery pages, extract full-size images from thumbnails, or batch download from multi-page galleries.

knowledge-graphSkill

Three-Layer Memory System — automatic fact extraction, entity-based knowledge graph, and weekly synthesis. Manages life/areas/ entities with atomic facts and living summaries.

self-improving-agentSkill

Captures learnings, errors, and corrections to enable continuous improvement. Use when: (1) A command or operation fails unexpectedly, (2) User corrects Claude ('No, that's wrong...', 'Actually...'), (3) User requests a capability that doesn't exist, (4) An external API or tool fails, (5) Claude realizes its knowledge is outdated or incorrect, (6) A better approach is discovered for a recurring task. Also review learnings before major tasks.

skill-syncSkill

Sync skills between local installation and the GitHub source-of-truth repository. Use when asked to install, update, list, or push skills.

todo-trackerSkill

Persistent TODO scratch pad for tracking tasks across sessions. Use when user says "add to TODO", "what's on the TODO", "mark X done", "show TODO list", "remove from TODO", or asks about pending tasks. Also triggers on heartbeat to remind about stale items.