Skill · 851 repo stars · updated yesterday

coral-debug

The coral-debug skill provides reproduction loops and diagnostic guidance for debugging the CORAL package itself across its major components (grader, daemon, CLI, hooks, manager, workspace, hub, template, config, web). It maps code areas to their fastest test commands, lists common failure modes and where to inspect them in `.coral/public/`, and includes canonical lint and end-to-end smoke test procedures. Use this when modifying CORAL's internal codebase or investigating framework bugs, not when authoring tasks or extending with new capabilities.

View source · Repository: CORAL

Install in Claude Code

git clone --depth 1 https://github.com/Human-Agent-Society/CORAL /tmp/coral-debug && cp -r /tmp/coral-debug/.claude/skills/coral-debug ~/.claude/skills/coral-debug

Then open a new Claude Code session; the skill loads automatically.

Definition

SKILL.md

# CORAL debug & change-verification workflows

This skill is for people (and Claude Code) hacking on the CORAL package itself, not for users authoring a task. For authoring guides see the siblings:
- `coral-new-task` — creating a new `examples/<task>/` (seed + task.yaml + grader)
- `coral-extend` — extending CORAL itself (new runtime, new CLI command, new bundled skill, ...)

## Reproduce loops

Pick the smallest one that exercises your change.

| Editing... | Fastest reproduce |
|---|---|
| `coral/grader/{task_grader,loader,subprocess_grader}.py` or a builtin grader | `uv run coral validate examples/circle_packing` (or any example with a packaged grader) |
| `coral/grader/daemon.py` (parallel grading, queueing, worktree isolation) | `uv run pytest tests/test_grader_daemon.py -v` |
| `coral/grader/{base,protocol}.py` | `uv run pytest tests/test_grader.py tests/test_subprocess_grader.py -v` |
| `coral/cli/*.py` (argparse, dispatch, output formatting) | Run the command directly: `uv run coral log --help`, `uv run coral status`, etc. against a `latest` run under `examples/<task>/results/` |
| `coral/hooks/post_commit.py` (`submit_eval`) | `uv run pytest tests/test_hooks.py -v`, then end-to-end via `coral start -c task.yaml agents.count=1` and watch `.coral/public/attempts/` |
| `coral/agent/manager.py`, `state.py`, `exit_classifier.py` | `uv run pytest tests/test_manager_reliability.py tests/test_manager_seen_attempts.py -v` |
| `coral/agent/builtin/*.py` (a runtime) | `uv run pytest tests/test_<runtime>.py` if it exists, then `coral start -c task.yaml agents.runtime=<name> agents.count=1` |
| `coral/workspace/{project,worktree,repo,grader_env}.py` | `uv run pytest tests/test_workspace.py tests/test_grader_env.py -v` |
| `coral/hub/{attempts,notes,skills,checkpoint,heartbeat}.py` | `uv run pytest tests/test_hub.py tests/test_checkpoint.py tests/test_heartbeat.py -v` |
| `coral/template/coral_md.py` or templates | `uv run pytest tests/test_template.py -v` |
| `coral/config.py` | `uv run pytest tests/test_config.py -v` |
| `coral/web/*` (dashboard) | `uv run coral ui` against any existing run; reload the browser after edits |

End-to-end smoke (slow but exercises the whole pipeline):
```bash
uv run coral start -c examples/circle_packing/task.yaml agents.count=1 run.session=local
# Wait for one eval to land in .coral/public/attempts/, then:
uv run coral stop
```
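
The "wait for one eval to land" step can be scripted instead of watched by hand. The following is a minimal sketch, assuming attempts show up as `*.json` files directly under `<run>/.coral/public/attempts/`, as the paths elsewhere in this document suggest. `wait_for_first_attempt`, its parameters, and the default polling interval are hypothetical and not part of the CORAL CLI.

```python
import time
from pathlib import Path
from typing import Optional


def wait_for_first_attempt(run_dir: str, timeout_s: float = 600.0,
                           poll_s: float = 2.0) -> Optional[Path]:
    """Poll <run_dir>/.coral/public/attempts/ until a *.json file appears.

    Returns the first attempt file found (sorted by name), or None if
    nothing shows up before the timeout. Hypothetical helper.
    """
    # Assumed layout: one JSON file per attempt in this directory.
    attempts = Path(run_dir) / ".coral" / "public" / "attempts"
    deadline = time.monotonic() + timeout_s
    while True:
        if attempts.is_dir():
            found = sorted(attempts.glob("*.json"))
            if found:
                return found[0]
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll_s)
```

A file appearing does not by itself mean grading has finished; the attempt may still be pending, so check its `status` field before running `uv run coral stop`.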

## Where to look when X breaks

| Symptom | First place to look |
|---|---|
| Grader hangs / pending attempts pile up | `.coral/public/grader_daemon.pid` (is daemon alive?), `.coral/public/grader_daemon_heartbeat` (mtime), `.coral/public/eval_logs/<hash>/` for grader stdout/stderr |
| `coral eval` errors out | The pending JSON itself: `.coral/public/attempts/<hash>.json` — `status` and `feedback` fields. Source: `coral/hooks/post_commit.py::submit_eval` |
| Agent restart loop | `coral status` shows pause state. Source: `coral/agent/manager.py` (`restart_burst_threshold`, `restart_burst_window`). Per-agent runtime stdout/stderr is captured under the worktree. |
| Stalled agent | `agents.timeout` watchdog in `coral/agent/manager.py`. Grader-queue exemption (`grader_pending_max_age`) skips the watchdog while a recent submission is still pending. |
| Heartbeat actions not firing | `.coral/public/heartbeat/<agent_id>.json` (per-agent action list), `.coral/public/eval_count` (global counter). Logic: `coral/agent/heartbeat.py::HeartbeatRunner.check`. |
| Shared state corrupted / race | `.coral/public/.git/` is a real git repo — `git -C .coral/public log` shows checkpoint history; `coral notes --history` is the friendly view. |
| Worktree symlinks missing | `coral/workspace/worktree.py` — symlinks `.coral/public/` into each agent worktree under the runtime-specific name (`.claude` / `.codex` / `.opencode`). |
| Grader can't import its package | Check `.coral/private/grader_venv/` — `setup_grader_env` ran `uv venv` + `grader.setup` once at run start. Re-running `coral start` on the same `run_dir` does NOT re-bootstrap; delete the venv to force it. |
| Resume picks up wrong task | `coral resume` reads `.coral/config.yaml` and `.coral/config_dir`. The `latest` symlink at `results/<task-slug>/latest` decides which run is resumed. |

## Inspect a live or finished run

```bash
RUN=$(readlink -f results/<task-slug>/latest)
ls "$RUN/.coral/public/"            # attempts/, notes/, skills/, agents/, logs/, eval_logs/, eval_count
jq . "$RUN/.coral/public/attempts/<hash>.json"   # pretty-print one attempt
ls "$RUN/agents/"                    # one worktree per agent
cat "$RUN/.coral/config.yaml"        # exact config used (post-merge)
```
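
When many attempts have accumulated, opening them one by one with `jq` gets slow. The sketch below tallies attempt files by their `status` field. It assumes each attempt is a JSON object stored as `attempts/<hash>.json` with a top-level `status` key, as the symptom table describes. `summarize_attempts` and the `<missing>` / `<unreadable>` bucket names are hypothetical.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_attempts(run_dir: str) -> Counter:
    """Count attempt files under <run_dir>/.coral/public/attempts/ by status.

    Files that cannot be read or parsed are counted under "<unreadable>";
    files without a `status` key are counted under "<missing>".
    """
    attempts = Path(run_dir) / ".coral" / "public" / "attempts"
    counts: Counter = Counter()
    for path in sorted(attempts.glob("*.json")):
        try:
            data = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, ValueError):  # JSONDecodeError is a ValueError
            counts["<unreadable>"] += 1
            continue
        if isinstance(data, dict) and "status" in data:
            counts[str(data["status"])] += 1
        else:
            counts["<missing>"] += 1
    return counts
```

Calling `summarize_attempts(run_dir)` with the path held in `$RUN` returns a `Counter` keyed by whatever status strings the run actually wrote. A non-zero `<unreadable>` count may point at a partially written file.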

The web dashboard (`coral ui`) reads the same files; if you're debugging dashboard rendering, `coral/web/api.py` is the route handler and `coral/web/static/` is the built React bundle.
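
For the "Grader hangs / pending attempts pile up" symptom, the two liveness checks (is the PID alive, how old is the heartbeat file) can be combined into one call. This is a sketch under two assumptions: that `grader_daemon.pid` contains a decimal PID as text, and that the daemon refreshes the mtime of `grader_daemon_heartbeat` while it is healthy. The function names are hypothetical, and the PID check is POSIX-only.

```python
import os
import time
from pathlib import Path


def pid_is_alive(pid: int) -> bool:
    """Return True if a process with this PID exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence and permission
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def grader_daemon_health(run_dir: str) -> dict:
    """Report the daemon PID, whether it is alive, and the heartbeat age."""
    public = Path(run_dir) / ".coral" / "public"
    result = {"pid": None, "alive": False, "heartbeat_age_s": None}
    try:
        # Assumption: the PID file holds a single integer.
        result["pid"] = int((public / "grader_daemon.pid").read_text().strip())
        result["alive"] = pid_is_alive(result["pid"])
    except (OSError, ValueError):
        pass  # missing or malformed PID file: leave the defaults
    try:
        mtime = (public / "grader_daemon_heartbeat").stat().st_mtime
        result["heartbeat_age_s"] = time.time() - mtime
    except OSError:
        pass  # no heartbeat file yet
    return result
```

A live PID combined with a heartbeat age that keeps growing suggests a wedged daemon rather than a dead one, in which case `.coral/public/eval_logs/<hash>/` is the next place to look.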

## Test + lint

```bash
uv sync --extra dev                  # one-time, gets pytest + ruff
uv run pytest tests/ -v              # full suite (~seconds, no docker)
uv run pytest tests/test_<thing>.py -v -k <pattern>
uv run ruff check .                  # lint
uv run ruff format .                 # autoformat
```

There is no separate type-check step in CI yet; tests cover the contract.

## Quick task scaffold for ad-hoc testing

When you need a throwaway task to exercise a code change:
```bash
uv run coral init /tmp/coral-scratch
# edit /tmp/coral-scratch/grader/src/coral_scratch_grader/grader.py to return a simple float
uv run coral validate /tmp/coral-scratch
uv run coral start -c /tmp/coral-scratch/task.yaml agents.count=1 run.session=local
```

`coral validate` runs the grader against `seed/` in a tempdir without spawning agents — the fastest way to confirm a grader change compiles and produces a `Score`.

## Conventions

- **No `git` from agents.** All commits go through `coral eval`. If you're adding a feature that needs to commit, route it through `coral/hooks/post_commit.py` or `coral/hub/checkpoint.py`.
- **Atomic writes** for any JSON in `.coral/pu

From the same repository

coral-extend (Skill)

Add a new component to the CORAL framework itself — a new agent runtime under `coral/agent/builtin/` (claude_code/codex/cursor_agent style), a new CLI command in `coral/cli/`, a new bundled skill or subagent template under `coral/template/skills/` or `coral/template/agents/`, a new hook in `coral/hooks/`, a new field in `coral/config.py`, or a framework-level extension to the grader stack under `coral/grader/`. NOT for writing a per-task grader or adding an example task — use `coral-new-task` for that. NOT for debugging existing code — use `coral-debug`.

coral-new-task (Skill)

End-to-end recipe for adding a new task under `examples/` — the three pieces that have to line up (`task.yaml`, `seed/`, and `grader/`), what to put in each, the `TaskGrader` API surface, the `coral validate` → smoke-test loop, and the common mistakes (repo_path pointing at the wrong dir, score direction backwards, hidden answer keys leaking into seed/, grader writing to codebase_path which the daemon force-removes, private-vs-public confusion, missing `run()` signature). Use whenever the user wants to add a new CORAL task or port an existing benchmark into CORAL.

deep-research (Skill)

Research the problem domain before coding. Web search for techniques, save raw sources, write structured findings, update the index.

organize-files (Skill)

Organize the shared notes directory when it becomes hard to navigate. Restructure within research/ and experiments/, deduplicate, update index.md.

skill-creator (Skill)

Autonomously create, test, and optimize skills by detecting reusable patterns in your own work. Use when you notice repeated tool sequences, recurring code patterns across attempts, or insights that should be captured as a packaged skill. Also use to benchmark and iterate on existing skills.

coral-quickstart (Skill)

The fast path from zero to a running CORAL experiment — what CORAL is and when to reach for it, installing the `coral` CLI, registering a runtime with `coral setup`, and the `.coral_workspace/` convention for pointing CORAL at code you already have and want optimized. Use this whenever the user asks "what is coral", "should I use coral for this", wants to install or get coral set up, hits a "command not found" for coral or doesn't have it installed yet, or says "use coral to optimize / speed up / improve this code" and you need the end-to-end onboarding from install to a launched run. Hands off to `setting-up-coral` (runtime bindings), `creating-a-coral-task` (grader authoring), and `running-coral-experiments` (operating a run) for depth.

creating-a-coral-task (Skill)

Author a new CORAL task — the three pieces that must line up (`task.yaml`, `seed/`, a packaged `grader/`), the `coral init` → `coral validate` → smoke-test loop, and how to pick a grader pattern (stdout float, test pass-rate, ratio-vs-baseline, multi-metric, or an LLM rubric judge). Use whenever the user wants to create a CORAL task, write or wire a grader, port a benchmark into CORAL, score open-ended outputs (reports/memos) with a judge, or debug a grader that crashes on the seed / ranks the leaderboard backwards / leaks the answer key. Deep references for the TaskGrader API, grader patterns, rubric judges, and the full task.yaml schema live alongside this skill.

running-coral-experiments (Skill)

Run and manage CORAL experiments from the operator side — launch agents with `coral start` (dotlist overrides, model/count, tmux vs local), monitor with `coral status` / `coral log` / `coral show` / the web dashboard, and drive the loop with `coral resume` (inject instructions, fork from an attempt), `coral heartbeat` (tune reflection cadence), and `coral stop`. Use whenever the user wants to start a CORAL run, check on agents, read scores/leaderboard, steer or resume a run, diagnose agents that keep restarting or fail every eval, scale to more agents or islands, or stop a run. Deep references for steering/heartbeat tuning and scaling/troubleshooting live alongside this skill.