Skill · 854 repo stars · updated 2d ago

coral-extend

coral-extend is a Claude Code skill for adding new components to the CORAL framework itself, such as agent runtimes under `coral/agent/builtin/`, CLI commands in `coral/cli/`, bundled skill templates, hooks, configuration fields, or grader infrastructure extensions. Use this skill when modifying the CORAL package architecture rather than debugging existing code or creating new example tasks.

View source · Repository: CORAL

Install in Claude Code

git clone --depth 1 https://github.com/Human-Agent-Society/CORAL /tmp/coral-extend && cp -r /tmp/coral-extend/.claude/skills/coral-extend ~/.claude/skills/coral-extend

Then open a new Claude Code session; the skill loads automatically.

Definition

SKILL.md

# Extending the CORAL framework

For day-to-day debug / reproduce loops see the sibling `coral-debug` skill. For creating a new `examples/<task>/` (seed + task.yaml + grader package) see `coral-new-task`. This skill covers *adding new components to the CORAL package itself*.

## Extending the grader infrastructure

If you're writing a grader for a specific task, use `coral-new-task`. This section is only for changes to the grader **framework** under `coral/grader/`:

- New helpers on `TaskGrader` (`coral/grader/task_grader.py`) — make sure they're useful to multiple existing example graders before adding.
- New `GraderInterface` implementations (`coral/grader/protocol.py` / `base.py`) — the bar is high; the existing protocol covers everything we currently need.
- Daemon-side changes (`coral/grader/daemon.py`) — concurrency, queue caps, worktree isolation, retry policy. Cover with `tests/test_grader_daemon.py`.
- Built-in graders under `coral/grader/builtin/`: `function_grader.py` is the only one today, and it is not wired through `task.yaml`. New built-ins should justify why a `TaskGrader` subclass per task isn't enough.

## A new agent runtime

Adding a new runtime (e.g. another coding-agent CLI) means a runtime module, a registry entry, and a smoke test, plus per-agent config plumbing if the runtime needs it.

1. Create `coral/agent/builtin/<name>.py` and subclass `AgentRuntime` (`coral/agent/runtime.py`). Existing runtimes are the canonical reference — `claude_code.py` is the most complete; `codex.py` and `cursor_agent.py` are smaller and easier to mimic.
2. Register the runtime in `coral/agent/registry.py`:
   ```python
   _RUNTIMES["my_runtime"] = MyRuntime
   _ALIASES["mine"] = "my_runtime"
   _DEFAULT_MODELS["my_runtime"] = "default-model-id"
   ```
3. Decide the runtime's native shared-state directory name (`.claude` for Claude Code, `.codex` for Codex, etc.). The worktree symlink uses this; pass it through `shared_dir` so `generate_coral_md(...)` renders the right paths.
4. If the runtime needs special config plumbing (e.g. `cursor_agent.json`, `opencode.json`, gateway port), follow the `opencode` pattern: emit a per-agent config file inside the worktree at startup.
5. Add a smoke test in `tests/test_<runtime>.py` modeled on `tests/test_cursor_agent.py`.

Recent additions to use as references: PR #79 (`cursor_agent`), commit `f6f266e` (codex `web_search` config fix).
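
The registration in step 2 can be sketched in isolation as follows. This is a minimal sketch, not CORAL's actual code: the `AgentRuntime` stub, the `shared_dir` class attribute, and the `resolve_runtime` helper are assumptions made for illustration, and the three dictionaries are local copies of the names shown in step 2.

```python
from __future__ import annotations


class AgentRuntime:
    """Stand-in for coral.agent.runtime.AgentRuntime (real interface not shown)."""

    shared_dir = ".agent"  # hypothetical attribute holding the shared-state dir name


class MyRuntime(AgentRuntime):
    """Hypothetical runtime for an imaginary coding-agent CLI."""

    shared_dir = ".mine"


# Local copies of the three registry tables named in step 2.
_RUNTIMES: dict[str, type[AgentRuntime]] = {}
_ALIASES: dict[str, str] = {}
_DEFAULT_MODELS: dict[str, str] = {}

_RUNTIMES["my_runtime"] = MyRuntime
_ALIASES["mine"] = "my_runtime"
_DEFAULT_MODELS["my_runtime"] = "default-model-id"


def resolve_runtime(name: str) -> tuple[type[AgentRuntime], str]:
    """Hypothetical lookup: follow an alias, then return (class, default model)."""
    canonical = _ALIASES.get(name, name)
    if canonical not in _RUNTIMES:
        known = ", ".join(sorted(_RUNTIMES))
        raise KeyError(f"unknown runtime {name!r}; known: {known}")
    return _RUNTIMES[canonical], _DEFAULT_MODELS[canonical]
```

Registering the alias and the default model alongside the class keeps every lookup path working for the new runtime, whichever name a user types.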

## A new CLI command

The CLI is a single-file argparse dispatcher.

1. Add a parser block in `coral/cli/__init__.py::main()`. Match the existing style — `_HelpOnErrorParser`, an epilog with `Examples:`, `_CommandHelpFormatter`. Add the new command name to `_VISIBLE_COMMANDS` so "did you mean?" suggestions work.
2. Implement `cmd_<name>(args: argparse.Namespace) -> None` in the most-fitting module under `coral/cli/`:
   - `start.py` — agent lifecycle (start/resume/stop/status)
   - `query.py` — read-only inspection (log/show/notes/skills/runs)
   - `eval.py` — agent-side commands that mutate the worktree (eval/wait/diff/revert/checkout)
   - `heartbeat.py` — heartbeat configuration
   - `ui.py` — dashboard
   - `author.py` — `init` / `validate`
   Create a new module if none of those fit; keep imports lazy so `coral --help` stays fast.
3. Wire the function into the `commands = {...}` dict at the bottom of `main()`.
4. If your command operates on a specific run, accept `--task` / `--run` via `_add_run_args(parser)` and resolve with `coral.cli._helpers.find_coral_dir`.
5. Add an example to `CLAUDE.md`'s Commands section.
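
The shape of steps 1 to 3 can be sketched with plain `argparse`. This is an illustrative stand-in only: the `demo` program, the `greet` command, and the `_lazy_cmd_greet` wrapper are hypothetical, and CORAL's own `_HelpOnErrorParser`, `_CommandHelpFormatter`, and `_add_run_args` are replaced here by standard-library equivalents.

```python
import argparse


def cmd_greet(args: argparse.Namespace) -> None:
    """Hypothetical command body; a real one would live in a module under coral/cli/."""
    print(f"hello from run {args.run}")


def _lazy_cmd_greet(args: argparse.Namespace) -> None:
    # In a real package this wrapper would import the command module here,
    # inside the function, so that `--help` never pays for the import.
    cmd_greet(args)


def main(argv=None) -> int:
    parser = argparse.ArgumentParser(prog="demo")
    sub = parser.add_subparsers(dest="command", required=True)

    # Step 1: a parser block with an "Examples:" epilog.
    greet = sub.add_parser(
        "greet",
        help="print a greeting",
        epilog="Examples:\n  demo greet --run latest",
        formatter_class=argparse.RawDescriptionHelpFormatter,
    )
    greet.add_argument("--run", default="latest")

    # Step 3: the name-to-function dispatch table.
    commands = {"greet": _lazy_cmd_greet}

    args = parser.parse_args(argv)
    commands[args.command](args)
    return 0
```

Calling `main(["greet", "--run", "r1"])` dispatches through the dict to `cmd_greet`; adding a command means one parser block, one function, and one dict entry.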

## A new bundled skill or subagent template

These ship inside the package and are seeded into every run's `.coral/public/skills/` (or `agents/`) by `coral/workspace/project.py`.

- **Skill** — create `coral/template/skills/<name>/SKILL.md` with frontmatter `name` and `description`. Include `scripts/` and `references/` subdirs as needed; existing examples are `deep-research`, `organize-files`, `skill-creator`.
- **Subagent** — create `coral/template/agents/<name>.md` (single markdown file). Existing examples are `deep-researcher` and `librarian`.
- Add a test in `tests/test_template.py` if the rendering pulls in new template variables.

The seed copy is one-shot per run (`if not dst.exists()`), so iterating on template content during development means deleting `<run_dir>/.coral/public/skills/<name>/` and re-running `coral start`, or just editing the destination directly for that run.
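
The one-shot behaviour described above can be sketched like this. It is a minimal sketch under assumptions: `seed_skill` and its arguments are hypothetical names, and the real logic in `coral/workspace/project.py` may differ in everything except the `if not dst.exists()` guard quoted above.

```python
import shutil
from pathlib import Path


def seed_skill(template_dir: Path, run_dir: Path, name: str) -> bool:
    """Copy one bundled skill into a run unless it is already there.

    Returns True when a copy happened, False when the destination existed.
    """
    src = template_dir / "skills" / name
    dst = run_dir / ".coral" / "public" / "skills" / name
    if not dst.exists():
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copytree(src, dst)
        return True
    # Destination already seeded: later template edits are NOT picked up.
    return False
```

Because the guard checks only for existence, an edited template reaches an existing run only after the destination directory is deleted and the seeding step runs again.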

## A new hook

Right now there's only `coral/hooks/post_commit.py`. If you add another hook:
- Define a clear single entrypoint function (model on `submit_eval`).
- Make it a pure function of `coral_dir` and `agent_id` where possible.
- Write atomically, and only to `.coral/public/`; never write to a worktree from a hook.
- Add coverage to `tests/test_hooks.py`.
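
One common way to get the atomic-write property is to write a temporary file in the same directory and rename it over the target. The sketch below assumes that `coral_dir` points at the run's `.coral` directory; `record_hook_event`, `atomic_write_json`, and the `hook_events/` path are hypothetical names, not part of CORAL.

```python
import json
import os
import tempfile
from pathlib import Path


def atomic_write_json(path: Path, payload: dict) -> None:
    """Write JSON so readers see either the old file or the new one, never a partial."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # The temp file must sit in the same directory so os.replace stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(payload, fh)
            fh.flush()
            os.fsync(fh.fileno())
        os.replace(tmp_name, path)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


def record_hook_event(coral_dir: Path, agent_id: str, event: dict) -> Path:
    """Hypothetical single entrypoint: depends only on its arguments, writes under public/."""
    target = coral_dir / "public" / "hook_events" / f"{agent_id}.json"
    atomic_write_json(target, event)
    return target
```

Keeping the entrypoint a function of `coral_dir` and `agent_id` makes it straightforward to exercise against a temporary directory in a test.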

## Configuration changes

`coral/config.py` is dataclass-based and merged via OmegaConf. When adding a new field:

1. Add it to the right dataclass (`AgentConfig`, `GraderConfig`, ...) with a sensible default.
2. If it deserves runtime validation, add it to the `__post_init__` of that dataclass.
3. Cover the new field in `tests/test_config.py`.
4. Update `examples/<task>/task.yaml` only if the field is task-author facing — internal knobs should stay defaulted.
5. Mention it in `CLAUDE.md` if it changes user-visible behavior; otherwise leave the docs alone (CLAUDE.md describes invariants, not every flag).
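
Steps 1 and 2 can be illustrated with a plain dataclass. This is a sketch under assumptions: the class below is a stand-in rather than the real `AgentConfig`, the `count` and `idle_timeout_s` fields are hypothetical, and the OmegaConf merge is not shown.

```python
from dataclasses import dataclass


@dataclass
class AgentConfig:
    """Illustrative stand-in; the real AgentConfig has its own fields."""

    count: int = 1
    # Hypothetical new field, given a default so existing task.yaml files keep loading.
    idle_timeout_s: int = 600

    def __post_init__(self) -> None:
        # Runtime validation for the new field (step 2).
        if self.idle_timeout_s <= 0:
            raise ValueError(
                f"idle_timeout_s must be positive, got {self.idle_timeout_s}"
            )
```

A test for the new field would then check both the default and the rejection path, for example that `AgentConfig(idle_timeout_s=0)` raises `ValueError`.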

## Don't forget

- **Lint + test before pushing**: `uv run ruff check . && uv run ruff format . && uv run pytest tests/ -v`.
- **Backward compatibility for run dirs.** People resume old runs. Anything that reads from `.coral/public/` must tolerate missing files (return defaults), not crash.
- **No agent-side `git`.** All commits go through `coral eval` → `submit_eval`. Don't add helpers that shell out to git from agent context.
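
The backward-compatibility rule above can be sketched as a reader that falls back to a default when a file is absent. This is a minimal sketch: `load_public_json` and the file name in the usage note are hypothetical, and `coral_dir` is assumed to point at the run's `.coral` directory.

```python
import json
from pathlib import Path


def load_public_json(coral_dir: Path, relative: str, default):
    """Read a JSON file under public/, returning `default` if an old run lacks it."""
    path = coral_dir / "public" / relative
    try:
        return json.loads(path.read_text(encoding="utf-8"))
    except FileNotFoundError:
        # Runs created before this file existed simply get the default.
        return default
```

For example, `load_public_json(coral_dir, "new_feature.json", {})` returns `{}` on a run created before the file was introduced, so resuming that run does not crash.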

From the same repository

coral-debug (Skill)

Verify and debug changes to CORAL itself — smallest reproduce loop per area (grader / daemon / CLI / hooks / manager / workspace / hub / template / config / web), where to look when something breaks (hung graders, agent restart loops, stalled agents, missing heartbeat actions, corrupted shared state, broken worktree symlinks, grader import errors, wrong-task resume), how to inspect a live or finished run under `.coral/public/`, and the canonical lint/test commands. Use when editing code under `coral/` or chasing a CORAL bug, NOT when adding a new task or extending the framework.

coral-new-task (Skill)

End-to-end recipe for adding a new task under `examples/` — the three pieces that have to line up (`task.yaml`, `seed/`, and `grader/`), what to put in each, the `TaskGrader` API surface, the `coral validate` → smoke-test loop, and the common mistakes (repo_path pointing at the wrong dir, score direction backwards, hidden answer keys leaking into seed/, grader writing to codebase_path which the daemon force-removes, private-vs-public confusion, missing `run()` signature). Use whenever the user wants to add a new CORAL task or port an existing benchmark into CORAL.

deep-research (Skill)

Research the problem domain before coding. Web search for techniques, save raw sources, write structured findings, update the index.

organize-files (Skill)

Organize the shared notes directory when it becomes hard to navigate. Restructure within research/ and experiments/, deduplicate, update index.md.

skill-creator (Skill)

Autonomously create, test, and optimize skills by detecting reusable patterns in your own work. Use when you notice repeated tool sequences, recurring code patterns across attempts, or insights that should be captured as a packaged skill. Also use to benchmark and iterate on existing skills.

coral-quickstart (Skill)

The fast path from zero to a running CORAL experiment — what CORAL is and when to reach for it, installing the `coral` CLI, registering a runtime with `coral setup`, and the `.coral_workspace/` convention for pointing CORAL at code you already have and want optimized. Use this whenever the user asks "what is coral", "should I use coral for this", wants to install or get coral set up, hits a "command not found" for coral or doesn't have it installed yet, or says "use coral to optimize / speed up / improve this code" and you need the end-to-end onboarding from install to a launched run. Hands off to `setting-up-coral` (runtime bindings), `creating-a-coral-task` (grader authoring), and `running-coral-experiments` (operating a run) for depth.

creating-a-coral-task (Skill)

Author a new CORAL task — the three pieces that must line up (`task.yaml`, `seed/`, a packaged `grader/`), the `coral init` → `coral validate` → smoke-test loop, and how to pick a grader pattern (stdout float, test pass-rate, ratio-vs-baseline, multi-metric, or an LLM rubric judge). Use whenever the user wants to create a CORAL task, write or wire a grader, port a benchmark into CORAL, score open-ended outputs (reports/memos) with a judge, or debug a grader that crashes on the seed / ranks the leaderboard backwards / leaks the answer key. Deep references for the TaskGrader API, grader patterns, rubric judges, and the full task.yaml schema live alongside this skill.

running-coral-experiments (Skill)

Run and manage CORAL experiments from the operator side — launch agents with `coral start` (dotlist overrides, model/count, tmux vs local), monitor with `coral status` / `coral log` / `coral show` / the web dashboard, and drive the loop with `coral resume` (inject instructions, fork from an attempt), `coral heartbeat` (tune reflection cadence), and `coral stop`. Use whenever the user wants to start a CORAL run, check on agents, read scores/leaderboard, steer or resume a run, diagnose agents that keep restarting or fail every eval, scale to more agents or islands, or stop a run. Deep references for steering/heartbeat tuning and scaling/troubleshooting live alongside this skill.