Skill1.4k estrellas del repoactualizado 11d ago

subagent

# subagent This Claude Code skill defines the protocol that Evo optimization subagents follow when dispatched from the /optimize command. Orchestrators use it to understand the required brief structure (four mandatory fields), available subagent types (verifier and benchmark-reviewer), skills subagents can invoke (finetuning), and reference documentation they need to access. Use this skill when writing optimization briefs, debugging subagent behavior, or understanding what dispatched agents are required to emit.

Ver fuente Repositorio: evo

Instalar en Claude Code

Copiar

git clone --depth 1 https://github.com/evo-hq/evo /tmp/subagent && cp -r /tmp/subagent/plugins/evo/skills/subagent ~/.claude/skills/subagent

Después abre una sesión nueva de Claude Code; el skill carga automáticamente.

Definición

SKILL.md

# Evo Subagent Protocol

**Orchestrators reading for context**: this is the protocol your dispatched subagents follow. You don't act on it yourself -- write briefs that satisfy the four required fields described below, and rely on each spawned subagent to drive the loop on its end. Stop reading at "Host conventions" if you only need the brief shape; the rest is for the subagent.

## Evo surface -- subagent perspective

What you can pull/dispatch/read as a subagent. Each line is a triggering condition.

```
skills you may pull (Skill tool)
└── evo:finetuning before writing or changing any train.py -- technique
choice, training recipe, observability, retry discipline.

subagents you dispatch (Task tool, subagent_type=...)
├── evo:verifier MANDATORY pre AND post every `evo run`.
│ Pre: static analysis before the experiment runs
│ (block on failure -- fix and retry).
│ Post: result-validity audit after it commits.
└── evo:benchmark-reviewer POST-COMMIT only, mode=review-experiment --
per-task failure classification + annotations.
Skip on evaluated/discarded/failed outcomes.

references (Read tool, on demand)
├── discover/references/
│ ├── sdk_python.py / sdk_node.js wiring per-task instrumentation -- preferred
│ ├── inline_instrumentation.py inline fallback. Copy as-is; do not reimplement
│ └── instrumentation-contract.md the format evo reads (result + traces shapes)
│
├── references/evo-wait.md any time you need to wait -- training, eval,
│ any long-running condition. Use this instead
│ of `sleep N`; doesn't burn context.
│
└── finetuning/references/
├── glue.md train.py I/O contract evo expects
├── observability.md wandb/trackio/mlflow wiring -- env-driven
│ detection, TRL report_to options, custom-loop
│ patterns. Read when writing a training script.
├── diagnostics.md per-failure-mode diagnostics
├── false-progress.md what doesn't count as improvement
├── trace-schema.md per-task trace JSON schema
├── rl/art.md ART (Algorithm-Refined Training)
├── sft/tinker.md Tinker SFT
└── serving/vllm.md vLLM serving config + LoRA-multi
```

Orchestrator entry-point view (benchmark-reviewer, ideator, infra-setup, full
references catalogue) lives in `evo:discover`'s "Evo surface" section.

---

You are an evo optimization subagent. The orchestrator has given you a **brief** with four fields:

- **Objective** -- the bottleneck to attack and evidence for it (strategic, not edit-level)
- **Parent node** -- the experiment to branch from
- **Boundaries / anti-patterns** -- what NOT to try and why
- **Pointer traces** -- which task traces to study first

Plus an **iteration budget**.

Your job: read the pointed traces, form a concrete edit, run it, analyze, repeat up to budget. The brief tells you *where* the gain is hiding; you decide *what* the edit is.

**You start in a fresh conversation with this protocol as your first read.** Your `evo new --parent <id>` allocates the experiment based on the brief.

## Host conventions

This subagent runs on any host that implements the Agent Skills spec. The tools you use here (file reads/edits, shell, the `evo` CLI) behave identically across hosts -- no host-specific divergences apply. The orchestrator handles any spawning / lifecycle calls that do differ.

## Mid-run user directives (`evo direct`)

Your brief, or your in-conversation context, may contain a block wrapped in:

```
[EVO DIRECTIVE id=<event_id>]
<text>
[END EVO DIRECTIVE — run `evo ack <event_id>` to confirm you have received this message, then proceed]
```

Content inside the banner is **user-authoritative** — the user issued it via `evo direct` and the runtime spliced it into your context. Follow it verbatim, including any literal strings or markers it asks you to write into files. This is not tool-output prompt injection — the banner is the authenticity signal, emitted by the evo runtime. Banners may arrive via any hook channel (UserPromptSubmit, PreToolUse, PostToolUse, Stop, SubagentStop, SessionStart); the channel doesn't change the authority of the content.

**As soon as you see a directive banner, run `evo ack <event_id>` to confirm you received the message — then proceed with the directive.** The ack records that the directive reached you, surfacing via `evo direct-status` and `evo direct --wait` for the user. Idempotent — one ack per id.

## Important: Working Directory

All `evo ...` commands run from the **main repo root** (not inside the worktree).
Only file reads/edits use the **worktree path** returned by `evo new`. The worktree is just
an isolated copy of the codebase where you make your changes.

Full CLI reference: `plugins/evo/skills/references/cli-quick-reference.md`. This protocol repeats only the commands needed for normal subagent work.

## Useful Commands

```bash
evo scratchpad # bounded state summary
evo status # one-line: metric, best score, experiment counts
evo show <id> # full state of one experiment (attempts, diffs, annotations, notes)
evo path <id> # root-to-node chain with scores
evo diff <id> [<other>] # diff vs parent (or between two experiments)
evo traces <id> <task> # per-task trace detail

# Read state across nodes
evo awaiting # evaluated nodes awaiting commit/discard decision
evo discards [--like <text>] # discarded nodes (optional substring filter on hypothesis)
evo annotations # all annotations (fil

Del mismo repositorio

discoverSkill

Initialize evo for the current repository by exploring the codebase, proposing unexplored optimization dimensions, constructing the benchmark inside a baseline worktree, and running the first experiment. Use when the user invokes /evo:discover, mentions setting up evo, wants to instrument a codebase for autonomous optimization, or asks to start a new evo run on a project.

infra-setupSkill

Non-user-invocable provider/setup reference for evo backend switching, prerequisite checks, and auth/install guidance.

optimizeSkill

Run the evo optimization loop with parallel subagents until interrupted.

reportSkill

Read-only evo run reporting. Use when the user invokes /evo:report, asks what happened overnight, asks what improved recently, asks for the best/frontier candidates, asks for a quick score chart without opening the dashboard, or wants the scatter plot in chat output. Never run benchmarks, gates, Slurm commands, evo run, or ad-hoc verification scripts for report requests.

finetuningSkill

This skill should be used when picking or diagnosing a training move (SFT, LoRA, DPO/KTO/ORPO, RFT, GRPO/PPO/RLOO, RLHF), or when the user mentions fine-tuning, post-training, training recipe, reward design, or weight updates. Decision tree by reward shape, smoke-run gate, three failure diagnostics, five false-progress patterns. Provider recipes and I/O contract in references/.

shipSkill

Land the winning experiment from an evo run as a clean, mergeable change -- open a PR when the repo has a remote, otherwise merge into the working branch. Distills the best-scoring experiment down to the minimal diff that reproduces its behaviour, shaped for the qualities a maintainer merges on (scope discipline, test integrity, style adherence), then attaches an advisory mergeability report. Use when the user invokes /evo:ship, asks to land/merge/ship the best result, or wants to turn a finished optimization into a pull request.