Skill1.2k repo starsupdated today
autocontext
Autocontext is a control plane for measuring and analyzing Hermes agent behavior that runs scenarios, inspects curator state, exports training data, and prepares datasets for local MLX or CUDA training through the autoctx CLI. Use it when you need to evaluate agent performance, replay runs, audit skill curation decisions, generate machine-readable status reports, or prepare data for model training without directly modifying Hermes skills.
Install in Claude Code
Copygit clone --depth 1 https://github.com/greyhaven-ai/autocontext /tmp/autocontext && cp -r /tmp/autocontext/skills/autocontext ~/.claude/skills/autocontextThen start a new Claude Code session; the skill loads automatically.
Definition
SKILL.md
# Autocontext ## Overview Autocontext is a control plane for evaluating agent behavior, preserving useful run artifacts, exporting training data, and distilling stable behavior into local runtimes. In Hermes, use this skill when the work calls for measurement, replay, datasets, local MLX/CUDA training, or read-only analysis of Hermes skill curation. Hermes Curator owns Hermes skill mutation. Autocontext should inspect, evaluate, replay, export, and recommend. Do not use Autocontext as a replacement for Hermes Curator, and do not edit Hermes skills directly unless the user explicitly asks for that operation. ## When to Use - You need to run an Autocontext scenario from Hermes and inspect the result. - You need machine-readable status for runs, solved knowledge, or training jobs. - You need to inspect Hermes v0.12 Curator reports, skill usage counters, pinned state, or skill provenance. - You need to export Autocontext knowledge into a reusable package or skill-like artifact. - You need to prepare data for local MLX or CUDA training. - You need to decide whether MCP is useful in a configured environment. Do not use this skill for normal Hermes memory updates, direct skill consolidation, or user-local skill deletion. Those are Hermes Curator responsibilities. ## Integration Surface Order Use the CLI first. The `autoctx` CLI is the default surface because Hermes agents can run it with normal terminal tools, see stdout and stderr, preserve logs, and debug failures without special host configuration. MCP is optional. Use MCP when the environment already has Autocontext MCP configured and the task benefits from typed schemas, constrained invocation, or tool discovery. Do not require MCP just to wrap a command that the CLI already exposes cleanly. Use a native Hermes runtime or OpenAI-compatible gateway when Autocontext is calling Hermes as an agent provider. Use a Hermes plugin emitter only when the user specifically needs high-fidelity live traces beyond read-only import of existing Hermes artifacts. ## CLI Quick Start From a checkout of Autocontext: ```bash cd autocontext uv run autoctx --help ``` Inspect Hermes skill and curator state without modifying Hermes: ```bash uv run autoctx hermes inspect --json ``` For a custom profile or test fixture: ```bash uv run autoctx hermes inspect --home "$HERMES_HOME" --json ``` Install or refresh this skill into a Hermes profile: ```bash uv run autoctx hermes export-skill --output ~/.hermes/skills/autocontext/SKILL.md --json ``` If the file already exists and the user wants to replace it: ```bash uv run autoctx hermes export-skill --output ~/.hermes/skills/autocontext/SKILL.md --force --json ``` ## Running Autocontext From Hermes Use `--json` whenever Hermes needs to parse the result. ```bash RUN_ID="hermes_$(date +%s)" uv run autoctx run --scenario grid_ctf --gens 3 --run-id "$RUN_ID" --json uv run autoctx status "$RUN_ID" --json uv run autoctx replay "$RUN_ID" --generation 1 ``` For a plain-language task: ```bash uv run autoctx solve --description "Improve the support-triage response policy." --gens 3 --json ``` For one-shot judgment or improvement: ```bash uv run autoctx judge --task-prompt "..." --output "..." --rubric "..." --json uv run autoctx improve --task-prompt "..." --rubric "..." --rounds 3 --json ``` ## Hermes Runtime Configuration When Autocontext should call a Hermes-served model through an OpenAI-compatible gateway: ```bash export AUTOCONTEXT_AGENT_PROVIDER=openai-compatible export AUTOCONTEXT_AGENT_BASE_URL=http://localhost:8080/v1 export AUTOCONTEXT_AGENT_API_KEY=no-key export AUTOCONTEXT_AGENT_DEFAULT_MODEL=hermes-3-llama-3.1-8b uv run autoctx solve --description "..." --gens 3 --json ``` Keep provider configuration outside the skill when possible. The user or profile should own secrets, base URLs, and model names. ## Working With Hermes Curator Hermes v0.12 writes Curator reports under `~/.hermes/logs/curator/<timestamp>/run.json` and `REPORT.md`. It tracks skill usage in `~/.hermes/skills/.usage.json`, and protects bundled or hub-installed skills through `.bundled_manifest` and `.hub/lock.json`. Use: ```bash uv run autoctx hermes inspect --json ``` Read the output as an inventory: - `agent_created_skill_count` means Curator-eligible user or agent skills. - `bundled_skill_count` and `hub_skill_count` are upstream-owned skills and should not be pruned by Autocontext. - `pinned_skill_count` identifies skills Curator and agents should not modify. - `curator.latest.counts` summarizes the latest consolidation, pruning, and archive activity. Autocontext can use these signals for reports, datasets, and recommendations. Hermes Curator remains the writer for Hermes skill lifecycle changes. ## Privacy Before Session and Trajectory Ingest Curator decision reports are decision metadata and safe to import without redaction. Session and trajectory imports are different: they contain raw model prompts and responses, which may include secrets, tokens, or content the operator did not intend for external storage. Before recommending or running `autoctx hermes ingest-sessions` or `autoctx hermes ingest-trajectories`, explain the privacy tradeoff: the importer is read-only against `~/.hermes`, but the output JSONL contains the same content unless redaction is applied. Default is `--redact standard` (Anthropic/OpenAI keys, bearer tokens, emails, IPs, env values, paths, high-risk file refs). `--redact strict` adds user-defined regexes. `--redact off` writes raw content and the importer surfaces an explicit opt-in marker. Sessions in particular live in a SQLite store: an unwarranted ingest creates a new copy of every prompt and response. Prefer `--dry-run` first when the operator is unsure of the blast radius. ## Training Path For Autocontext-owned runs, export training data and train locally: ```bash uv run autoctx export-training-data --scenario grid_ctf --all-runs --output training/grid_ctf.jsonl uv run auto