Skip to main content
ClaudeWave
Skill1.3k repo starsupdated today

en

The en skill implements a three-layer information model for Anthropic agent skills, loading skill overviews into the system message initially, then retrieving full SKILL.md bodies and documentation on demand via skill_load and skill_select_docs calls. Use this when building agents that need to access reusable workflow packages while minimizing token costs through progressive disclosure rather than upfront injection of all skill content into context.

Install in Claude Code
Copy
git clone --depth 1 https://github.com/trpc-group/trpc-agent-go /tmp/en && cp -r /tmp/en/docs/mkdocs/en ~/.claude/skills/en
Then start a new Claude Code session; the skill loads automatically.

skill.md

# Skill

Agent Skills package reusable workflows as folders with a `SKILL.md`
spec plus optional docs and scripts. During a conversation, the agent
injects a low‑cost “overview” first, then loads the full body/docs only
when actually needed, and runs scripts inside an isolated workspace.

Background references:
- Engineering blog:
  https://www.anthropic.com/engineering/equipping-agents-for-the-real-world-with-agent-skills
- Open Skills repository (structure to emulate):
  https://github.com/anthropics/skills

## Overview

### 🎯 What You Get

- 🔎 Overview injection (name + description) to guide selection
- 📥 `skill_load` to pull `SKILL.md` body and selected docs on demand
- 📚 `skill_select_docs` to add/replace/clear docs
- 🧾 `skill_list_docs` to list available docs
- 🧪 Execution: `skill_load` materializes a writable skill working copy
  under `/skills/<name>/`, and scripts are executed via `workspace_exec`
  (available whenever a code executor is configured)
- 🗂️ Output file collection via glob patterns with MIME detection
- 🧩 Pluggable local or container workspace executors (local by default)

### Three‑Layer Information Model

1) Initial “overview” (very low cost)
   - Inject only `name` and `description` from `SKILL.md` into the
     system message so the model knows what skills exist.

2) Full body (on demand)
   - When a task truly needs a skill, the model calls `skill_load`. The
     framework then materializes that skill’s full `SKILL.md` body into
     the next model request (see “Prompt Cache” below).

3) Docs/Scripts (selective + isolated execution)
   - Docs are included only when requested (via `skill_load` or
     `skill_select_docs`). Scripts are not inlined; they are executed
     inside a workspace, returning results and output files.

### Token Cost

If you inline a full skills repo (all `SKILL.md` bodies and docs) into
the prompt up-front, it can dominate your prompt-token budget and even
exceed the model context window.

For a reproducible, **runtime** token comparison (progressive disclosure
vs full injection), see [trpc-agent-go-benchmark/anthropic_skills/README.md](https://github.com/trpc-group/trpc-agent-go-benchmark/blob/main/anthropic_skills/README.md) and run
the `token-report` suite described there.

### Prompt Cache

Some model providers support **prompt caching**: if a later model request
starts with the exact same tokens as an earlier request, the provider
can reuse that shared **prefix** from cache. This reduces work and can
lower latency and/or input token cost (provider-dependent).

For Skills, *where* the loaded `SKILL.md` body/docs land in the message
sequence affects how long the shared prefix is:

- Legacy behavior (default): loaded skill bodies/docs are appended to the
  **system message**.
  - This inserts new tokens **before** the user message and history,
    which can shorten the shared prefix between consecutive model calls.
- Tool-result materialization (optional): loaded skill bodies/docs are
  appended to the matching **tool result** messages (`skill_load` /
  `skill_select_docs`).
  - This keeps the system message stable, so earlier messages are less
    likely to shift, and prompt caching can often reuse a longer prefix.

Fallback: if the matching tool result message is not present in the
request history (for example, history suppression), the framework can
fall back to a dedicated system message so the model still sees the
loaded content.

Session summary note: if you enable session summary injection
(`WithAddSessionSummary(true)`) and a summary is present, the framework
prefers to **skip** this fallback to avoid re-inflating the prompt with
summarized content. If the matching tool result messages are still
available, the fallback stays suppressed; if summary compaction removes
them, the fallback is re-enabled so the model still sees the full
body/docs.

Enable tool-result materialization with:
`llmagent.WithSkillsLoadedContentInToolResults(true)`.
To restore the legacy fallback behavior in summary mode:
`llmagent.WithSkipSkillsFallbackOnSessionSummary(false)`.

To measure the impact in a real tool-using flow, run the
[trpc-agent-go-benchmark/anthropic_skills](https://github.com/trpc-group/trpc-agent-go-benchmark/tree/main/anthropic_skills) `prompt-cache` suite.

How this relates to `SkillLoadMode` (common pitfall):

- The cache-prefix discussion above mostly applies to multiple model
  calls within the same `Runner.Run` (one user message triggers a tool
  loop).
- If you want loaded skill bodies/docs to persist across **multiple
  conversation turns**, set `SkillLoadMode` to `session`. The default
  `turn` mode clears the agent-scoped skill keys
  (`temp:skill:loaded_by_agent:<agent>/<name>` and
  `temp:skill:docs_by_agent:<agent>/<name>`) before the next run starts.
  As a result, even if your history still contains
  the previous `skill_load` tool result (typically a short `loaded:
  <name>` stub), the framework will not materialize the body/docs again.

Practical guidance (especially with
`WithSkillsLoadedContentInToolResults(true)`):

- First, be clear about *which* caching scenario you mean:
  - **Within a single turn** (multiple model calls inside one
    `Runner.Run`): `turn` and `session` behave similarly because both
    keep loaded content available within the run. The more important
    lever is usually *where* you materialize content (system vs tool
    result).
  - **Across turns**: `session` can be more prompt-cache friendly
    because you load once and avoid repeating `skill_load` every turn.
    The trade-off is a larger prompt and the need to manage clearing.
- Rule of thumb:
  - Default to `turn` (least-privilege, smaller prompts, less likely to
    trigger truncation/summary).
  - Use `session` only for a small number of skills you truly need
    across the whole session, and keep docs selection tight.
- Keep docs selection tight (avoid `include_all_docs` when possible).
  Otherwise the prompt can grow quickly