
en

The en skill implements a three-layer information model for Anthropic agent skills, loading skill overviews into the system message initially, then retrieving full SKILL.md bodies and documentation on demand via skill_load and skill_select_docs calls. Use this when building agents that need to access reusable workflow packages while minimizing token costs through progressive disclosure rather than upfront injection of all skill content into context.

View source Repository: trpc-agent-go

Install in Claude Code


git clone --depth 1 https://github.com/trpc-group/trpc-agent-go /tmp/en && cp -r /tmp/en/docs/mkdocs/en ~/.claude/skills/en

Then start a new Claude Code session; the skill loads automatically.

Definition

skill.md

# Skill

Agent Skills package reusable workflows as folders with a `SKILL.md`
spec plus optional docs and scripts. During a conversation, the agent
injects a low‑cost “overview” first, then loads the full body/docs only
when actually needed, and runs scripts inside an isolated workspace.

Background references:
- Engineering blog:
https://www.anthropic.com/engineering/equipping-agents-for-the-real-world-with-agent-skills
- Open Skills repository (structure to emulate):
https://github.com/anthropics/skills

## Overview

### 🎯 What You Get

- 🔎 Overview injection (name + description) to guide selection
- 📥 `skill_load` to pull `SKILL.md` body and selected docs on demand
- 📚 `skill_select_docs` to add/replace/clear docs
- 🧾 `skill_list_docs` to list available docs
- 🧪 Execution: `skill_load` materializes a writable skill working copy
under `/skills/<name>/`, and scripts are executed via `workspace_exec`
(available whenever a code executor is configured)
- 🗂️ Output file collection via glob patterns with MIME detection
- 🧩 Pluggable local or container workspace executors (local by default)

### Three‑Layer Information Model

1) Initial “overview” (very low cost)
- Inject only `name` and `description` from `SKILL.md` into the
system message so the model knows what skills exist.

2) Full body (on demand)
- When a task truly needs a skill, the model calls `skill_load`. The
framework then materializes that skill’s full `SKILL.md` body into
the next model request (see “Prompt Cache” below).

3) Docs/Scripts (selective + isolated execution)
- Docs are included only when requested (via `skill_load` or
`skill_select_docs`). Scripts are not inlined; they are executed
inside a workspace, returning results and output files.
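
As a rough illustration of layer 1, the sketch below pulls only `name` and `description` out of a `SKILL.md` file and leaves the body for a later on-demand load. It assumes a simplified front-matter block delimited by `---` lines with single-line `key: value` pairs; `skillOverview` and `parseOverview` are hypothetical names for this example, not the framework's API.

```go
package main

import (
	"fmt"
	"strings"
)

// skillOverview holds the only two fields injected up front (layer 1).
// Hypothetical type for illustration.
type skillOverview struct {
	Name        string
	Description string
}

// parseOverview reads a simplified front-matter block delimited by "---"
// lines (an assumption about the SKILL.md layout) and returns the overview
// plus the remaining body, which would only be sent after skill_load.
func parseOverview(skillMD string) (skillOverview, string) {
	var ov skillOverview
	lines := strings.Split(skillMD, "\n")
	if strings.TrimSpace(lines[0]) != "---" {
		return ov, skillMD // no front matter: everything is body
	}
	for i := 1; i < len(lines); i++ {
		line := strings.TrimSpace(lines[i])
		if line == "---" {
			body := strings.Join(lines[i+1:], "\n")
			return ov, strings.TrimSpace(body)
		}
		key, val, ok := strings.Cut(line, ":")
		if !ok {
			continue
		}
		switch strings.TrimSpace(key) {
		case "name":
			ov.Name = strings.TrimSpace(val)
		case "description":
			ov.Description = strings.TrimSpace(val)
		}
	}
	return ov, "" // unterminated front matter: no body found
}

// sampleSkillMD returns a made-up SKILL.md used only for this example.
func sampleSkillMD() string {
	return "---\nname: pdf\ndescription: Extract text from PDF files\n---\n# PDF\nRun scripts/extract.sh on the input file.\n"
}

func main() {
	ov, body := parseOverview(sampleSkillMD())
	// Only this short line would go into the system message initially.
	fmt.Printf("- %s: %s\n", ov.Name, ov.Description)
	fmt.Printf("body held back: %d bytes\n", len(body))
}
```

The body stays out of the prompt until the model explicitly asks for it, which is the point of the layered model.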

### Token Cost

If you inline a full skills repo (all `SKILL.md` bodies and docs) into
the prompt up-front, it can dominate your prompt-token budget and even
exceed the model context window.

For a reproducible, **runtime** token comparison (progressive disclosure
vs full injection), see [trpc-agent-go-benchmark/anthropic_skills/README.md](https://github.com/trpc-group/trpc-agent-go-benchmark/blob/main/anthropic_skills/README.md) and run
the `token-report` suite described there.
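
For intuition only, the sketch below compares the two strategies on a made-up skill. It assumes a rough heuristic of about 4 characters per token and invented content sizes; it is not a substitute for the benchmark, and `skill`, `estimateTokens`, `overviewCost`, and `fullInjectionCost` are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
)

// skill is a hypothetical in-memory view of one skill folder.
type skill struct {
	Name        string
	Description string
	Body        string
	Docs        []string
}

// estimateTokens applies a rough ~4 characters per token heuristic
// (an assumption; real tokenizers differ), rounding up.
func estimateTokens(s string) int { return (len(s) + 3) / 4 }

// overviewCost estimates the cost of injecting only name + description.
func overviewCost(skills []skill) int {
	total := 0
	for _, s := range skills {
		total += estimateTokens(s.Name + ": " + s.Description)
	}
	return total
}

// fullInjectionCost estimates the cost of inlining every body and doc.
func fullInjectionCost(skills []skill) int {
	total := overviewCost(skills)
	for _, s := range skills {
		total += estimateTokens(s.Body)
		for _, d := range s.Docs {
			total += estimateTokens(d)
		}
	}
	return total
}

// sampleSkills returns invented data: a 4000-character body and one
// 8000-character doc.
func sampleSkills() []skill {
	return []skill{{
		Name:        "pdf",
		Description: "Extract text from PDF files",
		Body:        strings.Repeat("x", 4000),
		Docs:        []string{strings.Repeat("y", 8000)},
	}}
}

func main() {
	skills := sampleSkills()
	fmt.Println("overview only:", overviewCost(skills))
	fmt.Println("full injection:", fullInjectionCost(skills))
}
```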

### Prompt Cache

Some model providers support **prompt caching**: if a later model request
starts with the exact same tokens as an earlier request, the provider
can reuse that shared **prefix** from cache. This reduces work and can
lower latency and/or input token cost (provider-dependent).

For Skills, *where* the loaded `SKILL.md` body/docs land in the message
sequence affects how long the shared prefix is:

- Legacy behavior (default): loaded skill bodies/docs are appended to the
  **system message**.
  - This inserts new tokens **before** the user message and history,
    which can shorten the shared prefix between consecutive model calls.
- Tool-result materialization (optional): loaded skill bodies/docs are
  appended to the matching **tool result** messages (`skill_load` /
  `skill_select_docs`).
  - This keeps the system message stable, so earlier messages are less
    likely to shift, and prompt caching can often reuse a longer prefix.
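
The effect of the two placements on the shared prefix can be sketched with a toy model. The code below approximates tokens as whitespace-separated words plus a role marker per message (real providers match on actual tokens), and compares the request sent before `skill_load` with the request sent after it. All types, functions, and message contents are hypothetical and only illustrate the idea.

```go
package main

import (
	"fmt"
	"strings"
)

// message is a simplified chat message (role plus text content).
type message struct {
	Role    string
	Content string
}

// flatten converts a request into pseudo-tokens: one role marker per
// message followed by its whitespace-separated words.
func flatten(msgs []message) []string {
	var toks []string
	for _, m := range msgs {
		toks = append(toks, "<"+m.Role+">")
		toks = append(toks, strings.Fields(m.Content)...)
	}
	return toks
}

// sharedPrefixLen counts the leading pseudo-tokens two requests share.
func sharedPrefixLen(a, b []string) int {
	n := 0
	for n < len(a) && n < len(b) && a[n] == b[n] {
		n++
	}
	return n
}

const (
	systemPrompt = "You are a helpful agent."
	userPrompt   = "Summarize report.pdf"
	skillBody    = "[pdf skill body]"
)

// firstRequest is the request sent before any skill is loaded.
func firstRequest() []message {
	return []message{
		{Role: "system", Content: systemPrompt},
		{Role: "user", Content: userPrompt},
	}
}

// secondRequest is the follow-up request after skill_load has run. The
// loaded body lands either in the tool result or in the system message.
func secondRequest(inToolResult bool) []message {
	system, toolResult := systemPrompt, "loaded: pdf"
	if inToolResult {
		toolResult += "\n" + skillBody
	} else {
		system += "\n" + skillBody
	}
	return []message{
		{Role: "system", Content: system},
		{Role: "user", Content: userPrompt},
		{Role: "assistant", Content: "call skill_load(pdf)"},
		{Role: "tool", Content: toolResult},
	}
}

func main() {
	first := flatten(firstRequest())
	legacy := sharedPrefixLen(first, flatten(secondRequest(false)))
	toolRes := sharedPrefixLen(first, flatten(secondRequest(true)))
	fmt.Printf("system-message placement: %d of %d shared\n", legacy, len(first))
	fmt.Printf("tool-result placement: %d of %d shared\n", toolRes, len(first))
}
```

In this toy model the system-message placement stops matching as soon as the appended body begins, while the tool-result placement reuses the entire first request as a prefix.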

Fallback: if the matching tool result message is not present in the
request history (for example, history suppression), the framework can
fall back to a dedicated system message so the model still sees the
loaded content.

Session summary note: if you enable session summary injection
(`WithAddSessionSummary(true)`) and a summary is present, the framework
by default **skips** this fallback so that content already covered by
the summary does not re-inflate the prompt. While the matching tool
result messages are still available, the fallback stays suppressed; if
summary compaction removes them, the fallback is re-enabled so the
model still sees the full body/docs.

Enable tool-result materialization with:
`llmagent.WithSkillsLoadedContentInToolResults(true)`.
To restore the legacy fallback behavior in summary mode:
`llmagent.WithSkipSkillsFallbackOnSessionSummary(false)`.

To measure the impact in a real tool-using flow, run the
[trpc-agent-go-benchmark/anthropic_skills](https://github.com/trpc-group/trpc-agent-go-benchmark/tree/main/anthropic_skills) `prompt-cache` suite.

How this relates to `SkillLoadMode` (common pitfall):

- The cache-prefix discussion above mostly applies to multiple model
calls within the same `Runner.Run` (one user message triggers a tool
loop).
- If you want loaded skill bodies/docs to persist across **multiple
conversation turns**, set `SkillLoadMode` to `session`. The default
`turn` mode clears the agent-scoped skill keys
(`temp:skill:loaded_by_agent:<agent>/<name>` and
`temp:skill:docs_by_agent:<agent>/<name>`) before the next run starts.
As a result, even if your history still contains
the previous `skill_load` tool result (typically a short `loaded:
<name>` stub), the framework will not materialize the body/docs again.
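
A minimal sketch of this pitfall, assuming session state is a flat key/value map: the key prefixes come from the text above, while `beginRun`, `shouldMaterialize`, and `newState` are hypothetical helpers and not the framework's implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// Key prefixes as described above; the rest of the key is <agent>/<name>.
const (
	loadedPrefix = "temp:skill:loaded_by_agent:"
	docsPrefix   = "temp:skill:docs_by_agent:"
)

// beginRun models the start of a new run: in "turn" mode the agent-scoped
// skill keys are cleared, in "session" mode they are kept.
func beginRun(mode string, state map[string]string) {
	if mode != "turn" {
		return
	}
	for k := range state {
		if strings.HasPrefix(k, loadedPrefix) || strings.HasPrefix(k, docsPrefix) {
			delete(state, k)
		}
	}
}

// shouldMaterialize reports whether the body/docs for a skill would be
// expanded again. It depends on state, not on stubs left in history.
func shouldMaterialize(state map[string]string, agent, name string) bool {
	_, ok := state[loadedPrefix+agent+"/"+name]
	return ok
}

// newState returns example state left behind by a previous turn in which
// the "assistant" agent loaded the "pdf" skill with one doc.
func newState() map[string]string {
	return map[string]string{
		loadedPrefix + "assistant/pdf": "1",
		docsPrefix + "assistant/pdf":   "reference.md",
		"user:lang":                    "en", // unrelated key, never cleared
	}
}

func main() {
	for _, mode := range []string{"turn", "session"} {
		state := newState()
		beginRun(mode, state)
		fmt.Printf("%s mode: materialize pdf again? %v\n",
			mode, shouldMaterialize(state, "assistant", "pdf"))
	}
}
```

In `turn` mode the stub in history is not enough to bring the body back, so the model has to call `skill_load` again in the new turn.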

Practical guidance (especially with
`WithSkillsLoadedContentInToolResults(true)`):

- First, be clear about *which* caching scenario you mean:
  - **Within a single turn** (multiple model calls inside one
    `Runner.Run`): `turn` and `session` behave similarly because both
    keep loaded content available within the run. The more important
    lever is usually *where* you materialize content (system vs tool
    result).
  - **Across turns**: `session` can be more prompt-cache friendly
    because you load once and avoid repeating `skill_load` every turn.
    The trade-off is a larger prompt and the need to manage clearing.
- Rule of thumb:
  - Default to `turn` (least-privilege, smaller prompts, less likely to
    trigger truncation/summary).
  - Use `session` only for a small number of skills you truly need
    across the whole session, and keep docs selection tight.
- Keep docs selection tight (avoid `include_all_docs` when possible).
Otherwise the prompt can grow quickly