Skip to main content
ClaudeWave
Skill313 estrellas del repoactualizado 2d ago

prompt-cache-agent-harness

# ClaudeWave Definition **prompt-cache-agent-harness** is a local design and inspection tool for structuring long-running Claude agent loops to leverage prompt caching efficiently. Use it when you need to split stable tool definitions, system instructions, and reference context into separate cacheable layers, keep dynamic memory and user-specific details outside the cached prefix to avoid invalidating it, inspect cache metadata from captured runs to compare cold-start versus warm-start behavior, and estimate token-cost impact using supplied Anthropic pricing data.

Instalar en Claude Code
Copiar
git clone --depth 1 https://github.com/Prompthon-IO/agent-systems-handbook /tmp/prompt-cache-agent-harness && cp -r /tmp/prompt-cache-agent-harness/skills/prompt-cache-agent-harness ~/.claude/skills/prompt-cache-agent-harness
Después abre una sesión nueva de Claude Code; el skill carga automáticamente.

SKILL.md

# Prompt Cache Agent Harness

For the student-facing explanation of why this package exists, read
`README.md` first. This file is the invocation contract for Codex.

## Overview

Use this skill when the user is designing or debugging a long-running Claude
agent workflow where prompt caching is supposed to help. The goal is not to
call the Anthropic API directly from the skill. The goal is to keep the harness
local and inspectable:

1. plan a stable prompt spine
2. keep tools, system instructions, and long-lived history in explicit layers
3. keep durable memory and user-specific recall outside the cached prefix
4. read captured run artifacts after the agent loop executes
5. produce a small report on cache reads, cache writes, latency, and optional
   token-cost estimates

## When To Use

Use this skill for requests such as:

- design a prompt-cache layout for a persistent Claude agent
- separate cacheable static context from dynamic memory or retrieval facts
- compare cache reads and writes across cold and warm Claude runs
- estimate input-cost changes from supplied Anthropic pricing values
- explain why a cached prefix is being invalidated

Do not use this skill to promise provider-side savings without run artifacts.
If usage metadata is missing, say that and limit the answer to prompt-layout
review.

## Prompt Layout Rules

Use a layered prompt spine:

- `tools`: stable tool manifest, schemas, and capability boundaries
- `system`: stable role, policies, and operating instructions
- `reference`: stable project or domain context that changes rarely
- `session-summary`: compressed prior state only when it is intended to stay
  stable for many turns
- `dynamic-memory`: retrieved memories, current task facts, and user-specific
  details that should not invalidate the cached prefix
- `turn-input`: the current user request and latest volatile state

Prefer cache breakpoints after stable layers, not after dynamic memory. When
the provider supports explicit cache controls, apply them only to stable
content that can be reused safely.

## Expected Run Artifact

The helper reads one or more JSON or JSONL artifacts with fields like:

```json
{
  "label": "warm-run",
  "latency_ms": 2800,
  "input_tokens": 18000,
  "cache_creation_input_tokens": 1200,
  "cache_read_input_tokens": 14500,
  "output_tokens": 900,
  "stable_layer_hash": "tools-system-reference-v1",
  "dynamic_memory_hash": "retrieval-42",
  "notes": ["user-specific recall was appended after the cache boundary"]
}
```

Useful aliases are accepted for common captured metadata:

- `prompt_tokens` for `input_tokens`
- `cached_tokens` for `cache_read_input_tokens`
- `cache_write_tokens` for `cache_creation_input_tokens`
- `prefix_hash` for `stable_layer_hash`

## Local State And Outputs

Keep raw run artifacts outside the repo by default:

```text
~/.codex/state/prompt-cache-agent-harness/
  inputs/
  reports/
```

Do not commit transcripts, API responses, secrets, or provider logs unless the
user explicitly curates a redacted example for publication.

## Commands

Preview the helper:

```bash
python3 scripts/prompt_cache_report.py --help
```

Generate a Markdown report without pricing:

```bash
python3 scripts/prompt_cache_report.py \
  --input ~/.codex/state/prompt-cache-agent-harness/inputs/run-pair.jsonl \
  --output ~/.codex/state/prompt-cache-agent-harness/reports/cache-report.md
```

Generate a report with user-supplied pricing values:

```bash
python3 scripts/prompt_cache_report.py \
  --input ~/.codex/state/prompt-cache-agent-harness/inputs/run-pair.jsonl \
  --base-input-usd-per-mtok 3.00 \
  --cache-write-usd-per-mtok 3.75 \
  --cache-hit-usd-per-mtok 0.30 \
  --output ~/.codex/state/prompt-cache-agent-harness/reports/cache-report.md
```

## Interpretation Rules

- Treat changes to `stable_layer_hash` as the strongest cache-break signal.
- Treat changes to `dynamic_memory_hash` as expected only when dynamic memory is
  placed after the cache boundary.
- High cache writes with low cache reads usually means the workflow is paying
  setup cost without warm reuse.
- Strong cache reads on the warm run indicate the stable prefix is being reused.
- Keep durable memory separate from prompt caching; durable memory decides what
  to recall, while prompt caching rewards exact reusable prompt prefixes.

## Safety Boundaries

- Do not store sensitive conversation history in the cached prefix by default.
- Do not ask the user to expose Anthropic API keys to generate a report.
- Do not persist cost or trace reports in the repo unless the user asks.
- Do not turn a prompt-cache harness into an uncontrolled autonomous agent
  runner.

## Response Pattern

When reporting back, include:

- which layers should be cacheable
- cold versus warm cache-read share
- cache-write tokens versus cache-read tokens
- whether the stable layer hash changed
- whether dynamic memory appears to be outside the cached prefix
- cost estimates only when pricing values were supplied
customer-email-assistSkill

用 connector-first、最少 token 的方式审阅 Gmail 客户支持线程。适用于 Codex 需要通过 Codex Gmail connector 读取 Gmail、把清洗后的消息导入本地 SQLite 问题队列、先执行确定性的清洗和分类、只把模型调用保留给 JSON-only 的消息理解和草稿字段生成,并支持 dashboard 审阅、客户审批与排队回复处理的时候。

local-customer-email-replySkill

审阅客户来信,以本地政策或 FAQ 文档为依据起草安全的投诉和咨询回复。

agent-runtime-cache-benchmarkSkill

Compare two structured agent-run artifacts to estimate cache efficiency, explain likely cache breaks, and produce a local benchmark report. Use when a user wants to understand whether a prompt layout, tool manifest, or history shape is helping or hurting prompt-cache reuse.

daily-news-watcherSkill

Persistent daily news monitoring backed by local SQLite and Markdown reports. Use when a user asks Codex to track named publications, fetch the last N hours of news, summarize recent articles by topic, deduplicate articles across runs, or maintain a personal newsroom that survives across sessions.

garbage-collectorSkill

Scan local cleanup targets, apply readable cleanup rules, produce a preview report, and execute approved cleanup actions with logs. Use when a user asks Codex to clean up their computer, empty old Trash items, find duplicated Downloads files, review local storage clutter, or propose safe file cleanup actions before making changes.

local-document-organizerSkill

Preview-first local file organizer. Scan a user-named folder, classify files into category subfolders using readable rules, write a preview Markdown report and JSON plan, execute confirmed moves with persistent SQLite state, and reverse moves with undo. Use when a user asks Codex to organize Downloads, sort a messy folder into Invoices/Receipts/School/Images/Software/PDFs subfolders, propose a folder structure before moving anything, or undo a previous organization run.

personal-knowledge-captureSkill

Capture local or explicitly provided web knowledge sources into cited Markdown notes. Use when a user asks Codex to watch a research folder, register local folders for later scans, summarize new or modified local Markdown/TXT/PDF/DOCX files, capture a provided URL, maintain SQLite state for personal knowledge capture, or generate searchable source-grounded daily notes.

price-watcherSkill

Persistent product price tracking for natural-language product requests. Use when a user asks to watch, track, monitor, compare, or report prices for a product, especially with a target price or threshold such as "Watch MacBook Pro M3 14-inch and tell me if it drops below $1200." Supports source discovery, Playwright/browser product checks, SQLite history, threshold comparison, and Markdown price reports.