ClaudeWave

compact-memory-implementation

Developer implementation guide for adding compact memory to an Agent — covers fork agent pattern for compaction, trigger strategy, summary format design, and memory restoration in subsequent sessions. Use when a developer asks how to implement compact memory, context compression, or memory persistence in their agent built with Claude Agent SDK or Anthropic API.

Install in Claude Code
git clone --depth 1 https://github.com/simbajigege/book2skills /tmp/compact-memory-implementation && cp -r /tmp/compact-memory-implementation/skills/compact-memory-implementation ~/.claude/skills/compact-memory-implementation
Then open a new Claude Code session; the skill loads automatically.

SKILL.md

# compact-memory-implementation

A developer guide for building compact memory into an Agent: detect when to compress, fork a compactor sub-agent, produce a structured summary, and restore it in the next session.

## Step 1 — Understand the setup

Before designing anything, clarify:

- **SDK / language**: Claude Agent SDK? Direct Anthropic API? Python or TypeScript?
- **Agent architecture**: single-agent loop, multi-agent, tool-calling?
- **Session model**: one long-running session or multiple short sessions?
- **What must survive compaction**: task state, decisions, tool results, conversation history?

The answers determine which pattern fits.

---

## Step 2 — When to trigger compact

Three strategies; pick one based on your session model:

**1. Token threshold** (recommended)
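Check `usage.input_tokens` from the previous response. When it exceeds ~70–80% of your model's context limit, trigger compact. If you use prompt caching, also add `usage.cache_creation_input_tokens` and `usage.cache_read_input_tokens`, because `input_tokens` alone does not count cached tokens.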

```python
COMPACT_THRESHOLD = 150_000  # adjust per model

if response.usage.input_tokens > COMPACT_THRESHOLD:
    compact = compact_memory(history)
    history = []  # reset — compact moves to system prompt
```

**2. Turn count**
Compact every N turns. Simpler but less adaptive — misses sessions with a few very long turns.

```python
COMPACT_EVERY_N = 30

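# Guard against turn 0, which would otherwise trigger a compact on an empty history.
if turn_count > 0 and turn_count % COMPACT_EVERY_N == 0:
    compact = compact_memory(history)
    history = []  # reset, as in the token-threshold example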
```

**3. Phase boundary**
Compact at natural task boundaries (after research, before implementation). Requires the agent to detect phases. Produces summaries that align with meaningful milestones, but harder to implement reliably.
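
One way to make phase boundaries detectable is to have the agent report its phase explicitly and compact whenever the reported value changes. A minimal sketch, assuming a hypothetical `PhaseTracker` helper and illustrative phase names (`research`, `implementation`) that are not part of any SDK:

```python
from typing import Optional


class PhaseTracker:
    """Tracks the agent's self-reported phase (hypothetical helper, not an SDK class)."""

    def __init__(self) -> None:
        self.current: Optional[str] = None

    def update(self, phase: str) -> bool:
        """Record the reported phase; return True when a boundary was crossed."""
        crossed = self.current is not None and phase != self.current
        self.current = phase
        return crossed


tracker = PhaseTracker()
events = [tracker.update(p) for p in ["research", "research", "implementation"]]
print(events)  # [False, False, True]
```

How the agent reports its phase (a dedicated tool call, a tagged line in its reply) is left open here; that reporting step is where most of the reliability risk sits.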

**Recommended default**: token threshold at 70%, with turn-count fallback at N=40.
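
The recommended default can be sketched as a single check. The constants and the `should_compact` and `total_input_tokens` names are illustrative assumptions; the 200K window in particular should be replaced with your model's actual limit. The three usage field names are those returned by the Anthropic Messages API.

```python
CONTEXT_LIMIT = 200_000   # assumed context window; check your model's limit
TOKEN_PERCENT = 70        # compact at 70% of the window
TURN_FALLBACK = 40        # compact after this many turns regardless of size


def total_input_tokens(usage) -> int:
    """Sum uncached and cached input tokens from a response's usage object."""
    fields = (
        "input_tokens",
        "cache_creation_input_tokens",
        "cache_read_input_tokens",
    )
    # Cache fields may be missing or None when prompt caching is not in use.
    return sum(getattr(usage, name, 0) or 0 for name in fields)


def should_compact(input_tokens: int, turns_since_compact: int) -> bool:
    """Token threshold first, turn count as a fallback."""
    if input_tokens >= CONTEXT_LIMIT * TOKEN_PERCENT // 100:
        return True
    return turns_since_compact >= TURN_FALLBACK
```

Counting turns since the last compact, rather than total turns, keeps the fallback from firing immediately after a token-triggered compact.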

---

## Step 3 — Fork agent for compaction

The compactor is a **separate agent call** whose only job is to read the current state and return a structured summary. Fork it synchronously — the main agent waits for the result before continuing.

```python
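import json

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment


def compact_memory(history: list[dict]) -> dict:
    response = client.messages.create(
        model="claude-haiku-4-5-20251001",  # cheaper model is fine for compaction
        max_tokens=4096,
        system=COMPACTOR_SYSTEM_PROMPT,
        messages=[
            {
                "role": "user",
                "content": format_history_for_compact(history),
            }
        ],
    )
    # Assumes the reply is bare JSON; json.loads raises if the model adds prose.
    return json.loads(response.content[0].text)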
```

**Why fork instead of self-compact:**
- The main agent may have drifted in focus; the compactor starts fresh with the full picture
- Compaction is a different cognitive task — summarizing vs. executing
- A cheaper, smaller model (Haiku) can do compaction; save the expensive model for main work
- Clean separation makes the compact output easier to validate and test
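
The `json.loads` call in `compact_memory` assumes the compactor returns bare JSON. Models sometimes wrap JSON in a Markdown code fence despite instructions, so a slightly more tolerant parser may help. This is a sketch; `parse_compact_json` is a hypothetical helper name, and it requires Python 3.9+ for `str.removeprefix`.

```python
import json

FENCE = "`" * 3  # a Markdown code fence, built this way to keep the example readable


def parse_compact_json(text: str) -> dict:
    """Parse the compactor reply, tolerating an optional Markdown code fence."""
    body = text.strip()
    if body.startswith(FENCE) and body.endswith(FENCE):
        # Drop the fences and an optional "json" language tag.
        body = body[len(FENCE):-len(FENCE)].removeprefix("json")
    result = json.loads(body)  # raises json.JSONDecodeError on malformed output
    if not isinstance(result, dict):
        raise ValueError("compactor must return a JSON object")
    return result
```

On a parse failure, keeping the uncompacted history and retrying is safer than resetting it, since a reset without a valid compact loses the session state.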

---

## Step 4 — How to compact: format and prompt

### Compact output schema

```json
{
  "task": "What the agent is working on and why — the goal, not the steps",
  "current_state": "Exact status at compaction point: what is done, what is not, what is in progress",
  "key_decisions": [
    { "decision": "...", "reason": "...", "constraint": "..." }
  ],
  "eliminated_approaches": [
    { "approach": "...", "reason_ruled_out": "..." }
  ],
  "open_questions": ["..."],
  "next_steps": ["..."],
  "relevant_tool_results": {
    "key": "Only results future steps will need — summarized, not raw dumps"
  },
  "compacted_at_turn": 42
}
```
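
Because the compactor output is plain JSON, it can be checked against the schema before the history is discarded. A minimal sketch that checks only top-level field presence and type; `validate_compact` and `REQUIRED_FIELDS` are hypothetical names, and a schema library could replace this hand-rolled check.

```python
REQUIRED_FIELDS = {
    "task": str,
    "current_state": str,
    "key_decisions": list,
    "eliminated_approaches": list,
    "open_questions": list,
    "next_steps": list,
    "relevant_tool_results": dict,
    "compacted_at_turn": int,
}


def validate_compact(compact: dict) -> list[str]:
    """Return a list of problems; an empty list means the compact looks usable."""
    problems = []
    for field, expected in REQUIRED_FIELDS.items():
        if field not in compact:
            problems.append(f"missing field: {field}")
        elif not isinstance(compact[field], expected):
            problems.append(f"{field} should be {expected.__name__}")
    return problems
```

Note that the compactor can only fill `compacted_at_turn` correctly if the turn number appears in its input; setting that field in your own code after parsing is a reasonable alternative.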

### Compactor system prompt

```
You are a conversation compactor. Read the provided conversation and produce a JSON summary that captures everything a fresh agent needs to continue the work without asking what happened.

Include:
- Current task and goal (not the steps taken to get here)
- Exact current state — what is done and what is not
- Decisions made and WHY (reasoning, not just the choice)
- Approaches tried and ruled out with reasons (prevents re-exploration)
- Open questions and blockers
- Concrete next steps in priority order
- Tool results that future steps will need (summarize, don't dump raw output)

Omit:
- Intermediate reasoning that led nowhere
- Completed sub-tasks with no future relevance
- Raw tool output that has already been acted on
- Anything derivable by reading the code or running a command

Output valid JSON matching the schema provided. No prose outside the JSON.
```

### Format history for compactor

```python
import json


def format_history_for_compact(history: list[dict]) -> str:
    lines = ["Conversation to compact:\n"]
    for msg in history:
        role = msg["role"].upper()
        content = msg["content"]
        if not isinstance(content, str):
            # Serialize content blocks so tool calls and results stay visible.
            content = json.dumps(content, default=str)
        lines.append(f"[{role}]: {content[:2000]}")  # cap very long messages
    return "\n".join(lines)
```
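
Capping each message with a prefix slice drops the end of long messages, which is often where a tool result or conclusion sits. One alternative is to keep the head and the tail and cut the middle. A sketch, with `truncate_middle` as a hypothetical helper that could stand in for the `[:2000]` slice:

```python
def truncate_middle(text: str, limit: int = 2000, marker: str = " [...] ") -> str:
    """Shorten text to at most `limit` characters, keeping its start and end."""
    if len(text) <= limit:
        return text
    keep = limit - len(marker)
    if keep <= 0:
        return text[:limit]  # limit too small to fit the marker
    head = (keep + 1) // 2
    tail = keep - head
    # text[-0:] would return the whole string, so handle tail == 0 explicitly.
    return text[:head] + marker + (text[-tail:] if tail else "")
```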

---

## Step 5 — How to use after compacting: memory restoration

The compact object becomes the "memory" for the next turn or session. Inject it into the system prompt so it's always visible to the agent.
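
To reach a later session, the compact has to be stored somewhere durable between runs. A minimal sketch, assuming a local JSON file is acceptable storage; `save_compact`, `load_compact`, and the file location are hypothetical choices, not part of the SDK.

```python
import json
from pathlib import Path
from typing import Optional


def save_compact(compact: dict, path: Path) -> None:
    """Write the compact to disk via a temp file so a crash cannot leave a partial file."""
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(compact, indent=2), encoding="utf-8")
    tmp.replace(path)  # rename over the target once the write has finished


def load_compact(path: Path) -> Optional[dict]:
    """Return the stored compact, or None when no previous session saved one."""
    if not path.exists():
        return None
    return json.loads(path.read_text(encoding="utf-8"))
```

At session start, the result of `load_compact` can be passed wherever the compact is consumed; a `None` result means the agent starts without restored memory.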

### Pattern A — System prompt injection (recommended)

```python
MEMORY_BLOCK_TEMPLATE = """
## Restored memory (compacted at turn {turn})

**Task**: {task}

**Current state**: {current_state}

**Key decisions**:
{decisions}

**Ruled out approaches**:
{eliminated}

**Next steps**:
{next_steps}

Begin from current state above. Do not re-explore eliminated approaches.
"""

def build_system_with_memory(base_system: str, compact: dict | None) -> str:
    if compact is None:
        return base_system
    memory = MEMORY_BLOCK_TEMPLATE.format(
        turn=compact["compacted_at_turn"],
        task=compact["task"],
        current_state=compact["current_state"],
        decisions="\n".join(f"- {d['decision']} (because {d['reason']})"
                            for d in compact["key_decisions"]),
        eliminated="\n".join(f"- {e['approach']}: {e['reason_ruled_out']}"
                             for e in compact["eliminated_approaches"]),
        next_steps="\n".join(f"- {s}" for s in compact["next_steps"]),
    )
    return base_system + "\n\n" + memory
```

### Pattern B — First message inj