ClaudeWave
Skill | 894 repo stars | updated 2d ago

context-compression

Context compression is a skill for managing very long agent sessions by strategically summarizing conversation history. The goal is to minimize the total tokens consumed to complete a task, including the cost of re-fetching information lost to compression. Use this skill when agent sessions exceed context windows, when codebases span millions of tokens, when designing conversation summarization strategies, or when creating durable handoff summaries that preserve decisions and file modifications. The skill provides three approaches: anchored iterative summarization for file-tracking sessions, opaque compression for short sessions requiring maximum token savings, and regenerative full summaries for sessions with clear phase boundaries where readability is critical.

Install in Claude Code
git clone --depth 1 https://github.com/guanyang/open-agent-hub /tmp/context-compression && cp -r /tmp/context-compression/skills/context-compression ~/.claude/skills/context-compression
Then start a new Claude Code session; the skill loads automatically.

SKILL.md

# Context Compression Strategies

When agent sessions generate millions of tokens of conversation history, compression becomes mandatory. The naive approach is aggressive compression to minimize tokens per request. The correct optimization target is tokens per task: total tokens consumed to complete a task, including re-fetching costs when compression loses critical information.

## When to Activate

Activate this skill when:
- Agent sessions exceed context window limits
- Codebases exceed context windows (5M+ token systems)
- Designing conversation summarization strategies
- Debugging cases where agents "forget" what files they modified
- Building evaluation frameworks for compression quality
- Creating durable handoff summaries that preserve decisions, files, risks, and next actions

Do not activate this skill for adjacent work owned by other skills:
- General token-efficiency tactics such as masking, prefix caching, or partitioning: `context-optimization`.
- Diagnosing why a long context is failing before choosing a mitigation: `context-degradation`.
- Writing raw outputs, logs, or plans to files without summarizing them: `filesystem-context`.
- Designing long-term semantic memory across sessions: `memory-systems`.

## Core Concepts

Context compression trades token savings against information loss. Select from three production-ready approaches based on session characteristics:

1. **Anchored Iterative Summarization**: Implement this for long-running sessions where file tracking matters. Maintain structured, persistent summaries with explicit sections for session intent, file modifications, decisions, and next steps. When compression triggers, summarize only the newly truncated span and merge it with the existing summary rather than regenerating from scratch. This prevents the drift that accumulates when summaries are regenerated wholesale, because each regeneration risks losing details the model considers low-priority but the task requires. Structure forces preservation: dedicated sections act as checklists the summarizer must populate, which makes silent information loss visible.

2. **Opaque Compression**: Reserve this for short sessions where re-fetching costs are low and maximum token savings are required. It produces compressed representations optimized for reconstruction fidelity, achieving 99%+ compression ratios but sacrificing interpretability entirely. The tradeoff matters: there is no way to verify what was preserved without running probe-based evaluation, so never use this when debugging or artifact tracking is critical.

3. **Regenerative Full Summary**: Use this when summary readability is critical and sessions have clear phase boundaries. It generates detailed structured summaries on each compression trigger. The weakness is cumulative detail loss across repeated cycles — each full regeneration is a fresh pass that may deprioritize details preserved in earlier summaries.
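
The merge step of anchored iterative summarization can be sketched roughly as follows. This is a minimal illustration, not the skill's implementation: the `AnchoredSummary` class, its field names, and the `merge` helper are hypothetical, and in practice `delta` would come from a summarizer call that sees only the newly truncated span.

```python
from dataclasses import dataclass, field


@dataclass
class AnchoredSummary:
    """Persistent summary with one field per mandatory section (names are illustrative)."""
    intent: str = ""
    files_modified: dict[str, list[str]] = field(default_factory=dict)
    decisions: list[str] = field(default_factory=list)
    next_steps: list[str] = field(default_factory=list)


def merge(existing: AnchoredSummary, delta: AnchoredSummary) -> AnchoredSummary:
    """Fold a summary of the newly truncated span into the existing summary.

    Accumulating sections (files, decisions) only grow; state-like sections
    (intent, next steps) are replaced only when the delta supplies a value.
    """
    # Copy the existing file notes so the original summary is not mutated.
    files = {path: list(notes) for path, notes in existing.files_modified.items()}
    for path, notes in delta.files_modified.items():
        bucket = files.setdefault(path, [])
        for note in notes:
            if note not in bucket:
                bucket.append(note)

    decisions = list(existing.decisions)
    for decision in delta.decisions:
        if decision not in decisions:
            decisions.append(decision)

    return AnchoredSummary(
        intent=delta.intent or existing.intent,
        files_modified=files,
        decisions=decisions,
        next_steps=list(delta.next_steps or existing.next_steps),
    )


# Hypothetical data: the existing summary plus a delta covering only the truncated span.
existing = AnchoredSummary(
    intent="Fix auth test failures",
    files_modified={"auth.controller.ts": ["Fixed JWT token generation"]},
    decisions=["Use Redis connection pool"],
    next_steps=["Fix remaining test failures"],
)
delta = AnchoredSummary(
    files_modified={"config/redis.ts": ["Updated connection pooling"]},
    next_steps=["Run full test suite"],
)
merged = merge(existing, delta)
```

Because earlier entries are carried forward instead of being rewritten, a detail recorded in an early cycle cannot be dropped by a later summarization pass unless the merge logic removes it explicitly.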

## Detailed Topics

### Optimize for Tokens-Per-Task, Not Tokens-Per-Request

Measure total tokens consumed from task start to completion, not tokens per individual request. When compression drops file paths, error messages, or decision rationale, the agent must re-explore, re-read files, and re-derive conclusions, which can waste far more tokens than the compression saved. A strategy that saves 0.5% more tokens per request but triggers 20% more re-fetching can cost more overall. Track re-fetching frequency as the primary quality signal: if the agent repeatedly asks to re-read files it already processed, compression is too aggressive.
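
A small worked example makes the arithmetic concrete. The numbers and the `tokens_per_task` helper below are hypothetical, chosen only to illustrate the trade-off. Whether the more aggressive strategy actually loses depends on how large re-fetching is relative to per-request context in a given workload.

```python
def tokens_per_task(requests: int, tokens_per_request: int, refetch_tokens: int) -> int:
    """Total tokens for one task: context sent on every request plus re-fetch overhead."""
    return requests * tokens_per_request + refetch_tokens


# Hypothetical baseline: 50 requests at 10,000 tokens each, 100,000 tokens of re-fetching.
baseline = tokens_per_task(requests=50, tokens_per_request=10_000, refetch_tokens=100_000)

# More aggressive compression: 0.5% fewer tokens per request, 20% more re-fetching.
aggressive = tokens_per_task(requests=50, tokens_per_request=9_950, refetch_tokens=120_000)

print(f"baseline:   {baseline:,} tokens")    # baseline:   600,000 tokens
print(f"aggressive: {aggressive:,} tokens")  # aggressive: 617,500 tokens
```

Under these assumed numbers the per-request saving is 2,500 tokens in total, while the extra re-fetching costs 20,000 tokens, so the strategy that looks cheaper per request is more expensive per task.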

### Solve the Artifact Trail Problem First

Artifact trail integrity is often the weakest dimension in compression evaluations (claim-context-compression-factory-benchmark). Address this proactively because general summarization cannot reliably maintain it.

Preserve these categories explicitly in every compression cycle:
- Which files were created (full paths)
- Which files were modified and what changed (include function names, not just file names)
- Which files were read but not changed
- Specific identifiers: function names, variable names, error messages, error codes

Implement a separate artifact index or explicit file-state tracking in agent scaffolding rather than relying on the summarizer to capture these details. Even structured summarization with dedicated file sections struggles with completeness over long sessions.
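
One way to keep such an index outside the summarizer is to have the agent scaffolding record each file operation as it happens. The sketch below is an assumption about how that could look: `ArtifactIndex`, its method names, and the `createPool()` function in the usage are hypothetical, and the scaffolding is assumed to call the `record_*` methods from its file-tool handlers.

```python
from dataclasses import dataclass, field


@dataclass
class ArtifactIndex:
    """Tracks file state independently of any summarizer output."""
    created: set[str] = field(default_factory=set)
    modified: dict[str, list[str]] = field(default_factory=dict)
    read_only: set[str] = field(default_factory=set)

    def record_create(self, path: str) -> None:
        self.created.add(path)
        self.read_only.discard(path)

    def record_modify(self, path: str, change: str) -> None:
        # `change` should name the function or symbol touched, not just the file.
        self.modified.setdefault(path, []).append(change)
        self.read_only.discard(path)

    def record_read(self, path: str) -> None:
        # A file counts as read-only only while it has not been created or modified.
        if path not in self.created and path not in self.modified:
            self.read_only.add(path)

    def render(self) -> str:
        """Render the index as Markdown sections to append to a compressed summary."""
        lines = ["## Files Created"]
        lines += [f"- {path}" for path in sorted(self.created)]
        lines.append("## Files Modified")
        for path in sorted(self.modified):
            lines.append(f"- {path}: {'; '.join(self.modified[path])}")
        lines.append("## Files Read (Unchanged)")
        lines += [f"- {path}" for path in sorted(self.read_only)]
        return "\n".join(lines)


# Hypothetical session events.
index = ArtifactIndex()
index.record_read("config/redis.ts")
index.record_read("README.md")
index.record_modify("config/redis.ts", "createPool(): raised pool size")
index.record_create("tests/auth.test.ts")
index.record_read("tests/auth.test.ts")
artifact_section = index.render()
```

Because the index is built from tool calls rather than from model output, it stays complete regardless of what the summarizer chooses to keep.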

### Structure Summaries with Mandatory Sections

Build structured summaries with explicit sections that prevent silent information loss. Each section acts as a checklist the summarizer must populate, making omissions visible rather than silent.

```markdown
## Session Intent
[What the user is trying to accomplish]

## Files Modified
- auth.controller.ts: Fixed JWT token generation
- config/redis.ts: Updated connection pooling
- tests/auth.test.ts: Added mock setup for new config

## Decisions Made
- Using Redis connection pool instead of per-request connections
- Retry logic with exponential backoff for transient failures

## Current State
- 14 tests passing, 2 failing
- Remaining: mock setup for session service tests

## Next Steps
1. Fix remaining test failures
2. Run full test suite
3. Update documentation
```

Adapt sections to the agent's domain. A debugging agent needs "Root Cause" and "Error Messages"; a migration agent needs "Source Schema" and "Target Schema." The structure matters more than the specific sections — any explicit schema outperforms freeform summarization.
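
The checklist idea can be enforced mechanically by validating each generated summary before it replaces the old context. The following is a minimal sketch under stated assumptions: the `missing_sections` helper is hypothetical, the required section names are taken from the template above, and summaries are assumed to use level-2 Markdown headings.

```python
import re

REQUIRED_SECTIONS = (
    "Session Intent",
    "Files Modified",
    "Decisions Made",
    "Current State",
    "Next Steps",
)


def missing_sections(summary: str, required=REQUIRED_SECTIONS) -> list[str]:
    """Return required sections that are absent from the summary or left empty."""
    # Splitting on level-2 headings yields [preamble, heading, body, heading, body, ...].
    parts = re.split(r"^##[ \t]+(.+?)[ \t]*$", summary, flags=re.MULTILINE)
    bodies = {
        parts[i].strip(): parts[i + 1].strip()
        for i in range(1, len(parts) - 1, 2)
    }
    return [name for name in required if not bodies.get(name)]


# Hypothetical summarizer output with one empty and one absent section.
draft = (
    "## Session Intent\nFix auth test failures\n\n"
    "## Files Modified\n- auth.controller.ts: Fixed JWT token generation\n\n"
    "## Decisions Made\n\n"
    "## Next Steps\n1. Run full test suite\n"
)
problems = missing_sections(draft)
print(problems)  # ['Decisions Made', 'Current State']
```

A non-empty result can be used to reject the draft and re-prompt the summarizer for the named sections, which turns a silent omission into an explicit failure.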

### Choose Compression Triggers Strategically

When to trigger compression matters as much as how to compress. Select a trigger strategy based on session predictability:

| Strategy | Trigger Point | Trade-off |
|----------|---------------|-----------|
| Fixed threshold | 70-80% context utilization | Simple but may compress too early |
| Sliding window | Keep last N turns + summary | Predictable context size |
| Importance-based | Compress low-relevance sections first | Complex but preserves signal |