token-efficiency
The token-efficiency Claude Code skill reduces output token consumption by 40-60% through anti-sycophancy rules that eliminate verbose openings and closings, tool-call budgets that cap iterations by task type, one-pass coding discipline that completes solutions without incremental refinement, and task-specific profiles for coding versus agent work. Use this skill when sessions feel expensive, output contains unnecessary filler text, or Claude is repeatedly re-reading files and iterating unnecessarily.
git clone --depth 1 https://github.com/rohitg00/pro-workflow /tmp/token-efficiency && cp -r /tmp/token-efficiency/skills/token-efficiency ~/.claude/skills/token-efficiencySKILL.md
# Token Efficiency Reduce output token waste and prevent iteration cycles that consume context. ## Trigger Use when: - Sessions feel expensive or slow - Output is verbose with filler text - Claude is re-reading files or iterating unnecessarily - Setting up a new project for token-efficient work ## Anti-Sycophancy Rules These patterns waste 30-60% of output tokens: | Pattern | Example | Fix | |---------|---------|-----| | Sycophantic opener | "Sure! Great question!" | Delete. Lead with answer. | | Prompt restatement | "You're asking about X..." | Delete. Answer directly. | | Closing fluff | "Let me know if you need anything!" | Delete. Stop after the answer. | | Unsolicited suggestions | "You might also want to..." | Delete unless asked. | | AI disclaimers | "As an AI model..." | Delete entirely. | | Verbose preambles | "I'll help you with that..." | Delete. Start with the action. | ## Tool-Call Budgets Set explicit budgets by task complexity: | Task Type | Tool-Call Budget | Wrap-Up At | |-----------|-----------------|------------| | Quick fix / lookup | 20 calls | 15 | | Bug fix | 30 calls | 25 | | Feature (small) | 50 calls | 40 | | Feature (large) | 80 calls | 65 | | Refactor | 50 calls | 40 | | Exploration / research | 30 calls | 25 | At the wrap-up threshold: commit progress, assess remaining work, decide whether to continue or start fresh. ## One-Pass Coding Discipline For simple-to-medium tasks: 1. **Read all relevant files** including tests first 2. **Understand what tests assert** before coding 3. **Write complete solution in one pass** — not incrementally 4. **Run tests once** — if pass, STOP immediately 5. **If fail**: read the error, fix once, retest 6. **Never iterate** more than twice on the same failure — rethink approach 7. **Never refactor, improve, or polish passing code** ## Task Profiles Switch profiles based on what you're doing: ### Coding Profile - Return code first, explanation after (only if non-obvious) - Simplest working solution, no over-engineering - Read file before modifying — always - No docstrings on unchanged code - No error handling for impossible scenarios - State bug, show fix, stop ### Agent/Pipeline Profile - Structured output only: JSON, bullets, tables - No prose unless targeting a human reader - Every output must be parseable without post-processing - Execute task, do not narrate actions - Never invent file paths, API endpoints, or function names - If unknown: return null or "UNKNOWN", never guess ### Analysis Profile - Lead with finding, context and methodology after - Tables and bullets over prose - Numbers must include units - Never fabricate data points - Summary first (3 bullets max), caveats last ## Read-Before-Write Enforcement Hard rules: 1. **Never write a file you haven't read** in this session 2. **Never re-read a file** already read unless it was modified 3. **Read tests before coding** — understand what passes before writing 4. **Read error output carefully** before attempting a fix ## ASCII-Only Output Use ASCII characters only in all output: - `--` not `—` (em dash) - `"` not `"` `"` (smart quotes) - `'` not `'` `'` (curly apostrophes) - No emoji unless explicitly requested - No Unicode decorators or special characters This ensures clean copy-paste for code and compatibility with downstream systems. ## Measuring Impact Track these metrics to measure token savings: - **Output length**: average words per response (target: 30-50% reduction) - **Tool calls per task**: should stay within budget tier - **Re-read count**: should be near zero - **Write-without-read count**: should be zero - **Iteration cycles**: tests should pass in 1-2 attempts, not 5+ ## Attribution Token efficiency patterns adapted from [drona23/claude-token-efficient](https://github.com/drona23/claude-token-efficient) (MIT).
Analyzes and optimizes context window usage across sessions. Use when context feels bloated, sessions run slow, or approaching compaction limits.
Analyze session token usage and cost patterns. Identify expensive operations and recommend optimizations. Use to understand and reduce session costs.
Specialized debugging agent. Use when facing hard bugs, test failures, or runtime errors that need systematic investigation.
Multi-phase development agent. Research > Plan > Implement with validation gates. Use PROACTIVELY when building features that touch >5 files or require architecture decisions.
Analyze permission denial patterns and generate optimized alwaysAllow/alwaysDeny rules. Use when permission prompts slow down workflow.
Break down complex tasks into implementation plans before writing code. Use when task touches >5 files, requires architecture decisions, or has unclear requirements.
Code review specialist that verifies every finding against actual code before reporting. Use before committing, for PR reviews, or after major changes.
Confidence-gated exploration that assesses readiness before implementation. Scores 0-100 across five dimensions and gives GO/HOLD verdict.