One CLAUDE.md file. Keeps Claude responses terse. Reduces output verbosity on heavy workflows. Drop-in, no code changes.
claude-token-efficient is a single CLAUDE.md configuration file designed to suppress Claude Code's default verbosity and reduce output token consumption across repeated or automated workflows. The file instructs Claude to skip filler phrases like "Sure!" and "Absolutely!", avoid restating questions, omit unsolicited suggestions, and avoid Unicode characters such as em dashes and smart quotes that can break downstream parsers. Users can either drop the file into a project directory for persistent, session-wide enforcement or paste a condensed rule set directly into chat for one-off sessions. The project's own five-prompt benchmark reports a 63% word reduction across the four prompts measured for length, with no signal loss according to the README. In a separate head-to-head run by the authors on an independent benchmark harness, the v8 config cost 17.4% less in total than a competing configuration across three coding challenges, although it lost one of the three. The README is candid about the trade-off: because the file loads into context on every message, net savings only materialize when output volume is high enough to offset the recurring input token cost. That makes it most suitable for automation pipelines, agent loops, and high-volume code generation rather than casual single queries.
- ✓Open-source license (MIT)
- ✓Recently active
- ✓Healthy fork ratio
- ✓Clear description
- ✓Documented (README)
git clone https://github.com/drona23/claude-token-efficient
# claude-token-efficient
> One file. Drop it in your project. Keeps responses terse and can reduce total tokens on output-heavy workflows.
> Note: instruction files add input tokens on every turn. Keep this file short - if it grows too much, it can cost more than it saves.
> Model support: benchmarks were run on Claude only. The rules are model-agnostic and should work on any model that reads context - but results on other models, such as Mistral or local models run through llama.cpp, are untested. Community results welcome.
---
## The Problem
When you use Claude Code, every word Claude generates costs tokens.
Most people never control *how* Claude responds - they just get whatever the model decides to output.
By default, Claude:
- Opens every response with "Sure!", "Great question!", "Absolutely!"
- Ends with "I hope this helps! Let me know if you need anything!"
- Uses em dashes, smart quotes, and other Unicode characters that break parsers
- Restates your question before answering it
- Adds unsolicited suggestions beyond what you asked
- Over-engineers code with abstractions you never requested
- Agrees with incorrect statements ("You're absolutely right!")
**All of this wastes tokens. None of it adds value.**
---
## Two Options
**Option 1: Paste rules in chat (quick start)**
Copy these rules into any new session:
```
Rules: Read files first. Write complete solution. Test once. No over-engineering.
```
Works immediately. No setup. Good for one-off tasks.
**Option 2: Drop CLAUDE.md file (set and forget)**
```
your-project/
└── CLAUDE.md <- one file, zero setup, no code changes
```
Automatic on every message. Better for regular work. More efficient at scale.
Pick based on your workflow. Both work.
---
## How They Compare
| Approach | Setup | Cost | Best For |
|----------|-------|------|----------|
| Rules in chat | None | Higher | Quick sessions, no project |
| CLAUDE.md file | 1 file | Lower | Regular work, pipelines |
---
## When This Helps vs When It Doesn't
**This file works best for:**
- Automation pipelines with high output volume (resume bots, agent loops, code generation)
- Repeated structured tasks where Claude's default verbosity compounds across hundreds of calls
- Teams who need consistent, parseable output format across sessions
**This file is not worth it for:**
- Single short queries - the file loads into context on every message, so on low-output exchanges it is a net token increase
- Casual one-off use - the overhead doesn't pay off at low volume
- Fixing deep failure modes like hallucinated implementations or architectural drift - those require hooks, gates, and mechanical enforcement
- Pipelines using multiple fresh sessions per task - fresh sessions don't carry the CLAUDE.md overhead benefit the same way persistent sessions do
- Parser reliability at scale - if you need guaranteed parseable output, use structured outputs (JSON mode, tool use with schemas) built into the API - that is a more robust solution than prompt-based formatting rules
- Exploratory or architectural work where debate, pushback, and alternatives are the point - the override rule lets you ask for that any time, but if that's your primary workflow this file will feel restrictive
**The honest trade-off:**
The CLAUDE.md file itself consumes input tokens on every message. The savings come from reduced output tokens. The net is only positive when output volume is high enough to offset the persistent input cost. At low usage it costs more than it saves.
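
The trade-off above can be put into numbers. The following is a minimal sketch, not a measurement: the function names are hypothetical, the prices and token counts in the usage lines are placeholder assumptions, and prompt caching is ignored for simplicity.

```python
def net_saving_per_message(file_tokens, output_tokens_saved,
                           input_price_per_mtok, output_price_per_mtok):
    """Dollar saving per message; a negative value means the file costs more than it saves.

    All four arguments are assumptions supplied by the caller:
    - file_tokens: input tokens added by CLAUDE.md on each message
    - output_tokens_saved: output tokens avoided on each message
    - *_price_per_mtok: price in dollars per million tokens
    """
    added_input_cost = file_tokens * input_price_per_mtok / 1_000_000
    avoided_output_cost = output_tokens_saved * output_price_per_mtok / 1_000_000
    return avoided_output_cost - added_input_cost


def break_even_output_tokens(file_tokens, input_price_per_mtok, output_price_per_mtok):
    """Output tokens that must be saved per message for the file to pay for itself."""
    return file_tokens * input_price_per_mtok / output_price_per_mtok


# Placeholder numbers for illustration only: a 500-token file, $3 / $15 per million tokens.
print(break_even_output_tokens(500, 3.0, 15.0))    # 100.0
print(net_saving_per_message(500, 20, 3.0, 15.0))  # negative: a short reply does not cover the file
```

With these placeholder values, a message has to shed at least 100 output tokens before the file pays for itself. This matches the point that short, low-output exchanges come out as a net cost.
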
---
## Benchmark Results
Same 5 prompts, run without CLAUDE.md (baseline) and then with CLAUDE.md (optimized). The table lists the four prompts measured for word count; the fifth (T4) is a format test.
| Test | Baseline | Optimized | Reduction |
|------|----------|-----------|-----------|
| Explain async/await | 180 words | 65 words | 64% |
| Code review | 120 words | 30 words | 75% |
| What is a REST API | 110 words | 55 words | 50% |
| Hallucination correction | 55 words | 20 words | 64% |
| **Total** | **465 words** | **170 words** | **63%** |
**~295 words saved across the 4 measured prompts. Same information. Zero signal loss.**
> **Methodology note:** This is a 5-prompt directional indicator (T1-T3, T5 for word reduction; T4 is a format test), not a statistically controlled study. Claude's output length varies naturally between identical prompts. No variance controls or repeated runs were applied. Treat the 63% as a directional signal for output-heavy use cases, not a precise universal measurement. The CLAUDE.md file itself adds input tokens on every message - net savings only apply when output volume is high enough to offset that persistent cost.
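
The percentages in the table can be re-derived from the word counts. The short sketch below only redoes the arithmetic on the four rows shown. The variable and function names are illustrative, and rounding to whole percent is an assumption about how the table was produced.

```python
# Word counts copied from the benchmark table: (baseline, optimized).
benchmark = {
    "Explain async/await": (180, 65),
    "Code review": (120, 30),
    "What is a REST API": (110, 55),
    "Hallucination correction": (55, 20),
}


def reduction_pct(baseline, optimized):
    """Word reduction as a whole-number percentage of the baseline."""
    return round(100 * (baseline - optimized) / baseline)


per_test = {name: reduction_pct(b, o) for name, (b, o) in benchmark.items()}
total_baseline = sum(b for b, _ in benchmark.values())
total_optimized = sum(o for _, o in benchmark.values())

print(per_test)  # {'Explain async/await': 64, 'Code review': 75, 'What is a REST API': 50, 'Hallucination correction': 64}
print(total_baseline, total_optimized, reduction_pct(total_baseline, total_optimized))  # 465 170 63
```
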
### External benchmark (Issue #1)
An [independent benchmark](https://github.com/adam-s/testing-claude-agent) ran 6 configs across 3 coding challenges (CSV reporter, SQLite window functions, Hono WebSocket counter). All configs passed all tests, so comparison was purely cost-to-green.
We ran our own v8 config head-to-head against C-structured (the previous best) on the same harness, same day, same model:
| Challenge | M-drona23-v8 | C-structured | Winner |
|-----------|-------------|-------------|--------|
| CSV Reporter | $0.244 | $0.282 | v8 |
| SQLite Windows | $0.406 | $0.376 | C-structured |
| WebSocket | $0.285 | $0.473 | v8 |
| **Total** | **$0.935** | **$1.131** | **v8 (-17.4%)** |
The v8 config uses 2 files (7 lines total). The biggest win comes from WebSocket where explicit pattern rules prevent expensive debugging loops.
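
For readers who want to check the totals, here is a small sketch that sums the per-challenge costs from the table. Computed from the rounded figures shown, the difference comes out at about -17.3%. The small gap to the reported -17.4% is presumably a rounding effect in the per-challenge costs; that explanation is an assumption, not something the benchmark states.

```python
# Cost per challenge in dollars, copied from the table: (M-drona23-v8, C-structured).
costs = {
    "CSV Reporter": (0.244, 0.282),
    "SQLite Windows": (0.406, 0.376),
    "WebSocket": (0.285, 0.473),
}

v8_total = round(sum(v8 for v8, _ in costs.values()), 3)
c_total = round(sum(c for _, c in costs.values()), 3)
change_pct = round(100 * (v8_total - c_total) / c_total, 1)
winners = {name: "v8" if v8 < c else "C-structured" for name, (v8, c) in costs.items()}

print(v8_total, c_total, change_pct)  # 0.935 1.131 -17.3
print(winners)
```
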
This repo keeps the root `CLAUDE.md` to a small set of high-impact rules to minimize recurring input overhead.
### At Scale
| Usage | Tokens Saved/Day | Monthly Savings (Sonnet) |
|-------|-----------------|--------------------------|
| 100 prompts/day | ~9,600 tokens | ~$0.86 |
| 1,000 prompts/day | ~96,000 tokens | ~$8.64 |
| 3 projects combined | ~288,000 tokens | ~$25.92 |
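
The table does not state its inputs, but the figures are consistent with about 96 tokens saved per prompt, a 30-day month, and a price of $3 per million tokens. The sketch below reproduces the rows under those inferred assumptions. It also assumes the third row means three projects at 1,000 prompts/day each. Substitute your own measurements and current pricing before relying on it.

```python
# All three constants are inferred from the table, not stated in it.
TOKENS_SAVED_PER_PROMPT = 96   # 9,600 tokens / 100 prompts
PRICE_PER_MTOK = 3.00          # dollars per million tokens that reproduces the table
DAYS_PER_MONTH = 30


def monthly_savings(prompts_per_day, projects=1):
    """Return (tokens saved per day, dollars saved per month) under the assumptions above."""
    tokens_per_day = prompts_per_day * projects * TOKENS_SAVED_PER_PROMPT
    dollars = tokens_per_day * DAYS_PER_MONTH * PRICE_PER_MTOK / 1_000_000
    return tokens_per_day, round(dollars, 2)


print(monthly_savings(100))              # (9600, 0.86)
print(monthly_savings(1000))             # (96000, 8.64)
print(monthly_savings(1000, projects=3)) # (288000, 25.92)
```
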
---
## Before vs After
**Without CLAUDE.md - Code Review:**
```
Great question! I can see you're working with a loop. Let me take a look at that.
That's a thoughtful piece of code! However, I notice there's actually a subtle bug
here that's worth pointing out. The condition i <= arr.length should be i < arr.length.
Using <= means the loop will run one extra iteration... You might also want to consider
using modern array methods like forEach, map, or for...of which are cleaner...
Great catch on reviewing this - it's always good practice to double-check loop
boundaries! I hope this review was helpful. Feel free to share more code anytime!
```
**120 words.**
**With CLAUDE.md - Same prompt:**
```
Bug: <= causes an off-by-one error. On the final iteration arr[arr.length] is undefined.
Fix:
for (let i = 0; i < arr.length; i++)
```
**30 words. Same fix. 75% fewer words.**
---
## What It Fixes
| # | Problem | Fix |
|---|---------|-----|
| 1 | Starts coding without context | Think first; read files before writing |
| 2 | Verbose responses | Keep output concise |
| 3 | Rewrites large files unnecessarily | Prefer targeted edits |
| 4 | Re-reading the same files | Read each file once unless it changed |
| 5 | Declaring done without validation | Run tests before finishing |
| 6 | Sycophantic chatter | No flattering preamble/closing fluff |
| 7 | Over-engineered solutions | Favor simple direct fixes |
| 8 | Prompt conflict confusion | User instructions always override |
---
## Pro Tips From the Community
**Scope rules to your actual failure modes, not generic ones.**
Generic rules like "be concise" help but the real wins come from targeting specific failures you've actually hit. For example if Claude silently swallows errors in your pipeline, add a rule like: "when a step fails, stop immediately and report the full error with traceback before attempting any fix." Specific beats generic every time.
**CLAUDE.md files compose - use that.**
Claude reads multiple CLAUDE.md files at once - global (~/.claude/CLAUDE.md), project-level, and subdirectory-level. This means:
- Keep general preferences (tone, format, ASCII rules) in your global file
- Keep project-specific constraints ("never modify /config without confirmation") at the project level
- Keep task-specific rules in subdirectory files
This avoids bloating any single file and keeps rules close to where they apply.
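
To see which files would apply to a given working directory, a helper like the one below can list them from most general to most specific. It is a sketch under assumptions: `candidate_claude_files` is a hypothetical name, and it only enumerates the three locations named above. It does not reproduce Claude Code's actual discovery or precedence logic, which this README does not describe.

```python
from pathlib import Path


def candidate_claude_files(project_root: Path, workdir: Path, home: Path) -> list[Path]:
    """List existing CLAUDE.md files: global first, then project root, then subdirectories.

    Raises ValueError if workdir is not inside project_root.
    """
    project_root = project_root.resolve()
    workdir = workdir.resolve()

    # Global file, e.g. ~/.claude/CLAUDE.md.
    candidates = [home / ".claude" / "CLAUDE.md"]

    # Project root, then each directory on the way down to the working directory.
    current = project_root
    candidates.append(current / "CLAUDE.md")
    for part in workdir.relative_to(project_root).parts:
        current = current / part
        candidates.append(current / "CLAUDE.md")

    # Keep only the files that actually exist.
    return [path for path in candidates if path.is_file()]


if __name__ == "__main__":
    for path in candidate_claude_files(Path.cwd(), Path.cwd(), Path.home()):
        print(path)
```

Listing the files this way makes it easier to spot rules duplicated across levels, which add to the per-message input cost without adding any benefit.
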
---
## Profiles
Different project types need different levels of compression.
Pick the base file + a profile, or use the base alone.
| Profile | Best For |
|---------|----------|
| `CLAUDE.md` | Universal - works for any project |
| `profiles/CLAUDE.benchmark.md` | Token-to-green coding benchmarks |
| `profiles/CLAUDE.coding.md` | Dev projects, code review, debugging |
| `profiles/CLAUDE.agents.md` | Automation pipelines, multi-agent systems |
| `profiles/CLAUDE.analysis.md` | Data analysis, research, reporting |
### Versioned Configuration Sets
The `profiles/` directory also contains three versioned configuration sets representing different optimization strategies. Pick the one that matches your workflow:
| Version | Strategy | Tool Budget | Best For |
|---------|----------|-------------|----------|
| `J-drona23-v5` | Multi-file structured | 50 calls | Complex projects needing detailed workflow rules and agent definitions |
| `K-drona23-v6` | One-shot execution | 50 calls | Tasks that should complete in a single pass with minimal iteration |
| `M-drona23-v8` | Ultra-lean minimum-turn | 20 calls | Cost-sensitive pipelines where every tool call counts |
**How to choose:**
- Start with **v5** if you need structured multi-step workflows with clear agent protocols
- Use **v6** if you want faster execution with strict "done means done" rules (no polishing passing code)
- Use **v8** only if you need maximum cost efficiency and your tasks are simple enough for 20 tool calls
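
The tool budget is the one hard constraint in the table, so it can serve as a first filter before weighing strategy. A minimal sketch follows. The budgets are copied from the table, while the function name and the idea of estimating tool calls up front are illustrative assumptions.

```python
# Tool-call budgets copied from the versioned configuration table.
TOOL_BUDGETS = {
    "J-drona23-v5": 50,
    "K-drona23-v6": 50,
    "M-drona23-v8": 20,
}


def configs_within_budget(expected_tool_calls):
    """Return the configuration sets whose tool budget covers the expected number of calls."""
    return [name for name, budget in TOOL_BUDGETS.items() if expected_tool_calls <= budget]


print(configs_within_budget(15))  # ['J-drona23-v5', 'K-drona23-v6', 'M-drona23-v8']
print(configs_within_budget(35))  # ['J-drona23-v5', 'K-drona23-v6']
```
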
### Two Ways to Apply Rules
**Option A: CLAUDE.md file (recommended for regular use)**
- Drop file in project root
- Automatic on every message
- Cached efficiently
- Better for repeated tasks, pipelines
**Option B: Rules in prompt (for one-off sessions)**
- Copy-paste rules into chat
- Works w