test-driven-development
# Test-Driven Development This Claude Code skill implements the test-driven development (TDD) cycle for writing and maintaining code by following a three-step process: writing a failing test first (RED), writing minimal code to pass the test (GREEN), and refactoring while keeping tests passing (REFACTOR). Use this skill when implementing new logic, fixing bugs using the Prove-It Pattern, modifying existing functionality, or making any change that could affect behavior. For bug fixes specifically, write a test that reproduces the bug before attempting the fix, ensuring code changes are validated and documented through executable tests rather than assumptions.
git clone --depth 1 https://github.com/addyosmani/agent-skills /tmp/test-driven-development && cp -r /tmp/test-driven-development/skills/test-driven-development ~/.claude/skills/test-driven-developmentSKILL.md
# Test-Driven Development
## Overview
Write a failing test before writing the code that makes it pass. For bug fixes, reproduce the bug with a test before attempting a fix. Tests are proof — "seems right" is not done. A codebase with good tests is an AI agent's superpower; a codebase without tests is a liability.
## When to Use
- Implementing any new logic or behavior
- Fixing any bug (the Prove-It Pattern)
- Modifying existing functionality
- Adding edge case handling
- Any change that could break existing behavior
**When NOT to use:** Pure configuration changes, documentation updates, or static content changes that have no behavioral impact.
**Related:** For browser-based changes, combine TDD with runtime verification using Chrome DevTools MCP — see the Browser Testing section below.
## The TDD Cycle
```
RED GREEN REFACTOR
Write a test Write minimal code Clean up the
that fails ──→ to make it pass ──→ implementation ──→ (repeat)
│ │ │
▼ ▼ ▼
Test FAILS Test PASSES Tests still PASS
```
### Step 1: RED — Write a Failing Test
Write the test first. It must fail. A test that passes immediately proves nothing.
```typescript
// RED: This test fails because createTask doesn't exist yet
describe('TaskService', () => {
it('creates a task with title and default status', async () => {
const task = await taskService.createTask({ title: 'Buy groceries' });
expect(task.id).toBeDefined();
expect(task.title).toBe('Buy groceries');
expect(task.status).toBe('pending');
expect(task.createdAt).toBeInstanceOf(Date);
});
});
```
### Step 2: GREEN — Make It Pass
Write the minimum code to make the test pass. Don't over-engineer:
```typescript
// GREEN: Minimal implementation
export async function createTask(input: { title: string }): Promise<Task> {
const task = {
id: generateId(),
title: input.title,
status: 'pending' as const,
createdAt: new Date(),
};
await db.tasks.insert(task);
return task;
}
```
### Step 3: REFACTOR — Clean Up
With tests green, improve the code without changing behavior:
- Extract shared logic
- Improve naming
- Remove duplication
- Optimize if necessary
Run tests after every refactor step to confirm nothing broke.
## The Prove-It Pattern (Bug Fixes)
When a bug is reported, **do not start by trying to fix it.** Start by writing a test that reproduces it.
```
Bug report arrives
│
▼
Write a test that demonstrates the bug
│
▼
Test FAILS (confirming the bug exists)
│
▼
Implement the fix
│
▼
Test PASSES (proving the fix works)
│
▼
Run full test suite (no regressions)
```
**Example:**
```typescript
// Bug: "Completing a task doesn't update the completedAt timestamp"
// Step 1: Write the reproduction test (it should FAIL)
it('sets completedAt when task is completed', async () => {
const task = await taskService.createTask({ title: 'Test' });
const completed = await taskService.completeTask(task.id);
expect(completed.status).toBe('completed');
expect(completed.completedAt).toBeInstanceOf(Date); // This fails → bug confirmed
});
// Step 2: Fix the bug
export async function completeTask(id: string): Promise<Task> {
return db.tasks.update(id, {
status: 'completed',
completedAt: new Date(), // This was missing
});
}
// Step 3: Test passes → bug fixed, regression guarded
```
## The Test Pyramid
Invest testing effort according to the pyramid — most tests should be small and fast, with progressively fewer tests at higher levels:
```
╱╲
╱ ╲ E2E Tests (~5%)
╱ ╲ Full user flows, real browser
╱──────╲
╱ ╲ Integration Tests (~15%)
╱ ╲ Component interactions, API boundaries
╱────────────╲
╱ ╲ Unit Tests (~80%)
╱ ╲ Pure logic, isolated, milliseconds each
╱──────────────────╲
```
**The Beyonce Rule:** If you liked it, you should have put a test on it. Infrastructure changes, refactoring, and migrations are not responsible for catching your bugs — your tests are. If a change breaks your code and you didn't have a test for it, that's on you.
### Test Sizes (Resource Model)
Beyond the pyramid levels, classify tests by what resources they consume:
| Size | Constraints | Speed | Example |
|------|------------|-------|---------|
| **Small** | Single process, no I/O, no network, no database | Milliseconds | Pure function tests, data transforms |
| **Medium** | Multi-process OK, localhost only, no external services | Seconds | API tests with test DB, component tests |
| **Large** | Multi-machine OK, external services allowed | Minutes | E2E tests, performance benchmarks, staging integration |
Small tests should make up the vast majority of your suite. They're fast, reliable, and easy to debug when they fail.
### Decision Guide
```
Is it pure logic with no side effects?
→ Unit test (small)
Does it cross a boundary (API, database, file system)?
→ Integration test (medium)
Is it a critical user flow that must work end-to-end?
→ E2E test (large) — limit these to critical paths
```
## Writing Good Tests
### Test State, Not Interactions
Assert on the *outcome* of an operation, not on which methods were called internally. Tests that verify method call sequences break when you refactor, even if the behavior is unchanged.
```typescript
// Good: Tests what the function does (state-based)
it('returns tasks sorted by creation date, newest first', async () => {
const tasks = await listTasks({ sortBy: 'createdAt', sortOrder: 'desc' });
expect(tasks[0].createdAt.getTime())
.toBeGreaterThan(tasks[1].createdAt.getTime());
});
// Bad: Tests how the function works internally (interaction-based)
it('calls db.query with ORDER BY created_at DESC', async () => {Implement tasks incrementally — build, test, verify, commit. Add "auto" to run the whole plan in one approved pass.
Simplify code for clarity and maintainability — reduce complexity without changing behavior
Break work into small verifiable tasks with acceptance criteria and dependency ordering
Conduct a five-axis code review — correctness, readability, architecture, security, performance
Run the pre-launch checklist via parallel fan-out to specialist personas, then synthesize a go/no-go decision
Start spec-driven development — write a structured specification before writing code
Run TDD workflow — write failing tests, implement, verify. For bugs, use the Prove-It pattern.
Senior code reviewer that evaluates changes across five dimensions — correctness, readability, architecture, security, and performance. Use for thorough code review before merge.