Skip to main content
ClaudeWave
Skill56.9k estrellas del repoactualizado 2d ago

debugging-and-error-recovery

This Claude Code skill provides a systematic debugging framework for identifying and fixing root causes of failures. Use it when tests fail, builds break, runtime behavior diverges from expectations, or unexpected errors occur, situations where structured diagnosis prevents time wasted on guessing and ensures errors don't compound into larger problems.

Instalar en Claude Code
Copiar
git clone --depth 1 https://github.com/addyosmani/agent-skills /tmp/debugging-and-error-recovery && cp -r /tmp/debugging-and-error-recovery/skills/debugging-and-error-recovery ~/.claude/skills/debugging-and-error-recovery
Después abre una sesión nueva de Claude Code; el skill carga automáticamente.

SKILL.md

# Debugging and Error Recovery

## Overview

Systematic debugging with structured triage. When something breaks, stop adding features, preserve evidence, and follow a structured process to find and fix the root cause. Guessing wastes time. The triage checklist works for test failures, build errors, runtime bugs, and production incidents.

## When to Use

- Tests fail after a code change
- The build breaks
- Runtime behavior doesn't match expectations
- A bug report arrives
- An error appears in logs or console
- Something worked before and stopped working

## The Stop-the-Line Rule

When anything unexpected happens:

```
1. STOP adding features or making changes
2. PRESERVE evidence (error output, logs, repro steps)
3. DIAGNOSE using the triage checklist
4. FIX the root cause
5. GUARD against recurrence
6. RESUME only after verification passes
```

**Don't push past a failing test or broken build to work on the next feature.** Errors compound. A bug in Step 3 that goes unfixed makes Steps 4-10 wrong.

## The Triage Checklist

Work through these steps in order. Do not skip steps.

### Step 1: Reproduce

Make the failure happen reliably. If you can't reproduce it, you can't fix it with confidence.

```
Can you reproduce the failure?
├── YES → Proceed to Step 2
└── NO
    ├── Gather more context (logs, environment details)
    ├── Try reproducing in a minimal environment
    └── If truly non-reproducible, document conditions and monitor
```

**When a bug is non-reproducible:**

```
Cannot reproduce on demand:
├── Timing-dependent?
│   ├── Add timestamps to logs around the suspected area
│   ├── Try with artificial delays (setTimeout, sleep) to widen race windows
│   └── Run under load or concurrency to increase collision probability
├── Environment-dependent?
│   ├── Compare Node/browser versions, OS, environment variables
│   ├── Check for differences in data (empty vs populated database)
│   └── Try reproducing in CI where the environment is clean
├── State-dependent?
│   ├── Check for leaked state between tests or requests
│   ├── Look for global variables, singletons, or shared caches
│   └── Run the failing scenario in isolation vs after other operations
└── Truly random?
    ├── Add defensive logging at the suspected location
    ├── Set up an alert for the specific error signature
    └── Document the conditions observed and revisit when it recurs
```

For test failures:
```bash
# Run the specific failing test
npm test -- --grep "test name"

# Run with verbose output
npm test -- --verbose

# Run in isolation (rules out test pollution)
npm test -- --testPathPattern="specific-file" --runInBand
```

### Step 2: Localize

Narrow down WHERE the failure happens:

```
Which layer is failing?
├── UI/Frontend     → Check console, DOM, network tab
├── API/Backend     → Check server logs, request/response
├── Database        → Check queries, schema, data integrity
├── Build tooling   → Check config, dependencies, environment
├── External service → Check connectivity, API changes, rate limits
└── Test itself     → Check if the test is correct (false negative)
```

**Use bisection for regression bugs:**
```bash
# Find which commit introduced the bug
git bisect start
git bisect bad                    # Current commit is broken
git bisect good <known-good-sha> # This commit worked
# Git will checkout midpoint commits; run your test at each
git bisect run npm test -- --grep "failing test"
```

### Step 3: Reduce

Create the minimal failing case:

- Remove unrelated code/config until only the bug remains
- Simplify the input to the smallest example that triggers the failure
- Strip the test to the bare minimum that reproduces the issue

A minimal reproduction makes the root cause obvious and prevents fixing symptoms instead of causes.

### Step 4: Fix the Root Cause

Fix the underlying issue, not the symptom:

```
Symptom: "The user list shows duplicate entries"

Symptom fix (bad):
  → Deduplicate in the UI component: [...new Set(users)]

Root cause fix (good):
  → The API endpoint has a JOIN that produces duplicates
  → Fix the query, add a DISTINCT, or fix the data model
```

Ask: "Why does this happen?" until you reach the actual cause, not just where it manifests.

### Step 5: Guard Against Recurrence

Write a test that catches this specific failure:

```typescript
// The bug: task titles with special characters broke the search
it('finds tasks with special characters in title', async () => {
  await createTask({ title: 'Fix "quotes" & <brackets>' });
  const results = await searchTasks('quotes');
  expect(results).toHaveLength(1);
  expect(results[0].title).toBe('Fix "quotes" & <brackets>');
});
```

This test will prevent the same bug from recurring. It should fail without the fix and pass with it.

### Step 6: Verify End-to-End

After fixing, verify the complete scenario:

```bash
# Run the specific test
npm test -- --grep "specific test"

# Run the full test suite (check for regressions)
npm test

# Build the project (check for type/compilation errors)
npm run build

# Manual spot check if applicable
npm run dev  # Verify in browser
```

## Error-Specific Patterns

### Test Failure Triage

```
Test fails after code change:
├── Did you change code the test covers?
│   └── YES → Check if the test or the code is wrong
│       ├── Test is outdated → Update the test
│       └── Code has a bug → Fix the code
├── Did you change unrelated code?
│   └── YES → Likely a side effect → Check shared state, imports, globals
└── Test was already flaky?
    └── Check for timing issues, order dependence, external dependencies
```

### Build Failure Triage

```
Build fails:
├── Type error → Read the error, check the types at the cited location
├── Import error → Check the module exists, exports match, paths are correct
├── Config error → Check build config files for syntax/schema issues
├── Dependency error → Check package.json, run npm install
└── Environment error → Check Node version, OS compatibility
```