Skip to main content
ClaudeWave
Skill89 estrellas del repoactualizado 1mo ago

debug-phase

Systematic debugging techniques including error classification, root cause analysis (5 Whys), reproduction strategies, and error documentation. Use when debugging errors, investigating failures, analyzing stack traces, fixing bugs, or documenting errors in error-log.md. (project)

Instalar en Claude Code
Copiar
git clone --depth 1 https://github.com/marcusgoll/Spec-Flow /tmp/debug-phase && cp -r /tmp/debug-phase/.claude/skills/debug-phase ~/.claude/skills/debug-phase
Después abre una sesión nueva de Claude Code; el skill carga automáticamente.

SKILL.md

<objective>
Provide systematic debugging techniques for error investigation, root cause analysis, and fix implementation. Ensures errors are properly classified, root causes identified (not symptoms), and solutions documented to prevent recurrence.
</objective>

<quick_start>
Debug errors systematically:

1. Reproduce error consistently (100% reliable reproduction)
2. Classify by type (syntax/runtime/logic/integration/performance) and severity (critical/high/medium/low)
3. Isolate root cause using binary search, logging, breakpoints, or 5 Whys
4. Implement fix with failing test first (TDD)
5. Add regression tests to prevent recurrence
6. Document in error-log.md with ERR-XXXX ID

**Key principle**: Fix root causes, not symptoms. Use 5 Whys to drill down.
</quick_start>

<prerequisites>
Before debugging:
- Error can be reproduced (even if intermittent, identify conditions)
- Logs available (application logs, stack traces, network logs)
- Development environment set up for testing

Check error-log.md for similar historical errors before starting.
</prerequisites>

<workflow>
<step number="1">
**Search for similar errors**

Check error-log.md for past occurrences:

```bash
# Search by error message
grep -i "connection timeout" error-log.md

# Search by component
grep "Component: StudentProgressService" error-log.md

# Search by error type
grep "Type: Integration" error-log.md
```

If similar error found:

- Review previous fix (workaround or root cause?)
- Check if error recurred (>2 occurrences = need permanent fix)
- Note patterns (same component, same conditions, timing)

**If recurring error**: Prioritize permanent fix over workaround.
</step>

<step number="2">
**Reproduce error consistently**

Create minimal reproduction:

1. Gather context: error message, stack trace, timestamp, user actions, environment
2. Create minimal test case that triggers error 100% of time
3. If intermittent: identify conditions (timing, data state, race conditions)

Example reproduction:

```bash
# API error
curl -X POST http://localhost:3000/api/endpoint \
  -H "Content-Type: application/json" \
  -d '{"field": "value"}'

# Frontend error
# 1. Navigate to /dashboard
# 2. Click "Load Data" button
# 3. Error appears in console
```

**Validation**: Can trigger error reliably before proceeding.
</step>

<step number="3">
**Classify error**

**By type**:

- **Syntax**: Code doesn't compile/parse (typos, missing brackets, linting errors)
- **Runtime**: Code runs but crashes (null pointer, type error, uncaught exception)
- **Logic**: Code runs but wrong result (calculation error, wrong branch taken)
- **Integration**: External dependency fails (API timeout, database connection, service unavailable)
- **Performance**: Code works but too slow (timeout, memory leak, N+1 queries)

**By severity**:

- **Critical**: Data loss, security breach, total system failure
- **High**: Feature broken, blocks users from core functionality
- **Medium**: Feature degraded, workaround exists
- **Low**: Minor UX issue, cosmetic, no functional impact

Example classification:

```markdown
Type: Integration (API call to external service fails)
Severity: High (dashboard doesn't load, blocks teachers)
Component: StudentProgressService.fetchExternalData()
Frequency: 30% of requests (intermittent)
```

See references/error-classification.md for detailed matrix.
</step>

<step number="4">
**Isolate root cause**

Use systematic techniques:

**Binary search** (for large codebases):

- Add logging at midpoint of suspected code
- If error before midpoint → investigate first half
- If error after midpoint → investigate second half
- Repeat until narrowed to specific function

**Increase logging**:

```python
# Add debug logs around suspected area
logger.debug(f"Before API call: student_id={student_id}, params={params}")
response = api.fetch_data(student_id)
logger.debug(f"After API call: status={response.status}, data_len={len(response.data)}")
```

**Use breakpoints** (interactive debugging):

```bash
# Python
python -m pdb script.py

# Node.js
node inspect server.js

# Browser (Chrome DevTools)
# Set breakpoints, inspect variables, step through code
```

**Check assumptions**:

- Is variable the value you expect? (log it, don't assume)
- Is function being called when you expect? (add entry/exit logs)
- Are external dependencies available? (health check endpoints)

**Apply 5 Whys** (drill down to root cause):

```markdown
1. Why did dashboard fail to load?
   → API call to external service timed out

2. Why did API call timeout?
   → Request took >30 seconds (timeout limit)

3. Why did request take >30 seconds?
   → External service response time was 45 seconds

4. Why was external service so slow?
   → Requesting too much data (1000 records instead of 10)

5. Why requesting 1000 records?
   → Missing pagination parameter in API call

Root cause: Missing pagination parameter causes over-fetching
```

**Validation**: Root cause identified (not just symptoms). Can explain why error occurred.
</step>

<step number="5">
**Implement fix (TDD approach)**

1. Write failing test first:

```python
def test_fetch_external_data_with_pagination():
    """Test that API call includes pagination parameter"""
    service = StudentProgressService()
    result = service.fetchExternalData(student_id=123)

    assert len(result) <= 10  # Max 10 records per page
    assert result.has_next_page  # Pagination metadata present
```

2. Implement fix:

```python
# Before (broken)
def fetchExternalData(self, student_id):
    return api.get(f"/students/{student_id}/data")

# After (fixed)
def fetchExternalData(self, student_id, page_size=10):
    return api.get(
        f"/students/{student_id}/data",
        params={"page_size": page_size}
    )
```

3. Verify fix:

```bash
# Run test (should pass)
pytest tests/test_student_progress_service.py

# Manually test reproduction steps (error should not occur)
```

**Validation**: Fix addresses root cause, all tests pass, error n