bug-localization

The bug-localization Claude Code skill analyzes error messages, stack traces, failing tests, and source code patterns to pinpoint the precise location of bugs within applications and systems. Use this skill when debugging applications, investigating test failures, analyzing crash reports, tracing runtime issues, or performing root cause analysis to identify buggy functions, classes, files, or modules with confidence rankings and supporting evidence.

Repository: Skills-4-SE

Install in Claude Code

git clone --depth 1 https://github.com/ArabelaTso/Skills-4-SE /tmp/bug-localization && cp -r /tmp/bug-localization/skills/bug-localization ~/.claude/skills/bug-localization

Then start a new Claude Code session; the skill loads automatically.

Definition

SKILL.md

# Bug Localization

Precisely identify the location of bugs in source code by analyzing error messages, stack traces, failing tests, and code patterns. Provides ranked suspect locations with confidence scores and evidence.

## Core Capabilities

### 1. Error Analysis

Extract information from error sources:
- **Stack traces** - Parse and analyze call stacks
- **Error messages** - Interpret exception details
- **Failing tests** - Analyze test failures and assertions
- **Crash reports** - Process crash dumps and core files
- **Log messages** - Trace execution through logs
- **Debugger output** - Interpret breakpoint and watch data

### 2. Code Analysis

Examine code for bug indicators:
- **Data flow** - Trace variables and values
- **Control flow** - Analyze execution paths
- **Type mismatches** - Detect type-related issues
- **Null/undefined access** - Find potential null dereferences
- **Boundary violations** - Detect array/buffer overflows
- **Concurrency issues** - Identify race conditions

### 3. Suspect Ranking

Prioritize likely bug locations:
- **Confidence scores** - Rank suspects by likelihood
- **Evidence strength** - Quantify supporting evidence
- **Historical data** - Consider past bug patterns
- **Code complexity** - Factor in cyclomatic complexity
- **Recent changes** - Weigh recent modifications
- **Code churn** - Consider frequently modified code

### 4. Investigation Guidance

Provide actionable next steps:
- **Verification steps** - How to confirm the bug
- **Debugging strategies** - Where to set breakpoints
- **Test cases** - Tests to reproduce the bug
- **Related code** - Other potentially affected areas

## Bug Localization Workflow

### Step 1: Gather Evidence

Collect all available information:

**From stack trace:**
```
Traceback (most recent call last):
  File "app.py", line 45, in process_order
    total = calculate_total(items)
  File "billing.py", line 23, in calculate_total
    price = item['price'] * item['quantity']
KeyError: 'price'
```

**Extract:**
- Error type: `KeyError`
- Missing key: `'price'`
- Exception location: `billing.py:23`
- Call chain: `app.py:45` → `billing.py:23`
- Function: `calculate_total`
- Context: Iterating over `items`, where each `item` is indexed like a dict
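
The extraction above can be sketched in a few lines of Python. This is a minimal illustration, not the skill's actual implementation: the `Frame` class and `parse_traceback` helper are hypothetical names, and the regular expression assumes CPython's standard `File "...", line N, in func` frame format.

```python
import re
from dataclasses import dataclass

# Assumes CPython's standard frame line: File "path", line N, in func
FRAME_RE = re.compile(
    r'^\s*File "(?P<file>[^"]+)", line (?P<line>\d+), in (?P<func>\S+)'
)


@dataclass
class Frame:
    """One stack frame (hypothetical helper type for this sketch)."""
    file: str
    line: int
    func: str


def parse_traceback(text):
    """Return (frames, error_type, error_message) from a traceback string."""
    frames = []
    error_type, error_message = None, None
    for raw in text.splitlines():
        match = FRAME_RE.match(raw)
        if match:
            frames.append(Frame(match["file"], int(match["line"]), match["func"]))
        elif raw and not raw[0].isspace() and not raw.startswith("Traceback"):
            # Unindented line after the frames: "ErrorType: message"
            error_type, _, error_message = raw.partition(": ")
    return frames, error_type, error_message


TRACE = """Traceback (most recent call last):
  File "app.py", line 45, in process_order
    total = calculate_total(items)
  File "billing.py", line 23, in calculate_total
    price = item['price'] * item['quantity']
KeyError: 'price'
"""

frames, error_type, error_message = parse_traceback(TRACE)
# Frames are listed outermost first, so the last one is the exception location
call_chain = " -> ".join(f"{f.file}:{f.line}" for f in frames)
print(error_type, error_message, call_chain)
```

The last frame in the list is the exception location, which is why `billing.py:23` becomes the first suspect in the later steps.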

**From failing test:**
```
FAILED tests/test_auth.py::test_login_with_valid_credentials
AssertionError: assert False is True
Expected: User logged in successfully
Actual: Login failed with invalid credentials
```

**Extract:**
- Test file: `tests/test_auth.py`
- Test function: `test_login_with_valid_credentials`
- Failure type: Assertion mismatch
- Expected behavior: Successful login
- Actual behavior: Failed login
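
As a rough sketch, the test file and test function can be pulled from the `FAILED` line with a regular expression. The helper name `parse_failed_line` is hypothetical, and the pattern assumes the `FAILED path::name` shape shown above.

```python
import re

# Assumes the "FAILED <file>::<test>" shape shown in the example output
FAILED_RE = re.compile(r"^FAILED (?P<file>[^:\s]+)::(?P<test>\S+)")


def parse_failed_line(line):
    """Return the test file and test id from a FAILED line, or None."""
    match = FAILED_RE.match(line)
    if match is None:
        return None
    # For class-based tests the id keeps its "Class::method" form
    return {"test_file": match["file"], "test_function": match["test"]}


summary = parse_failed_line(
    "FAILED tests/test_auth.py::test_login_with_valid_credentials"
)
print(summary)
```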

### Step 2: Analyze Error Context

Understand what caused the error:

**For KeyError example:**
- **Direct cause**: Accessing non-existent 'price' key
- **Root causes** (hypotheses):
  1. Item dict missing 'price' field
  2. Key name mismatch ('price' vs 'Price')
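  3. Item has an unexpected shape, such as a different dict schema (a `None` or non-subscriptable item would raise `TypeError`, not `KeyError`)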
  4. Data corruption in item dict
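
One way to tell the KeyError hypotheses apart is to inspect a captured `item` directly. The sketch below is illustrative only; `diagnose_missing_key` is a hypothetical helper, not part of the skill.

```python
def diagnose_missing_key(item, key):
    """Classify why item[key] would fail, mirroring the hypotheses above."""
    if not isinstance(item, dict):
        # Indexing None or another non-mapping raises TypeError, not KeyError
        return f"wrong type: {type(item).__name__}"
    if key in item:
        return "key present"
    # Look for the same name with different capitalization
    near = [k for k in item if isinstance(k, str) and k.lower() == key.lower()]
    if near:
        return f"key name mismatch: found {near[0]!r}"
    return f"missing field: available keys are {sorted(map(str, item))}"


print(diagnose_missing_key({"Price": 9.99, "quantity": 2}, "price"))
print(diagnose_missing_key({"quantity": 2}, "price"))
print(diagnose_missing_key(None, "price"))
```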

**For login test failure:**
- **Direct cause**: Login returned False
- **Root causes** (hypotheses):
  1. Credential validation logic incorrect
  2. Database query failing
  3. Password hashing mismatch
  4. Session creation failure

### Step 3: Locate Suspect Code

Identify likely buggy locations:

**Primary suspects (KeyError):**
```
1. billing.py:23 (95% confidence)
   - Direct location of exception
   - Line: price = item['price'] * item['quantity']
   - Issue: No validation before dict access

2. app.py:40-45 (70% confidence)
   - Calls calculate_total with items
   - Possible: Items data structure incorrect
   - Need to verify items content

3. Data source (50% confidence)
   - Where items are created/loaded
   - Possible: Missing field in data
   - Check database schema or API response
```

**Code locations to examine:**
```python
# billing.py:20-25 (Primary suspect)
def calculate_total(items):
    total = 0
    for item in items:
        price = item['price'] * item['quantity']  # Line 23 - BUG HERE
        total += price
    return total

# app.py:40-45 (Secondary suspect)
def process_order(order_id):
    order = get_order(order_id)
    items = order.get('items', [])
    total = calculate_total(items)  # Line 45
    return total
```

### Step 4: Rank Suspects

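Assign each suspect a confidence score. Scores are independent estimates of how likely a location is involved, so they do not need to sum to 100%: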

**Ranking factors:**
- **Stack trace depth** - Closer to exception = higher confidence
- **Error message** - Directly mentioned code = higher
- **Code complexity** - More complex = more likely
- **Recent changes** - Recently modified = higher
- **Test coverage** - Low coverage = higher risk

**Example ranking:**
```
Rank 1: billing.py:23 in calculate_total() - 95%
  Evidence:
  - Direct exception location
  - No null/existence check before dict access
  - Simple fix: Add key validation

Rank 2: app.py:45 in process_order() - 70%
  Evidence:
  - Calls buggy function
  - items might be malformed
  - Check: order.get('items') might return bad data

Rank 3: models.py:78 in get_order() - 50%
  Evidence:
  - Data source for items
  - Possible missing fields in database
  - Check: Database schema and migrations

Rank 4: api.py:112 in create_order() - 30%
  Evidence:
  - Creates order data
  - Might not include all required fields
  - Check: API contract validation
```
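
A ranking like this could be produced by combining the factors into a single score. The sketch below is an assumption-heavy illustration: the `Suspect` fields, the weights, and the per-suspect flag values are all hypothetical, and the resulting numbers are not meant to reproduce the percentages above.

```python
from dataclasses import dataclass


@dataclass
class Suspect:
    location: str
    distance: int            # estimated call-chain distance from the raising frame (0 = raising frame)
    in_error_message: bool   # code directly implicated by the error message
    recently_changed: bool
    low_test_coverage: bool


def score(suspect):
    """Combine ranking factors using hypothetical, illustrative weights."""
    value = 0.5 / (1 + suspect.distance)  # closer to the exception scores higher
    value += 0.2 if suspect.in_error_message else 0.0
    value += 0.2 if suspect.recently_changed else 0.0
    value += 0.1 if suspect.low_test_coverage else 0.0
    return round(value, 2)


# Flag values below are made up for the example
suspects = [
    Suspect("app.py:45", 1, False, False, False),
    Suspect("billing.py:23", 0, True, False, True),
    Suspect("models.py:78", 2, False, False, False),
]

ranked = sorted(suspects, key=score, reverse=True)
for rank, suspect in enumerate(ranked, start=1):
    print(f"Rank {rank}: {suspect.location} - {score(suspect)}")
```

Keeping each factor as an explicit term makes it easy to see which evidence moved a suspect up or down the list.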

### Step 5: Provide Investigation Plan

Guide debugging efforts:

**Immediate actions:**
1. Add validation in `billing.py:23`
2. Add logging before line 23 to inspect `item`
3. Check what `items` contains at `app.py:45`
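
The first two immediate actions might look like the following sketch. It assumes each item is a dict and that `price` and `quantity` are the only required keys; the logger name and the choice of `ValueError` are illustrative, not prescribed by the skill.

```python
import logging

logger = logging.getLogger("billing")  # hypothetical logger name

REQUIRED_KEYS = ("price", "quantity")


def calculate_total(items):
    """Sum price * quantity, failing loudly when an item is malformed."""
    total = 0
    for index, item in enumerate(items):
        missing = [key for key in REQUIRED_KEYS if key not in item]
        if missing:
            # Log the offending item so the upstream data source can be traced
            logger.error("Item %d is missing %s: %r", index, missing, item)
            raise ValueError(f"item {index} is missing required keys: {missing}")
        total += item["price"] * item["quantity"]
    return total


print(calculate_total([{"price": 2, "quantity": 3}]))
```

Raising a descriptive error keeps the failure visible while pointing at the malformed item rather than at the arithmetic on line 23.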

**Verification steps:**
1. Add print: `print(f"Item: {item}")` before line 23
2. Re-run the failing code path (or the failing test, if one exists)
3. Check if 'price' exists in item dict
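
A small reproduction test can confirm the diagnosis before any fix is applied. The sketch below copies `calculate_total` as shown in Step 3 so it runs on its own; in a real project the function would be imported from `billing.py` instead, and the test name is hypothetical.

```python
def calculate_total(items):
    # Copy of the suspect function from Step 3, unchanged
    total = 0
    for item in items:
        price = item['price'] * item['quantity']
        total += price
    return total


def test_missing_price_reproduces_keyerror():
    """An item without 'price' should trigger the reported KeyError."""
    items = [{"quantity": 2}]
    try:
        calculate_total(items)
    except KeyError as exc:
        assert exc.args == ("price",)
    else:
        raise AssertionError("expected KeyError for missing 'price'")


if __name__ == "__main__":
    test_missing_price_reproduces_keyerror()
    print("reproduced KeyError: 'price'")
```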

**Long-term fixes:**
1. Add schema validation for items
2. Use type hints and static analysis
3. Add integration test for full order flow

## Localization Patterns

### Pattern 1: Stack Trace Analysis

**Error:**
```
Traceback (most recent call last):
  File "main.py", line 100, in run
    result = processor.execute()
  File "processor.py", line 45, in execute
```
