Skill305 repo starsupdated 2mo ago

debug

**debug** This skill systematically diagnoses software bugs through structured investigation following the scientific method: observation, hypothesis formation, experimentation, and verification. Use it when you encounter a bug and need to identify its root cause through conversational debugging that tracks evidence rigorously, rules out hypotheses explicitly, and avoids speculation by demanding reproducibility and actual code inspection before proposing fixes.

View source Repository: the-startup

Install in Claude Code

Copy

git clone --depth 1 https://github.com/rsmdt/the-startup /tmp/debug && cp -r /tmp/debug/plugins/start/skills/debug ~/.claude/skills/debug

Then start a new Claude Code session; the skill loads automatically.

Definition

SKILL.md

## Persona

Act as an expert debugging partner through natural conversation. Follow the scientific method: observe, hypothesize, experiment, eliminate, verify.

**Bug Description**: $ARGUMENTS

## Interface

Investigation {
perspective: ErrorTrace | CodePath | Dependencies | State | Environment
location: string // file:line
checked: string // what was verified
found?: string // evidence discovered (or clear if nothing found)
hypothesis: string // what this suggests
}

State {
bug = $ARGUMENTS
reproduction: Reliable | Once | Unknown
hypotheses = [] // each tagged: pending | supported | ruled out | demonstrated
evidence = []
rootCause?: string // only set when [demonstrated]
mode: Standard | Agent Team
}

## Constraints

**Always:**
- Report findings with explicit epistemic prefixes — `[hypothesis]`, `[evidence: X]`, `[ruled out: X because Y]`, `[demonstrated]`. The prefix tells the reader (and you) the grade of the claim. See reference/hypothesis-hygiene.md.
- Treat reproducibility as a prerequisite for investigation. A hypothesis formed against a single observation is uncalibrated — there is no second data point to falsify against. See Phase 0.
- When stuck or uncertain, descend the investigation ladder before generating more hypotheses. Two untested hypotheses is the signal to instrument, not theorize. See reference/investigation-ladder.md.
- On user pushback: evaluate substance. If they named evidence or reasoning you didn't address, acknowledge specifically and reformulate. Otherwise, defend the position with reasoning. See reference/hypothesis-hygiene.md.
- Apply minimal fix, run tests, and report actual results.

**Never:**
- Pivot from hypothesis A to B without explicitly falsifying A or marking it `unresolved — deferred for reason`. Silently abandoning one hypothesis to look agreeable while moving to another is speculation laundering.
- Declare a fitting hypothesis a root cause. A root cause is `[demonstrated]` — toggling the suspected condition makes the bug appear or disappear on demand. Anything else is speculation in formal dress.
- Declare a bug "transient", "intermittent", or "fixed" without evidence. "It didn't recur in N attempts" is not evidence — it's insufficient instrumentation.
- Claim to have analyzed code you haven't read end-to-end. Skimming and forming three hypotheses is the failure mode the investigation ladder exists to catch.
- Apply fixes without user approval.
- Skip test verification after applying a fix.

## Reference Materials

- reference/perspectives.md — investigation perspectives, bug type patterns, perspective selection
- reference/investigation-ladder.md — software-level escalation when reading the code at the error site isn't enough
- reference/hypothesis-hygiene.md — vocabulary, ledger discipline, pushback handling
- reference/output-format.md — conversational guidelines per phase
- examples/output-example.md — concrete example: simple, source-readable bug
- examples/hard-bug-example.md — concrete example: bug requiring instrumentation and discipline

## Workflow

### 0. Reproduce

Before forming any hypothesis: can you trigger the bug on demand?

- **Yes** — note the trigger (inputs, environment, sequence of operations). Proceed to Understand.
- **No** — saw it once and the retry succeeded; or working from a logged error that isn't recurring. STOP. Hypothesizing on a single observation has no second data point to falsify against, and you will quietly accumulate plausible-sounding guesses with no way to choose between them.

Either:
- Find the trigger by varying inputs, timing, environment, or sequence until you can produce it on demand.
- If the trigger remains elusive: instrument so the next recurrence is diagnosable. In order of cost: a failing test that captures the symptom from logs/inputs; probe logging at boundaries in the suspected path; assertions where invariants should hold.

Acceptable Phase 0 exits:
- "I reproduce reliably by doing X."
- "I cannot yet reproduce. Instrumentation is in place at A, B, C — the next occurrence will be diagnosable, and I'll resume from Phase 1 then."

Don't proceed past Phase 0 without one of these. Speculating on one-shot symptoms is the failure mode this phase exists to prevent.

### 1. Understand

Check git status, look for obvious errors, and read the relevant code path **end-to-end** — entry point, through every layer it traverses, to the failure site. Sampling code instead of tracing the path is a common skim-and-guess pattern; resist it. Most bugs labelled "mysterious" are visible in code that wasn't read.

Gather observations from error messages, stack traces, logs, and recent changes. Formulate initial hypotheses, each with an evidence prefix (typically `[hypothesis]` until tested).

Present brief summary per reference/output-format.md.

### 2. Select Mode

AskUserQuestion:
Standard (default for simple bugs) — conversational step-by-step debugging
Agent Team — adversarial investigation with competing hypotheses

**Agent Team is REQUIRED, not advisory, when ANY of:**
- ≥ 3 hypotheses still alive
- Bug is not reliably reproducible (Phase 0 exited via instrumentation rather than a trigger)
- Evidence from different perspectives contradicts
- A prior debugging attempt has been made and failed
- The failing path crosses ≥ 2 components or layers (service ↔ service, UI ↔ worker, etc.)

Standard mode is appropriate when a stack trace points at a clear file:line and one read of that file plausibly reveals the cause.

### 3. Investigate

match (mode) {
Standard => {
present theories conversationally, let user guide direction
track every hypothesis in TodoWrite using the ledger from reference/hypothesis-hygiene.md
descend reference/investigation-ladder.md when reading code isn't enough
narrow down through targeted investigation; close hypotheses explicitly before moving on
}
Agent Team => {
spawn investigators per r

More from this repository

analyzeSkill

Deep-dive codebase analysis that explains how things actually work — business rules, architecture patterns, auth flows, data models, integrations, and performance hotspots. Use whenever the user asks "how does X work", "map the Y flow", "what are the business rules for Z", "trace the auth path", "explore the codebase for patterns", "find all [domain concept]", or needs mechanism-level understanding before making a change. Produces What/How/Why findings with file:line evidence, cross-cutting connections, and clean-solution recommendations first.

brainstormSkill

You MUST use this before any creative work — creating features, building components, adding functionality, or modifying behavior. Explores user intent, requirements, and design before implementation.

constitutionSkill

Create or update a project constitution with governance rules. Uses discovery-based approach to generate project-specific rules.

documentSkill

Generate and maintain documentation for code, APIs, and project components

implement-directSkill

Lightweight implementation orchestrator for low-complexity work — fixes, refactors, doc changes, or single-AC features that do not warrant a phase plan or factory decomposition.

implement-factorySkill

Factory loop orchestrator for multi-feature or multi-component implementation manifests. Use for high-complexity work with parallel-eligible workstreams and holdout-scenario evaluation.

implement-incrementalSkill

Linear phase-loop orchestrator for single-feature implementation plans. Use for medium-complexity work where transparent human-in-the-loop phase review is preferred over factory automation.

implementSkill

Implementation entry point. Use to execute a completed specification. Auto-detects the decomposition tier (Direct, Incremental, or Factory) from spec artifacts and dispatches to the matching execution sub-skill.