diagnose

Diagnostic pipeline for complex/intermittent bugs. Uses diagnostics roles for Investigation, Verification, and Solution before Lead Programmer handoff. Use ONLY for non-obvious failures (root cause unclear, reproduction unstable, fixes reverted). NOT for trivial bugs with known cause — fix them directly.

Repository: software_development_department

Install in Claude Code

git clone --depth 1 https://github.com/tranhieutt/software_development_department /tmp/diagnose && cp -r /tmp/diagnose/.claude/skills/diagnose ~/.claude/skills/diagnose

Then start a new Claude Code session; the skill loads automatically.

Definition

SKILL.md

# Skill: /diagnose — Complex Bug Diagnostic Pipeline

## When to invoke (and when NOT to)

### Use `/diagnose` when:
- Bug reproduces but **root cause is unclear** after one read-pass of the failing code
- Previous fix attempts have been **reverted ≥ 2 times** (symptoms return)
- Failure is **intermittent** (flaky test, race condition, timing-dependent)
- Failure occurs in **unfamiliar code** (agent has no prior context)
- User has explicitly requested `/diagnose` or "deep investigation"
- Circuit Breaker (Rule 14) tripped on the specialist agent that normally handles this domain

### Do NOT use `/diagnose` when:
- Cause is obvious (null ref, typo, missing import, incorrect import path)
- Fix is < 10 LOC and has a clear success check
- Bug is in code you just wrote this session (read-pass + local reasoning is faster)
- User wants a quick patch and has accepted the tradeoff
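
The two lists above can be read as a small decision function. The sketch below is one possible encoding, not part of the skill itself: the `BugSignals` shape and its field names are hypothetical, and the precedence (an explicit request or a tripped circuit breaker overrides the exclusions) is an assumption, since the lists do not say which side wins when both apply.

```typescript
// Hypothetical signal shape; field names are illustrative only.
interface BugSignals {
  causeObvious: boolean;                  // null ref, typo, missing or incorrect import
  estimatedFixLoc: number;                // rough size of the fix in lines of code
  hasClearSuccessCheck: boolean;
  writtenThisSession: boolean;            // bug is in code written in the current session
  userAcceptsQuickPatch: boolean;
  rootCauseUnclearAfterReadPass: boolean;
  revertedFixCount: number;               // how many earlier fixes were reverted
  intermittent: boolean;
  unfamiliarCode: boolean;
  userRequestedDiagnose: boolean;
  circuitBreakerTripped: boolean;         // Rule 14
}

function shouldDiagnose(s: BugSignals): boolean {
  // Assumption: these two triggers override every exclusion.
  if (s.userRequestedDiagnose || s.circuitBreakerTripped) return true;

  // "Do NOT use" list: any match means fix the bug directly.
  const excluded =
    s.causeObvious ||
    (s.estimatedFixLoc < 10 && s.hasClearSuccessCheck) ||
    s.writtenThisSession ||
    s.userAcceptsQuickPatch;
  if (excluded) return false;

  // "Use" list: any remaining trigger is enough.
  return (
    s.rootCauseUnclearAfterReadPass ||
    s.revertedFixCount >= 2 ||
    s.intermittent ||
    s.unfamiliarCode
  );
}
```

Checking the exclusions before the remaining triggers keeps trivial bugs out of the pipeline even when they happen to sit in unfamiliar code.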

## Pipeline overview

```
Feedback Loop        -> Investigation       -> Verification          -> Solution           -> Lead Programmer
(signal)                (hypothesis)           (devil's adv.)           (tradeoffs)           (assign + exec)

repro/check command     investigation.json     verification.json        solution.json         implementation
(fast deterministic     (root_cause,           (status: confirmed |     (3 options:           (delegates to
 pass/fail signal)       evidence[],            refuted | inconclusive,  Quick/Strategic/      backend-developer,
                         confidence)            reproduction_steps)      Future-Proof)         qa-engineer, etc.)
```

Each stage produces a **required artifact** saved to `.investigations/<task_id>/` and a **handoff contract** (per Rule 16) to the next agent.
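
As a rough illustration of the artifact layout and handoff order described above, the sketch below maps a stage to its file under `.investigations/<task_id>/` and to the next recipient. The `Stage` type and both helper names are hypothetical, and the contents of the Rule 16 handoff contract are not modeled.

```typescript
// Stages that write a JSON artifact, in pipeline order.
type Stage = "investigation" | "verification" | "solution";

const STAGE_ORDER: readonly Stage[] = ["investigation", "verification", "solution"];

// Path of the required artifact for a stage, e.g. ".investigations/BUG-417/investigation.json".
function artifactPath(taskId: string, stage: Stage): string {
  return `.investigations/${taskId}/${stage}.json`;
}

// Who receives the handoff after a stage: the next stage, or the Lead Programmer at the end.
function nextHandoff(stage: Stage): Stage | "lead-programmer" {
  const next = STAGE_ORDER[STAGE_ORDER.indexOf(stage) + 1];
  return next ?? "lead-programmer";
}
```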

## Stage 0 — Feedback Loop

**Goal:** Build the fastest reliable pass/fail signal for the exact symptom
before explaining the cause.

The feedback loop is the highest-leverage part of diagnosis. Do not proceed to
root-cause analysis until there is a loop that can reproduce the user's symptom
or a documented reason why no loop is possible.

Try these in roughly this order:

1. Failing test at the seam that reaches the bug.
2. CLI or script invocation with fixture input and expected output.
3. Curl/HTTP request against a running service.
4. Headless browser script with DOM, console, or network assertions.
5. Replay of a captured payload, event, HAR, log, or trace.
6. Throwaway harness that calls the affected code path in isolation.
7. Property/fuzz loop for broad wrong-output symptoms.
8. Bisection or differential loop between known-good and known-bad states.
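
Option 8 can be sketched as a binary search over an ordered list of revisions. This is a minimal illustration under stated assumptions: index 0 is known good, the last index is known bad, and once a revision is bad every later one is bad too. `firstBadIndex` and the `isBad` callback are hypothetical names; in practice `isBad` would check out the revision and run the Stage 0 check.

```typescript
// Returns the index of the first bad revision among `count` ordered revisions.
// Assumes index 0 is good, index count - 1 is bad, and badness is monotonic.
function firstBadIndex(count: number, isBad: (index: number) => boolean): number {
  if (count < 2) throw new Error("need at least one good and one bad revision");
  let good = 0;          // highest index known to pass
  let bad = count - 1;   // lowest index known to fail
  while (bad - good > 1) {
    const mid = good + Math.floor((bad - good) / 2);
    if (isBad(mid)) bad = mid;
    else good = mid;
  }
  return bad;
}

// Usage with a stand-in check: revisions from index 3 onward fail.
const exampleRevisions = ["a1", "b2", "c3", "d4", "e5", "f6"];
const probedIndices: number[] = [];
const culpritIndex = firstBadIndex(exampleRevisions.length, (i) => {
  probedIndices.push(i);
  return i >= 3;
});
// culpritIndex === 3 after probing indices 2 and 3
```

In a Git repository, `git bisect run <command>` automates the same search using the command's exit status as the pass/fail signal.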

Improve the loop before investigating:

- Faster: remove unrelated setup and narrow the command.
- Sharper: assert the specific symptom, not merely "does not crash".
- More deterministic: pin time, seed randomness, isolate filesystem/network, or
  raise intermittent reproduction frequency with stress runs.
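
To illustrate the determinism and stress-run advice above, the sketch below repeats a check many times, gives each run its own seeded random source, and records which seeds fail so that a failing run can be replayed in isolation. `seededRandom` uses the small public-domain mulberry32 generator; `stressRun` and `StressResult` are hypothetical names, and the check passed in is assumed to take its randomness only from the generator it is given.

```typescript
// mulberry32: a small deterministic PRNG returning floats in [0, 1).
function seededRandom(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

interface StressResult {
  runs: number;
  failures: number;
  failureRate: number;     // failures / runs, or 0 when runs is 0
  failingSeeds: number[];  // replay any of these to reproduce one failure
}

// Runs `check` once per seed (baseSeed, baseSeed + 1, ...) and tallies failures.
function stressRun(
  check: (rand: () => number) => boolean,
  runs: number,
  baseSeed: number,
): StressResult {
  const failingSeeds: number[] = [];
  for (let i = 0; i < runs; i++) {
    const seed = baseSeed + i;
    if (!check(seededRandom(seed))) failingSeeds.push(seed);
  }
  const failures = failingSeeds.length;
  return { runs, failures, failureRate: runs > 0 ? failures / runs : 0, failingSeeds };
}

// Stand-in flaky check: fails whenever its first random draw is below 0.3.
const flakyCheck = (rand: () => number): boolean => rand() >= 0.3;
const stressReport = stressRun(flakyCheck, 200, 42);
```

Because each run is keyed by a seed, rerunning `flakyCheck(seededRandom(seed))` for any entry in `stressReport.failingSeeds` fails again, which turns an intermittent symptom into a deterministic one.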

Do not treat Stage 0 as a warm-up: a bad loop produces fake certainty, weak
hypotheses, and symptom-only fixes.

If no credible loop can be built, stop and report what was tried. Ask for access
to the reproducing environment, a captured artifact, or permission to add
temporary instrumentation. Do not proceed on a vibe.

## Stage 1 — Investigation

**Agent:** `diagnostics` (Investigation role)
**Goal:** Produce ranked falsifiable root-cause hypotheses backed by empirical evidence.

### Inputs
- Symptom description (from user or TODO.md bug ID)
- Reproduction steps (or "cannot reproduce" + environment)
- Relevant log lines, stack traces, error IDs
- Feedback loop command/check from Stage 0, or a documented reason no loop can
  currently be built

### Required output — `investigation.json`
```json
{
  "task_id": "BUG-417",
  "symptom": "POST /api/orders returns 500 when cart has ≥10 items",
  "reproduction": {
    "steps": ["...", "..."],
    "frequency": "100% | intermittent (~30%) | once",
    "environment": "staging-eu-west-1"
  },
  "feedback_loop": {
    "command": "npm test -- checkout.e2e.test.ts",
    "signal": "POST /api/orders returns 500 for a 10-item cart while a 9-item cart succeeds",
    "reliable": true
  },
  "ranked_hypotheses": [
    {
      "rank": 1,
      "cause": "OrderService.calculateTotal() runs one query per cart item (N+1) and exhausts the connection pool",
      "prediction": "With pool_size=50 the 500 disappears for 10-item carts"
    },
    {
      "rank": 2,
      "cause": "Request body crosses a size or validation limit once the cart reaches 10 items",
      "prediction": "A 9-item cart padded to the same payload size also returns 500"
    },
    {
      "rank": 3,
      "cause": "Slow queries in eu-west-1 hold connections longer, so the pool drains only in that region",
      "prediction": "The same 10-item request succeeds in another region with an identical pool size"
    }
  ],
  "hypothesis": {
    "root_cause": "OrderService.calculateTotal() N+1 query exhausts pool when cart.items.length > 9",
    "confidence": "high | medium | low",
    "falsifiable_by": "Run with pool_size=50; if error disappears, cause confirmed"
  },
  "evidence": [
    {"type": "log", "ref": ".investigations/BUG-417/pg-pool-exhausted.log", "summary": "..."},
    {"type": "code", "ref": "src/services/order.service.ts:142", "summary": "Unbounded .map+await"}
  ],
  "unknowns": ["Why only eu-west-1?", "When did this start?"],
  "next_agent": "diagnostics",
  "next_stage": "verification"
}
```

### Quality gate (Lead Programmer rejects if):
- `feedback_loop` is missing and no blocked-loop explanation exists
- `feedback_loop.signal` is vague, broad, or does not isolate the user's symptom
- `ranked_hypotheses` has fewer than 3 items unless the evidence makes a single
  cause unavoidable
- Any hypothesis lacks a falsifiable prediction
- `hypothesis.falsifiable_by` is vague ("check if it works")
- `evidence` has fewer than 2 items (unverifiable)
- `unknowns` is empty but `hypothesis.confidence` is `low` (contradictory)
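
The mechanical parts of this gate can be sketched as a validator over the `investigation.json` shape shown earlier. This is an illustration under assumptions, not the Lead Programmer's actual check: `blocked_loop_reason` is a hypothetical field standing in for the "blocked-loop explanation", the `singleCauseUnavoidable` flag is a hypothetical way to express the exception for fewer than 3 hypotheses, and the two vagueness criteria are left to human judgment.

```typescript
type Confidence = "high" | "medium" | "low";

interface Investigation {
  feedback_loop?: { command: string; signal: string; reliable: boolean };
  blocked_loop_reason?: string; // hypothetical field, not in the schema above
  ranked_hypotheses: { rank: number; cause: string; prediction?: string }[];
  hypothesis: { root_cause: string; confidence: Confidence; falsifiable_by: string };
  evidence: { type: string; ref: string; summary: string }[];
  unknowns: string[];
}

// Returns a list of rejection reasons; an empty list means the mechanical checks pass.
function qualityGate(inv: Investigation, singleCauseUnavoidable = false): string[] {
  const problems: string[] = [];
  if (!inv.feedback_loop && !inv.blocked_loop_reason) {
    problems.push("feedback_loop is missing and no blocked-loop explanation exists");
  }
  if (inv.ranked_hypotheses.length < 3 && !singleCauseUnavoidable) {
    problems.push("ranked_hypotheses has fewer than 3 items");
  }
  if (inv.ranked_hypotheses.some((h) => !h.prediction || h.prediction.trim() === "")) {
    problems.push("a hypothesis lacks a falsifiable prediction");
  }
  if (inv.evidence.length < 2) {
    problems.push("evidence has fewer than 2 items");
  }
  if (inv.unknowns.length === 0 && inv.hypothesis.confidence === "low") {
    problems.push("unknowns is empty but confidence is low");
  }
  return problems;
}
```

Returning every failed criterion at once, rather than stopping at the first, lets the Investigation role fix the artifact in a single pass.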

## Stage 2 — Verification

**Agent:** `diagnostics` (Verification role)
**Goal:** Attempt to **refute** the hypothesis. It counts as confirmed only if the refutation attempt fails.

### Inputs
- `investigation.json` (from Stage 1)
- Access to staging/test environment
- The Stage 0 feedback loop, rerun before and after each meaningful probe

Verification i