Skill · 537 repo stars · updated 28d ago

incident-postmortem

This skill drafts a blameless postmortem after an incident is resolved, synthesizing timeline notes, alerts, and chat records into a structured markdown report. Use it once a customer-impacting incident, near-miss, or exercise has concluded to document impact, root cause, contributing factors, and action items with clear ownership and deadlines. Do not use it during active incident response.

Repository: LLM-Agents-Ecosystem-Handbook

Install in Claude Code

mkdir -p ~/.claude/skills && git clone --depth 1 https://github.com/oxbshw/LLM-Agents-Ecosystem-Handbook /tmp/incident-postmortem && cp -r /tmp/incident-postmortem/skills/catalog/incident-postmortem ~/.claude/skills/incident-postmortem

Then open a new Claude Code session; the skill loads automatically.

Definition

SKILL.md

# Incident Postmortem

## When to use
- After an incident with customer impact
- After a near-miss worth documenting
- After a drill or GameDay exercise

## When NOT to use
- Live incidents (use `incident-runbook` instead)
- Pure private learnings (use a personal log)

## Inputs
| Name | Type | Required | Notes |
|---|---|---|---|
| `timeline` | text or paths | yes | Source notes, alert links, chat extracts |
| `severity` | `S0`–`S3` | yes | Severity at peak |
| `incident_id` | string | yes | e.g. `INC-2026-04-12-001` |

## Outputs
`postmortem.md` with: Summary, Impact, Timeline, Direct cause, Contributing factors, What went well, What didn't, Action items.
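
As an illustration of the expected output shape, the sketch below renders an empty `postmortem.md` skeleton with the eight sections in the order listed. The title line format and the `_TODO_` placeholder are assumptions; the actual layout is defined by the template under `references/`, which is not reproduced here.

```python
# Section names copied from the Outputs list, in the same order.
SECTIONS = [
    "Summary", "Impact", "Timeline", "Direct cause",
    "Contributing factors", "What went well", "What didn't", "Action items",
]


def render_skeleton(incident_id: str, severity: str) -> str:
    """Build a markdown skeleton; the title format is a hypothetical choice."""
    lines = [f"# Postmortem: {incident_id} ({severity})", ""]
    for title in SECTIONS:
        lines += [f"## {title}", "", "_TODO_", ""]
    return "\n".join(lines)
```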

## Workflow
1. Reconstruct the timeline (UTC), one row per notable event
2. State **impact** in user-facing terms (requests failed, dollars lost, customers affected)
3. Identify the **direct cause** in one sentence — what specifically broke
4. List **contributing factors** (the conditions that made the incident possible / hard to mitigate)
5. Separate **what went well** (genuine wins, not platitudes) and **what didn't**
6. Assign an owner and a due date to every action item; treat a missing owner or date as a hard failure
7. Keep the tone blameless: describe systems failing, not people
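
Step 1 of the workflow can be sketched as follows, assuming the source notes carry ISO 8601 timestamps with an explicit UTC offset. `to_utc_rows` and `render_timeline` are hypothetical helper names, and the two-column table layout is an assumption rather than the skill's prescribed format.

```python
from datetime import datetime, timezone


def to_utc_rows(events):
    """Convert (iso_timestamp_with_offset, description) pairs to sorted UTC rows."""
    rows = []
    for stamp, description in events:
        moment = datetime.fromisoformat(stamp)
        if moment.tzinfo is None:
            # Refuse to guess a zone: a naive time cannot be placed on a UTC timeline.
            raise ValueError(f"timestamp has no UTC offset: {stamp}")
        rows.append((moment.astimezone(timezone.utc), description))
    rows.sort(key=lambda row: row[0])
    return rows


def render_timeline(rows):
    """Render one markdown table row per notable event."""
    lines = ["| Time (UTC) | Event |", "|---|---|"]
    for moment, description in rows:
        lines.append(f"| {moment:%Y-%m-%d %H:%M} | {description} |")
    return "\n".join(lines)
```

Rejecting timestamps without an offset, instead of assuming a local zone, keeps the timeline from silently shifting events by several hours.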

## References
- [`references/postmortem-template.md`](references/postmortem-template.md)

## Success criteria
- Every action item has an owner and a date
- Direct cause is one sentence
- No names attached to fault; only roles / systems
- Timeline times in UTC
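
Two of these criteria are mechanical enough to check in code. The sketch below is one possible approach, with hypothetical names (`ActionItem`, `check_postmortem`); the sentence count is a rough punctuation heuristic, not a real sentence parser.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class ActionItem:
    title: str
    owner: Optional[str] = None
    due: Optional[str] = None   # ISO date string, e.g. "2026-05-01"


def check_postmortem(direct_cause: str, action_items: list[ActionItem]) -> list[str]:
    """Return the list of violated criteria; an empty list means both checks pass."""
    violations = []
    # Heuristic: split after ., ! or ? when followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", direct_cause.strip()) if s]
    if len(sentences) != 1:
        violations.append("direct cause must be exactly one sentence")
    for item in action_items:
        if not item.owner:
            violations.append(f"action item '{item.title}' has no owner")
        if not item.due:
            violations.append(f"action item '{item.title}' has no date")
    return violations
```

A caller that wants the hard-fail behavior can raise an exception whenever the returned list is non-empty.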

## Failure modes
- Missing source data → list gaps explicitly; don't fill with guesses
- Conflicting timelines → record both, flag for review
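
Both failure modes can be handled in one reconciliation pass. This is a sketch under stated assumptions: each source is a mapping from event name to a UTC time string, `expected_events` is a caller-supplied list of events the report should cover, and `reconcile` and the status labels are hypothetical.

```python
from collections import defaultdict


def reconcile(sources: dict[str, dict[str, str]], expected_events: list[str]):
    """Return (event, time_text, status) rows without guessing missing data."""
    seen = defaultdict(dict)  # event -> {source: time}
    for source, events in sources.items():
        for event, time in events.items():
            seen[event][source] = time

    # Expected events first, then anything the sources mention beyond them.
    order = list(expected_events) + [e for e in seen if e not in expected_events]
    rows = []
    for event in order:
        reports = seen.get(event, {})
        distinct = set(reports.values())
        if not reports:
            rows.append((event, "UNKNOWN (no source data)", "gap"))
        elif len(distinct) == 1:
            rows.append((event, distinct.pop(), "ok"))
        else:
            # Record every conflicting value with its source and flag for review.
            both = "; ".join(f"{t} per {s}" for s, t in sorted(reports.items()))
            rows.append((event, both, "needs review"))
    return rows
```

Keeping the conflicting values side by side, each attributed to its source, leaves the decision to a human reviewer instead of picking one silently.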

From the same repository

adr-writer

Use when capturing an architecture decision so it survives turnover — produces an ADR-NNNN.md from context, options considered, and the chosen path.

api-design-reviewer

Use when reviewing a proposed REST or GraphQL API change before merge — checks contract clarity, backwards compatibility, errors, pagination, auth, and naming.

dataset-profiler

Use when first encountering a new dataset — produces a structured profile (schema, missingness, distributions, outliers, gotchas) before any analysis.

pr-summarizer

Use when opening a PR — produces a clean PR description (what / why / how to verify / risks) from a branch diff against base.

sprint-planner

Use when planning the next sprint — turns ticket intake + team capacity into a planned sprint with explicit non-goals.

agent-memory-curator

Use after a session to promote useful episodic notes from logs/episodic/ into distilled, dated entries in MEMORY.md and memory/semantic/.

mcp-security-reviewer

Use before connecting a new MCP server to your agent — produces a structured security review covering source, permissions, tools, network, and approvals.