after-action-report
This Claude Code skill conducts structured after-action reviews on incidents, launches, and completed projects using a blameless analysis framework. Use it after any incident, following a major launch, at project closeout, or when recurring issues warrant investigation. The skill produces a six-section report covering summary, timeline, root cause analysis, contributing factors, what went well, and action items, focused on systems rather than individual blame.
git clone --depth 1 https://github.com/rampstackco/claude-skills /tmp/after-action-report && cp -r /tmp/after-action-report/dist/pi/.agents/skills/after-action-report ~/.claude/skills/after-action-report

SKILL.md
# After-Action Report

Run a structured retrospective on a launch, incident, or completed project. Produce actionable lessons, not just a document.

This skill is for after-the-fact analysis. For active incident response, use `incident-response`. For planning launches, use `launch-runbook`.

---

## When to use

- After any incident (any severity)
- After every major launch
- At the end of a project (sprint retro, quarterly retro, project closeout)
- When a recurring issue has happened enough times to demand investigation
- When a decision didn't work out and the team wants to learn

## When NOT to use

- During an active incident (use `incident-response`)
- For pre-launch planning (use `launch-runbook`)
- For one-off bug fixes that don't merit broad analysis

---

## Required inputs

- The event being analyzed (incident, launch, project)
- A timeline reconstructed from logs, chat, tickets
- Participant accounts of what they observed and did
- Outcomes and impact (what actually happened to users, the business)

---

## The framework: blameless analysis

The most important principle: blameless. Without it, retrospectives produce hidden information and theatrical lessons rather than real ones.

### What blameless means

- Focus on systems, not individuals
- Assume everyone made reasonable decisions given what they knew at the time
- The question is "why was this decision reasonable to make?" not "who screwed up?"
- Fixing the system means the next person in that situation succeeds where this person didn't

### What blameless does not mean

- No accountability (action items still have owners)
- No hard truths (sometimes the system is broken in obvious ways)
- No standards (some patterns of failure are individual, not systemic)
- No discomfort (real reflection is uncomfortable)

---

## The framework: 6 sections

A complete AAR covers six sections.

### 1. Summary

A 2 to 3 paragraph overview.
Captures:

- What happened
- Impact (users, business, time)
- Root cause (in plain language)
- Top action items

This is what executives read. Anyone who reads only this section should leave with the most important information.

### 2. Timeline

A reconstructed timeline of events.

For incidents:

- T-0: Detection
- T+X: Acknowledgment
- T+Y: Severity assessed, IC assigned
- T+Z: Investigation began
- ... mitigation, communication, resolution events
- T+N: Resolution declared

For launches:

- Pre-launch decisions and milestones
- Launch day events
- Post-launch monitoring observations

For projects:

- Major milestones, decisions, pivots
- Both planned and emergent

The timeline is the source of truth. Disagreements about what happened get resolved here.

### 3. Root cause analysis

What caused this, in plain language. Use one or both of:

**Five whys.** Start with the surface symptom. Ask "why?" Repeat 5 times (or until you reach a true root). Each "why" should yield a substantive answer, not a tautology.

Example:

- Why did the site go down? Database connection pool exhausted.
- Why was the pool exhausted? Background job opened too many connections.
- Why did the background job open too many connections? Connection cleanup code didn't run on errors.
- Why didn't cleanup run on errors? Original code review didn't cover error paths.
- Why didn't the review cover error paths? No checklist for error handling in our review process.

The fifth why often reveals the system fix. In this case: improve the review process.

**Causal chain.** Multiple contributing factors that combined.

- Factor 1: Background job opened too many connections (technical)
- Factor 2: Connection limit was set too low for actual traffic (configuration)
- Factor 3: No alert on connection pool saturation (monitoring)
- Factor 4: Recent traffic doubled without infra capacity review (process)

No single fix addresses the incident. Multiple gaps need attention.

### 4. Contributing factors

Factors that didn't cause the event but made it worse, or removed safety nets that would have caught it.

- Monitoring gaps
- Documentation gaps
- Process gaps
- Tooling gaps
- Knowledge gaps

Each of these is a "would have been caught earlier if..." factor.

### 5. What went well

Real lessons require capturing successes, not just failures.

- What detection worked?
- What response worked?
- What decisions were good?
- What tools or processes performed as expected?

This is not consolation. It's calibration. Things that worked here should be reinforced and replicated.

### 6. Action items

Specific, owned, dated.

| Action | Owner | Due | Type |
|---|---|---|---|
| Add alert on connection pool saturation | [name] | [date] | Monitoring |
| Add error handling checklist to PR template | [name] | [date] | Process |
| Audit other background jobs for similar issue | [name] | [date] | Code |

**Action item criteria:**

- **Specific.** "Improve monitoring" is not actionable. "Add alert on connection pool saturation, threshold 80%, page on-call" is.
- **Owned.** A name. Not "the team."
- **Dated.** A real date. Not "soon."
- **Sized.** Roughly hours, days, or weeks of effort.
- **Closeable.** Definition of done is clear.

Action items that don't close in their committed timeframe should re-surface in the next AAR. Patterns of unclosed actions point to deeper organizational issues.

---

## Workflow

### 1. Schedule the AAR

Within 1 to 2 weeks of the event. Long enough that emotions have cooled and facts have been gathered. Short enough that memories are fresh.

For incidents: pre-decided in the response procedure. For launches: schedule on the runbook. For projects: schedule at project closeout.

### 2. Gather inputs

Before the meeting:

- Reconstructed timeline (often the scribe's notes if there was one)
- Logs, chat transcripts, tickets, incident updates
- Individual accounts from each participant (written, before the meeting)
- Impact data (users affected, duration, revenue impact, etc.)

### 3. Run the meeting

Typical agenda (60 to 90 minutes):

- Read the summary as drafted (5 min)
Run a comprehensive WCAG accessibility audit covering perceivable, operable, understandable, and robust principles. Use this skill whenever the user wants to audit accessibility, review WCAG compliance, fix accessibility issues, prepare for accessibility certification, address an accessibility lawsuit risk, or systematically improve a site's accessibility. Triggers on accessibility audit, WCAG audit, a11y audit, accessibility compliance, ADA compliance, screen reader test, keyboard navigation, accessibility report, fix accessibility, axe scan. Also triggers when accessibility issues have been reported and need systematic remediation.
How to produce ad creative that converts at performance scale. Hook patterns, format selection, video pacing, variation systems, sequential testing methodology, fatigue detection, brand-voice alignment without conversion dilution, and platform-specific creative norms. Triggers on ad creative, ad design, hook patterns, ad video pacing, creative testing, ad variations, creative refresh, creative fatigue, refresh ad creative, video ads for Meta, TikTok creative, LinkedIn ad creative, ad asset library. Also triggers when a team is producing creative at scale, planning a creative test cycle, or auditing why creative is not converting.
How to read paid media dashboards without fooling yourself. Attribution models, platform reporting quirks, multi-platform reconciliation, ROAS vs LTV horizon traps, statistical noise in performance metrics, incrementality testing, and the failure modes that produce expensive lessons. Triggers on read paid media dashboard, attribution analysis, ROAS vs LTV, multi-platform reconciliation, ad incrementality, geo holdout, conversion lift study, ghost bidding, paid media reporting, board-deck paid media metrics, blended CAC, MMM, MTA, last-click attribution. Also triggers when a marketer is about to scale, kill, or rebudget a campaign based on platform metrics, or when reconciling platform reports against warehouse revenue.
How humans and AI compose in content workflows. Where AI legitimately participates, where humans must own, hybrid workflow patterns, voice ownership preservation, the AI slop problem, disclosure and transparency, team calibration, and the ethics of intellectually honest AI-assisted content production. Triggers on AI content workflow, AI-assisted writing, hybrid content production, AI in editorial, AI slop, AI disclosure, AI usage policy, AI content ethics, voice preservation with AI, team AI calibration. Also triggers when content feels generic despite quality tools, when team AI usage has drifted into inconsistency, or when a regulated or trust-sensitive context requires explicit AI policy.
Design measurement frameworks including event taxonomy, KPI hierarchy, dashboard architecture, attribution models, and analytics implementation strategy. Use this skill whenever the user wants to plan analytics, design dashboards, build event taxonomies, define KPIs, set up tracking, or audit existing measurement. Triggers on analytics strategy, measurement plan, event taxonomy, tracking plan, KPI framework, dashboard design, north star metric, attribution model, conversion tracking, GA4 setup, Mixpanel setup, analytics audit. Also triggers when the user has data but no clear way to use it, or wants to make decisions but doesn't know what to track.
Direct visual and creative work for campaigns, photography, illustration, video, and branded experiences. Use this skill whenever the user wants to brief a photographer, direct illustrators, plan a creative campaign, develop visual concepts, write a creative direction document, or evaluate creative work for fit. Triggers on art direction, photo brief, photography brief, illustration brief, campaign concept, creative concept, visual direction, mood board, look and feel, visual treatment, video direction. Also triggers when the user has approved brand identity but needs to extend it into specific creative deliverables.
Plan and run backups, set recovery objectives, and run disaster recovery drills. Use this skill when defining RPO/RTO targets, designing backup architecture, deciding what to back up and how often, planning for full-region or platform outages, or running a restoration drill. Triggers on backup, restore, RPO, RTO, disaster recovery, DR, business continuity, what if the database is gone, what if our hosting goes down, recovery drill, ransomware planning. Also triggers when an incident reveals a gap in restoration capability.
Running closed and open betas that produce real signal. Beta participant selection, structured feedback collection, beta-to-GA decision criteria, and the difference between soft-launch (no structure, no signal), kitchen-sink (everyone in, no actionable feedback), and structured beta (calibrated cohort, intentional feedback loops, clear graduation criteria). Triggers on beta program, alpha test, beta cohort, beta participant, beta feedback, beta to GA decision, design partner, early access program, closed beta, open beta, RC release. Also triggers when a feature is approaching launch and the team needs structured pre-GA validation, when prior betas produced noise rather than signal, or when the team has soft-launched before but wants more structured feedback this time.