
capacity-planning

The capacity-planning skill generates an infrastructure capacity document for a service, covering current baseline metrics, traffic and growth projections, resource utilization analysis, and a prioritized action roadmap. Use it when planning infrastructure scaling, forecasting resource constraints, modeling traffic growth scenarios, defining scaling thresholds, or running capacity reviews for services where understanding headroom and preventing constraint-driven incidents are critical to reliability.

View source Repository: pm-claude-skills

Install in Claude Code


git clone --depth 1 https://github.com/mohitagw15856/pm-claude-skills /tmp/capacity-planning && mkdir -p ~/.claude/skills && cp -r /tmp/capacity-planning/plugins/pm-engineering/skills/capacity-planning ~/.claude/skills/capacity-planning

Then start a new Claude Code session; the skill loads automatically.

Definition

SKILL.md

# Capacity Planning Skill

Produce a complete capacity planning document for a service. Capacity planning is not about predicting the future exactly — it is about understanding current headroom, modelling growth, and ensuring the team takes infrastructure action before a constraint becomes an incident.

A good capacity plan answers four questions: what will run out first, how long until it runs out, what it costs to fix, and who decides when to act.

## Required Inputs

Ask for these if not already provided:
- **Service name and description** — what the service does and who depends on it
- **Current traffic and usage metrics** — requests per second (or per day), active users, data volume — whatever units are most natural for this service
- **Current resource utilisation** — CPU %, memory %, disk usage, connection pool utilisation, DB query throughput
- **Growth rate or projections** — historical growth rate, or known upcoming events (product launch, sales cycle, seasonal peak)
- **Tech stack and infrastructure** — cloud provider, compute type (VMs, containers, serverless), database, caching layer, CDN
- **Cost constraints** — current infrastructure spend, acceptable cost ceiling, or target cost per unit of traffic

## Output Format

---

# Capacity Plan: [Service Name]

**Service:** [Name] | **Team:** [Team name]
**Author:** [Name] | **Last updated:** [Date]
**Planning horizon:** [12 months — [Month Year] to [Month Year]]
**Review cadence:** [Quarterly]

---

## 1. Executive Summary

[3–5 sentences covering: current state, the most critical capacity constraint, the timeline before it becomes a risk, the recommended action, and the cost implication. Written for an engineering manager or VP who needs the key facts without reading the full document.]

**Critical finding:** [e.g. "The database connection pool will reach 90% utilisation within 6 weeks at current growth. Without action, this will cause request queueing and latency spikes under normal traffic."]

**Recommended immediate action:** [e.g. "Increase connection pool limit and add a read replica within the next 2 weeks."]

**Estimated cost impact:** [e.g. "Recommended changes add ~$[X]/month to infrastructure spend."]

---

## 2. Current Baseline

*All metrics are 30-day averages unless noted. Date captured: [Date]*

### Traffic

| Metric | Value | Peak (7-day) | Notes |
|---|---|---|---|
| Requests per second (avg) | [X req/s] | [X req/s] | [Peak time / day of week] |
| Requests per day | [X M/day] | [X M/day] | — |
| Active users (DAU/MAU) | [X] / [X] | — | — |
| [Service-specific metric — e.g. jobs processed/hour] | [X] | [X] | — |
| [Service-specific metric — e.g. GB ingested/day] | [X GB] | [X GB] | — |

### Compute

| Resource | Current utilisation | Instance type | Count | Notes |
|---|---|---|---|---|
| CPU (avg) | [X%] | [e.g. c5.2xlarge] | [X] | Peak: [X%] |
| Memory (avg) | [X%] | — | — | Peak: [X%] |
| Network egress | [X Mbps] | — | — | — |
| Container / pod count | [X] | [e.g. 2 vCPU / 4 GB] | — | Auto-scaling range: [X–Y] |

### Database

| Resource | Current utilisation | Spec | Notes |
|---|---|---|---|
| CPU | [X%] | [e.g. db.r5.2xlarge] | Peak: [X%] |
| Memory | [X%] | [X GB RAM] | — |
| Storage used | [X GB] of [Y GB] ([Z%]) | [Y GB provisioned] | Growth: [~X GB/month] |
| IOPS (avg) | [X] of [Y provisioned] | [Y IOPS] | Peak: [X IOPS] |
| Connection pool | [X] of [Y max] ([Z%]) | Max connections: [Y] | [ORM pool size: X] |
| Query P99 latency | [X ms] | — | [Slowest query: X] |
| Read/write ratio | [X%] reads / [Y%] writes | — | — |

### Cache

| Resource | Current utilisation | Spec | Notes |
|---|---|---|---|
| Memory used | [X GB] of [Y GB] ([Z%]) | [e.g. cache.r6g.large] | Eviction rate: [X%] |
| Hit rate | [X%] | — | Miss rate: [Y%] |
| Connections | [X] | Max: [Y] | — |

### Storage / Object Store

| Resource | Current usage | Growth rate | Notes |
|---|---|---|---|
| [S3 / GCS / Blob] | [X GB / TB] | [~X GB/month] | [Lifecycle policies in place? Y/N] |
| Disk (if applicable) | [X GB] of [Y GB] | [~X GB/month] | [RAID / EBS type] |

### Cost Baseline

| Component | Current monthly cost | % of total |
|---|---|---|
| Compute (app servers) | $[X] | [X%] |
| Database | $[X] | [X%] |
| Cache | $[X] | [X%] |
| Storage | $[X] | [X%] |
| CDN / bandwidth | $[X] | [X%] |
| Other ([describe]) | $[X] | [X%] |
| **Total** | **$[X]** | 100% |

**Unit economics:** $[X] per [1,000 requests / 1,000 users / GB processed]
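
The "% of total" column and the unit economics line are simple arithmetic on the cost baseline. A minimal sketch, assuming a 30-day month and using hypothetical component names and dollar figures (none of these values come from a real service):

```python
def cost_shares(monthly_costs: dict[str, float]) -> dict[str, float]:
    """Return each component's share of total monthly spend, in percent."""
    total = sum(monthly_costs.values())
    if total <= 0:
        raise ValueError("total monthly cost must be positive")
    return {name: round(100 * cost / total, 1) for name, cost in monthly_costs.items()}


def cost_per_thousand_requests(
    total_monthly_cost: float,
    requests_per_day: float,
    days_in_month: int = 30,  # assumption: a flat 30-day month
) -> float:
    """Return dollars spent per 1,000 requests served."""
    monthly_requests = requests_per_day * days_in_month
    if monthly_requests <= 0:
        raise ValueError("monthly request volume must be positive")
    return total_monthly_cost / (monthly_requests / 1000)


# Hypothetical figures, for illustration only.
costs = {"compute": 6000.0, "database": 3000.0, "cache": 500.0, "storage": 500.0}
shares = cost_shares(costs)
unit_cost = cost_per_thousand_requests(sum(costs.values()), requests_per_day=10_000_000)
print(shares)
print(f"${unit_cost:.4f} per 1,000 requests")
```

The same function works for other units (per 1,000 users, per GB processed) by swapping the denominator.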

---

## 3. Growth Projections

### Assumptions

| Assumption | Value | Source | Confidence |
|---|---|---|---|
| Monthly traffic growth rate | [X%] | [Historical trend / product forecast] | [High / Medium / Low] |
| Seasonal peak factor | [+X% in [month(s)]] | [Last year's data / expected launch] | [High / Medium] |
| Upcoming events | [e.g. Marketing campaign — [Month], expected +[X]% traffic spike] | [Marketing plan] | [Medium] |
| User growth | [X new users/month] | [Sales pipeline / growth model] | [Medium] |
| Data growth | [X GB/month] | [Current trend] | [High] |

### Traffic Forecast

| Timeframe | Req/s (avg) | Req/s (peak) | DAU | Data volume (cumulative) |
|---|---|---|---|---|
| **Now** (baseline) | [X] | [X] | [X] | [X GB/TB] |
| **+3 months** | [X] | [X] | [X] | [X GB/TB] |
| **+6 months** | [X] | [X] | [X] | [X GB/TB] |
| **+12 months** | [X] | [X] | [X] | [X GB/TB] |

*Growth formula: [Baseline] × (1 + [monthly rate])^[months] + seasonal adjustment*
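
The growth formula can be sketched in a few lines. The function name, parameters, and sample values below are hypothetical. The sketch also assumes the seasonal adjustment is a percentage uplift added on top of the compounded value, which is one reading of the seasonal peak factor in the assumptions table:

```python
def project_traffic(
    baseline: float,
    monthly_rate: float,
    months: int,
    seasonal_uplift: float = 0.0,  # assumption: fraction of the compounded value, e.g. 0.2 for +20%
) -> float:
    """Baseline x (1 + monthly rate)^months, plus a seasonal adjustment."""
    compounded = baseline * (1 + monthly_rate) ** months
    seasonal_adjustment = compounded * seasonal_uplift
    return compounded + seasonal_adjustment


# Hypothetical example: 1,000 req/s today, growing 5% per month.
for horizon in (0, 3, 6, 12):
    print(f"+{horizon} months: {project_traffic(1000.0, 0.05, horizon):.1f} req/s")
```

Set `seasonal_uplift` only for the forecast rows that fall in a peak month; leaving it at zero gives the plain compounded trend.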

### Capacity Headroom Analysis

**When does each resource run out at current utilisation and projected growth?**
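
One way to estimate the months-to-ceiling figure is to invert the growth model. The sketch below makes two assumptions that may not hold for every resource: percentage utilisation (CPU, memory) scales in proportion to traffic that compounds monthly, and storage grows by a fixed amount per month. Function names and sample values are hypothetical.

```python
import math


def months_to_ceiling_compound(current_pct: float, ceiling_pct: float, monthly_rate: float) -> float:
    """Months until utilisation reaches the ceiling under compounding monthly growth."""
    if current_pct <= 0:
        raise ValueError("current utilisation must be positive")
    if current_pct >= ceiling_pct:
        return 0.0  # already at or past the safe ceiling
    if monthly_rate <= 0:
        return math.inf  # no growth: the ceiling is never reached
    # Solve current * (1 + rate)^t = ceiling for t.
    return math.log(ceiling_pct / current_pct) / math.log(1 + monthly_rate)


def months_to_ceiling_linear(used: float, ceiling: float, growth_per_month: float) -> float:
    """Months until usage reaches the ceiling under fixed monthly growth (e.g. GB/month)."""
    if used >= ceiling:
        return 0.0
    if growth_per_month <= 0:
        return math.inf
    return (ceiling - used) / growth_per_month


# Hypothetical examples: CPU at 45% with a 70% ceiling and 5% monthly growth;
# storage at 400 GB with an 800 GB ceiling (80% of 1,000 GB) growing 50 GB/month.
cpu_months = months_to_ceiling_compound(45, 70, 0.05)
storage_months = months_to_ceiling_linear(400, 800, 50)
print(f"CPU: {cpu_months:.1f} months, storage: {storage_months:.1f} months")
```

Treat the result as a planning estimate: utilisation rarely tracks traffic perfectly, so the lead time for provisioning should leave a margin.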

| Resource | Current utilisation | Safe ceiling | Headroom remaining | Months to ceiling |
|---|---|---|---|---|
| App CPU | [X%] | 70% | [X%] | [X months] |
| App memory | [X%] | 80% | [X%] | [X months] |
| DB CPU | [X%] | 70% | [X%] | [X months] |
| DB storage | [X GB] of [Y GB] | 80% = [Z GB] | [X GB] | [X months] |
| DB IOPS | [X] of [Y] | 80% = [Z] | [X IOPS] | [X months] |
| DB connections | [X] of [Y] | 80% = [Z] | [X] | [X month