Install in Claude Code
Copygit clone --depth 1 https://github.com/TerminalSkills/skills /tmp/api-load-tester && cp -r /tmp/api-load-tester/skills/api-load-tester ~/.claude/skills/api-load-testerThen start a new Claude Code session; the skill loads automatically.
Definition
SKILL.md
# API Load Tester
## Overview
This skill generates realistic load test scripts from API definitions and executes them with proper ramp-up patterns, authentication flows, and assertions. It produces clear reports identifying breaking points, bottlenecks, and latency percentiles at each traffic level.
## Instructions
### Step 1: Choose Tool and Gather API Info
Prefer k6 for complex scenarios (multi-step flows, thresholds, custom metrics). Use wrk for quick single-endpoint benchmarks. Use autocannon if only Node.js is available.
Gather endpoint information from:
- OpenAPI/Swagger spec files
- Route definitions (Express, FastAPI, etc.)
- User-described endpoints
### Step 2: Generate Realistic Payloads
Read request/response types from the codebase (TypeScript interfaces, Python dataclasses, Go structs) and generate payloads with:
- Realistic field values (not "test123" or "foo")
- Proper data distributions (varied product IDs, realistic quantities)
- Edge cases mixed in (long strings, special characters at ~5% rate)
### Step 3: Design Test Scenarios
Create scenarios appropriate for the goal:
**Ramp-up test** (finding breaking point):
```
stages: [
{ duration: '2m', target: 50 }, // warm-up
{ duration: '5m', target: 200 }, // ramp
{ duration: '3m', target: 500 }, // push
{ duration: '2m', target: 500 }, // sustain
{ duration: '2m', target: 0 }, // cool-down
]
```
**Soak test** (finding memory leaks, connection exhaustion):
```
stages: [
{ duration: '5m', target: 100 }, // ramp
{ duration: '60m', target: 100 }, // sustain
{ duration: '5m', target: 0 }, // cool-down
]
```
**Spike test** (sudden traffic burst):
```
stages: [
{ duration: '2m', target: 50 }, // normal
{ duration: '30s', target: 500 }, // spike
{ duration: '5m', target: 500 }, // sustain spike
{ duration: '30s', target: 50 }, // drop back
]
```
### Step 4: Include Proper Assertions
Always add thresholds:
```javascript
thresholds: {
http_req_duration: ['p(95)<800', 'p(99)<2000'],
http_req_failed: ['rate<0.05'],
// Custom per-endpoint if needed
}
```
### Step 5: Execute and Report
Run the test and parse output into a structured report showing:
- Latency percentiles (p50, p95, p99) per stage
- Error rate per stage
- Requests per second achieved
- Breaking point (where thresholds first violated)
- Bottleneck hypothesis based on error patterns
## Examples
### Example 1: Quick Endpoint Benchmark
**User prompt:**
```
Benchmark our /api/search endpoint handling 100 concurrent connections
```
**Agent output:**
```bash
# Quick benchmark with k6
k6 run --vus 100 --duration 60s - <<'EOF'
import http from 'k6/http';
import { check } from 'k6';
export default function () {
const res = http.get('https://api.example.com/api/search?q=laptop&page=1');
check(res, { 'status 200': (r) => r.status === 200 });
}
EOF
```
```
Results — /api/search @ 100 VUs for 60s
Requests: 8,423 total (140.4 RPS)
Latency: p50=89ms p95=340ms p99=890ms
Errors: 12 (0.14%)
Data: 24.3 MB received
Verdict: Endpoint handles 140 RPS at 100 VUs. p99 approaching 1s suggests
database query optimization needed for sustained higher load.
```
### Example 2: Multi-Step Flow
**User prompt:**
```
Create a load test for our signup → login → create-project flow
```
**Agent generates a k6 script with:**
- Step 1: POST /api/auth/signup with randomized email/name
- Step 2: POST /api/auth/login to get JWT
- Step 3: POST /api/projects with auth header and realistic project data
- Custom metrics tracking each step's latency separately
- Sleep between steps to simulate real user behavior
## Guidelines
- **Never load test production without explicit confirmation** — always clarify the target environment
- **Start low, ramp gradually** — sudden jumps make it hard to identify the exact breaking point
- **Realistic think time** — add `sleep(1-3)` between requests to simulate real users; without it, you're testing throughput, not user concurrency
- **Authentication matters** — many bottlenecks only appear with real auth flows (token validation, session lookups)
- **Watch for connection reuse** — k6 reuses connections by default, which is realistic for browsers but not for serverless/mobile clients
- **Rate limit awareness** — if the API has rate limiting, note it in the report; it's not a performance bottleneck, it's intentional
- **Report infrastructure context** — always note the server specs, pod count, and database size alongside resultsMore from this repository
PULL_REQUEST_TEMPLATESkill
3dsmax-renderingSkill
>-
3dsmax-scriptingSkill
>-
3proxySkill
>-
a2a-protocolSkill
>-
ab-test-setupSkill
When the user wants to plan, design, or implement an A/B test or experiment. Also use when the user mentions "A/B test," "split test," "experiment," "test this change," "variant copy," "multivariate test," or "hypothesis." For tracking implementation, see analytics-tracking.
ablySkill
>-
accessibility-auditorSkill
>-