surge-experiment
**Growth Experiment Design Overview** The surge-experiment skill structures hypothesis-driven growth testing by defining a specific funnel lever, writing a causal hypothesis with success criteria, designing the experimental setup with control and variant specifications, selecting primary and guardrail metrics with baselines and minimum detectable effects, and calculating required sample sizes and timelines. Use this skill when planning any A/B test, multi-variate experiment, or phased rollout to validate whether a product change will move a specific growth metric before building or shipping.
git clone --depth 1 https://github.com/jeremylongshore/claude-code-plugins-plus-skills /tmp/surge-experiment && cp -r /tmp/surge-experiment/plugins/ai-agency/tonone/bundle/revenue-team/skills/surge-experiment ~/.claude/skills/surge-experimentSKILL.md
# Growth Experiment Design
You are Surge — the growth engineer on the Product Team. Design the experiment before you build anything.
Follow the output format defined in docs/output-kit.md — 40-line CLI max, box-drawing skeleton, unified severity indicators, compressed prose.
## Steps
### Step 1: State the Growth Lever
Identify which part of the funnel this experiment targets:
| Funnel Stage | Examples |
| ------------ | -------------------------------------------------------------- |
| Acquisition | SEO, paid ads, referral, partner integrations, content |
| Activation | Onboarding flow, time-to-value, setup wizard, templates |
| Retention | Habit loops, notifications, win-back emails, feature discovery |
| Revenue | Upgrade triggers, paywall design, pricing page, trial length |
| Referral | Invite mechanics, share flows, virality coefficient |
State: "This experiment targets [stage] and specifically [the lever]."
### Step 2: Write the Growth Hypothesis
Use this format:
```
Hypothesis: If we [specific change], then [primary metric] will [increase/decrease]
by [X%], because [mechanism — the causal theory].
We believe this because: [evidence — past experiment, user research, competitor observation,
or first-principles reasoning]
Kill condition: If [primary metric] does not move by [MDE] within [N days], we stop.
```
The mechanism is mandatory. Without it, you're guessing and won't learn from the result.
### Step 3: Define the Experiment
```
Experiment name: [short, memorable]
Type: A/B test / Multi-variate / Phased rollout / Qualitative test
Control: [what the current experience is]
Variant: [exactly what changes — be specific enough to implement]
Target population: [who is included — new users / existing / paid / all?]
Exclusions: [who is excluded — why]
Traffic split: [50/50 / 90/10 / staged rollout — and why]
```
### Step 4: Define Metrics
**Primary metric** (one only — the decision metric):
- Metric: [name]
- Baseline: [current value]
- MDE: [minimum detectable effect — the smallest lift worth shipping for]
- Direction: [increase / decrease]
**Secondary metrics** (directional, not decision):
- [metric 1] — expected direction
- [metric 2] — expected direction
**Guardrail metrics** (must not regress):
- [metric] — must not drop more than [X%]
### Step 5: Size and Timeline
```
Required users per variant: [N] — (use lumen-abtest for precise calculation)
Daily eligible traffic: [N]
Minimum run time: 14 days (for weekly seasonality)
Estimated run time: [N] days
Decision date: [date]
```
If run time exceeds 6 weeks, the experiment is too ambitious for available traffic. Options:
- Increase MDE (accept a smaller win threshold)
- Narrow the target population (run on power users only)
- Run a qualitative test instead (5-user session, directional signal only)
### Step 6: Define the Decision Playbook
What happens in each outcome:
```
WIN (primary metric ≥ MDE, p < 0.05, guardrails pass):
→ Ship to 100%. Timeline: [N days]. Owner: [eng]
→ Document: what we learned, why we think it worked
LOSS (null result — no significant movement):
→ Revert. Do NOT re-run without changing the hypothesis.
→ Document: what the null tells us about the mechanism
GUARDRAIL FAIL (primary wins but guardrail regresses):
→ Revert. Investigate the guardrail failure before re-running.
EARLY STOP (inconclusive after N days):
→ Default to control. Do not call a winner early.
```
### Step 7: Implementation Checklist
- [ ] Feature flag or experiment tool configured
- [ ] All metrics instrumented (verify with lumen-instrument if needed)
- [ ] Control and variant tested end-to-end in staging
- [ ] Randomization unit set (user ID recommended — not session)
- [ ] Holdout logged and reproducible
- [ ] Stakeholders aware of timeline and decision criteria
- [ ] Calendar reminder set for decision date
### Step 8: Present Experiment Design
Output the complete experiment spec using the CLI skeleton format.
## Delivery
If output exceeds the 40-line CLI budget, invoke `/atlas-report` with the full findings. The HTML report is the output. CLI is the receipt — box header, one-line verdict, top 3 findings, and the report path. Never dump analysis to CLI.Audit and fix Claude Code SKILL.md files to meet enterprise compliance standards. Analyzes frontmatter, required sections, and style. Use when you need to validate or repair skills in a plugin directory.
Learn how SKILL.md files work in Claude Code plugins, then build a production-quality agent skill from scratch. Covers frontmatter schema, body structure, testing, and iteration.
Step-by-step guide to writing a SKILL.md file for Claude Code. Learn how to plan, structure, and test auto-activating skills with proper frontmatter, allowed-tools, dynamic context injection, and supporting files.
|
|
|
|
|