benchmark-methodology
This Claude Code skill converts a ranked competitor set into standardized, evidence-based scores across nine weighted dimensions, each with explicit 1–5 rubrics. Use it after establishing the client's strategic positioning brief (defining their target white-space axes and core differentiator) and when ready to produce comparable profile cards that anchor competitive positioning arguments in defensible, consistent criteria rather than subjective ranking.
git clone --depth 1 https://github.com/affaan-m/ECC /tmp/benchmark-methodology && cp -r /tmp/benchmark-methodology/.agents/skills/benchmark-methodology ~/.claude/skills/benchmark-methodology
SKILL.md
# Benchmark Methodology
Use this skill to turn a scoped competitor set into **comparable, defensible
scores**. Each competitor is assessed on the same nine dimensions, with
explicit 1–5 rubrics, then captured in a uniform profile card. Consistency is
the point: scores are only useful if the same evidence would earn the same
number for any competitor.
## When to Activate
- A scoped, tiered competitor set from competitive-platform-analysis is ready to score.
- Need comparable, evidence-anchored scores across competitors — not gut-feel rankings.
- Client's strategic tension (the paired axes defining their target white-space) has been established.
- Preparing to produce profile cards for assembly in competitive-report-structure.
## Client positioning brief (establish first)
Before scoring, establish the client's positioning brief. It supplies:
- **Strategic tension** — the two axes (e.g., memorability × hireability) whose
intersection marks the client's target white-space. Dimension 9 is always
the client's named tension; report both poles separately, never averaged.
- **Differentiator** — what makes the client's moat. This informs which
dimensions matter most for the client's positioning argument.
- **Brand balance** — the intended mix of distinct strategic emphases. Strategic
recommendations must not break this balance without flagging it.
## Why these dimensions
The client competes on a **specific tension held across two poles**, not on
service breadth. The dimensions are weighted to reflect that moat. Two
dimensions — the tension poles — are scored **separately and never averaged
together**, because the client's strategic question is precisely whether a rival
achieves both simultaneously.
## The nine dimensions (with weights)
Weights guide synthesis emphasis, not a single blended score (avoid a false
composite — see Bias controls). Sum = 100%.
1. **Positioning clarity & distinctiveness** (18%) — Is the studio's position
sharp, ownable, and instantly legible? Or generic?
2. **Brand voice / verbal distinctiveness** (15%) — Does the copy have an
ownable register, or is it interchangeable agency-speak?
3. **Visual identity & site craft** (15%) — Quality and ownership of the visual
system; site as proof-of-craft.
4. **Service offer & packaging** (12%) — Productized and legible (named
sprints/audits) vs vague. Packaging maturity.
5. **Evidence & credibility** (12%) — Named clients, quantified outcomes,
case-study depth. Proof beyond assertion.
6. **Enterprise-readiness / commercial maturity** (10%) — Signals they can land
and hold SaaS/fintech/B2B/enterprise work (process, logos, scale, contracts).
7. **Thought leadership / content presence** (8%) — Owned POV: writing, talks,
newsletters, frameworks. Depth over volume.
8. **Pricing transparency & engagement model** (5%) — Is pricing/engagement
legible? Productized vs bespoke vs opaque.
9. **[Client's strategic tension]** (5% as a flag; **score BOTH poles,
report separately**) — Read the tension name and axis descriptions from the
client's positioning brief. Plot both; the gap is the insight. The client's
target quadrant is the single most important finding: who else is already
there?
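The weights above can be kept as plain data so the sum is checked rather than assumed. This is a minimal sketch: the dictionary keys and the `synthesis_order` helper are hypothetical names chosen for illustration, and only the percentages come from the list above. It deliberately stops short of computing a blended score.

```python
# Weights (in percent) copied from the nine dimensions listed above.
# The key names are illustrative assumptions, not part of the skill.
DIMENSION_WEIGHTS = {
    "positioning_clarity": 18,
    "brand_voice": 15,
    "visual_identity": 15,
    "service_offer": 12,
    "evidence_credibility": 12,
    "enterprise_readiness": 10,
    "thought_leadership": 8,
    "pricing_transparency": 5,
    "strategic_tension": 5,  # flag only; both poles are reported separately
}


def synthesis_order(weights):
    """Return dimension names from heaviest to lightest weight.

    Python's sort is stable, so tied weights keep their listed order.
    """
    return sorted(weights, key=lambda name: -weights[name])


# Guard against a typo silently breaking the 100% total.
assert sum(DIMENSION_WEIGHTS.values()) == 100
```

The ordering is meant only to guide which dimensions get the most space in the synthesis, in line with the note that weights guide emphasis rather than feed a composite.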
## Scoring rubric (1–5, applies to dimensions 1–8)
Anchor every score to observable evidence. Generic descriptors below; adapt the
specifics per dimension but keep the level meaning constant.
- **1 — Absent / generic.** No discernible position or craft; indistinguishable
from a template. Active liability.
- **2 — Below par.** Some intent but inconsistent, derivative, or unconvincing.
Wouldn't survive a side-by-side.
- **3 — Competent / table-stakes.** Solid, professional, unremarkable. Meets
expectation, ownable by nobody.
- **4 — Strong / distinctive.** Clearly above peers; a real strength a buyer
would notice and cite.
- **5 — Category-defining.** Best-in-class, ownable, hard to imitate. Sets the
bar others react to.
### Tension axes (dimension 9) — score each 1–5
Read the axis labels and their 1/3/5 anchors from the client's positioning
brief. Example anchors for a memorability × credibility tension:
- **Memorability** — 1: forgotten instantly · 3: recognizable in context ·
5: unforgettable, talked-about, distinctively owned.
- **Credibility** — 1: feels risky/amateur · 3: safe, competent,
unexciting · 5: enterprise-trusted, obvious safe choice.
Plot competitors on the tension 2×2. The client's target quadrant is named in
the positioning brief. Who else occupies that quadrant is the single most
important finding of the benchmark.
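One way to make the 2×2 concrete is to classify each competitor's pair of pole scores into a quadrant and list who shares the client's target quadrant. The sketch below rests on assumptions the skill does not state: a score of 4 or above counts as "high" (the `high_from` cut-off), and the function names and studio names are invented for illustration.

```python
def quadrant(pole_a: int, pole_b: int, high_from: int = 4) -> tuple:
    """Classify two 1-5 pole scores into a (pole_a, pole_b) quadrant.

    `high_from` is an assumed cut-off; take the real one from the
    client's positioning brief if it defines one.
    """
    for score in (pole_a, pole_b):
        if not 1 <= score <= 5:
            raise ValueError(f"pole scores must be 1-5, got {score}")
    side = lambda score: "high" if score >= high_from else "low"
    return side(pole_a), side(pole_b)


def rivals_in_quadrant(pole_scores: dict, target: tuple) -> list:
    """Return the competitors whose quadrant matches the target quadrant."""
    return [
        name
        for name, (pole_a, pole_b) in pole_scores.items()
        if quadrant(pole_a, pole_b) == target
    ]


# Made-up competitors scored as (memorability, credibility).
example_scores = {
    "Studio A": (5, 2),
    "Studio B": (3, 5),
    "Studio C": (4, 4),
}
print(rivals_in_quadrant(example_scores, ("high", "high")))
```

The two pole scores stay as a pair throughout and are never averaged, which is the property the tension dimension depends on.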
## How to collect the data
For each competitor, work through the sources in this order (cheapest signal first):
1. **Competitor's own site** — positioning, voice, offer packaging, pricing
posture, named clients, manifesto/POV. Screenshot the homepage + one case
study.
2. **Case studies / work** — evidence depth, quantified outcomes, client names.
Distinguish *asserted* ("we delivered X") from *proven* (metrics, named,
verifiable).
3. **Review directories** — corroborate clients, project size, engagement model
→ credibility & enterprise-readiness (e.g. Clutch.co or the niche equivalent).
4. **LinkedIn** — team size/model, founder narrative, content cadence →
thought leadership, model.
5. **Portfolio / craft platforms** — craft register (use the showcase native to
the niche: design boards, showreels, published samples, etc.).
6. **Content channels** — newsletter/talks/writing → thought-leadership depth.
**What to record per dimension:** the score, one-line justification, and the
source link/screenshot that earned it. No score without evidence.
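The "no score without evidence" rule can be enforced at the point of recording. The following is a rough sketch, assuming each entry is held as a small record; the `DimensionScore` class and its field names are hypothetical, and the example URL is a placeholder.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DimensionScore:
    """One scored dimension for one competitor (illustrative structure)."""

    dimension: str
    score: int
    justification: str  # one-line reason for the score
    source: str         # link or screenshot path that earned the score

    def __post_init__(self):
        # Scores follow the 1-5 rubric.
        if not 1 <= self.score <= 5:
            raise ValueError(f"{self.dimension}: score must be 1-5, got {self.score}")
        # A score needs both a justification and a source to be recorded.
        if not self.justification.strip():
            raise ValueError(f"{self.dimension}: missing justification")
        if not self.source.strip():
            raise ValueError(f"{self.dimension}: no score without evidence")


entry = DimensionScore(
    dimension="Evidence & credibility",
    score=4,
    justification="Named clients with quantified outcomes in case studies",
    source="https://example.com/case-study",  # placeholder URL
)
```

Rejecting an empty source at construction time means an unsupported score cannot reach a profile card in the first place.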
## Bias controls
- **No single composite score.** Report dimension scores and the tension plot
separately. A weighted average hides the asymmetry that matters.
- **Asserted vs proven.** Downgrade credibility/evidence scores for
self-reported claims with no corroboration. Site copy is marketing, no