seo-cluster
The seo-cluster skill groups search keywords by their actual Google search results overlap rather than text similarity, then designs hub-and-spoke content architectures with internal linking strategies. Use it to plan SEO content clusters around seed topics, execute multi-page content creation workflows, or regenerate interactive visualizations of how keywords should be organized for search visibility.
git clone --depth 1 https://github.com/AgriciDaniel/codex-seo /tmp/seo-cluster && cp -r /tmp/seo-cluster/skills/seo-cluster ~/.claude/skills/seo-clusterSKILL.md
# Semantic Topic Clustering (v1.9.0)
## Shared Data Cache
**Step 0 -- Check shared data cache:**
Before gathering, check `.seo-cache/` for reusable context from related SEO skills.
Reference: `../seo/references/shared-data-cache.md` for schemas and dependency map.
Check these cache files when present:
- `.seo-cache/site-meta.json` for domain, business type, industry, and crawl context
- `.seo-cache/audit-scores.json` for prior full-audit priorities
- `.seo-cache/pages/{url-slug}/page-analysis.json` for page-level context when a URL is provided
- If found: parse and use clearly valid fields (note "Using cached [X] from [date]")
- If missing, corrupt, or irrelevant: continue with fresh evidence
- If the user says "refresh" or "re-run": ignore cache reads and overwrite on write
SERP-overlap-driven keyword clustering for content architecture. Groups keywords
by how Google actually ranks them (shared top-10 results), not by text similarity.
Designs hub-and-spoke content clusters with internal link matrices and generates
interactive cluster map visualizations.
**Scripts:** Located at the plugin root `scripts/` directory.
---
## Quick Reference
| Command | What it does |
|---------|-------------|
| `/seo cluster plan <seed-keyword>` | Full planning workflow: expand, cluster, architect, visualize |
| `/seo cluster plan --from strategy` | Import from existing `/seo plan` output |
| `/seo cluster execute` | Execute plan: create content via codex-blog or output briefs |
| `/seo cluster map` | Regenerate the interactive cluster visualization |
---
## Planning Workflow
### Step 1: Seed Keyword Expansion
Expand the seed keyword into 30-50 variants using WebSearch:
1. **Related searches** — Search the seed, extract "related searches" and "people also search for"
2. **People Also Ask (PAA)** — Extract all PAA questions from SERP results
3. **Long-tail modifiers** — Append common modifiers: "best", "how to", "vs", "for beginners", "tools", "examples", "guide", "template", "mistakes", "checklist"
4. **Question mining** — Generate who/what/when/where/why/how variants
5. **Intent modifiers** — Add commercial modifiers: "pricing", "review", "alternative", "comparison", "free", "top"
**Deduplication:** Normalize variants (lowercase, strip articles), remove exact duplicates.
Target: 30-50 unique keyword variants. If under 30, run a second expansion pass
with the top PAA questions as seeds.
### Step 2: SERP Overlap Clustering
This is the core differentiator. Load `references/serp-overlap-methodology.md` for
the full algorithm.
**Process:**
1. Group keywords by initial intent guess (reduces pairwise comparisons)
2. For each candidate pair within a group, WebSearch both keywords
3. Count shared URLs in the top 10 organic results (ignore ads, featured snippets, PAA)
4. Apply thresholds:
| Shared Results | Relationship | Action |
|---------------|-------------|--------|
| 7-10 | Same post | Merge into single target page |
| 4-6 | Same cluster | Group under same spoke cluster |
| 2-3 | Interlink | Place in adjacent clusters, add cross-links |
| 0-1 | Separate | Assign to different clusters or exclude |
**Optimization:** With 40 keywords, full pairwise = 780 comparisons. Instead:
- Pre-group by intent (4 groups of ~10 = 4 x 45 = 180 comparisons)
- Only cross-check group boundary keywords
- Skip pairs where both are long-tail variants of the same head term (assume same cluster)
**DataForSEO integration:** If DataForSEO MCP is available, use `serp_organic_live_advanced`
instead of WebSearch for SERP data. Run `python scripts/dataforseo_costs.py check serp_organic_live_advanced --count N`
before each batch. If `"status": "needs_approval"`, show cost estimate and ask user.
If `"status": "blocked"`, fall back to WebSearch.
### Step 3: Intent Classification
Classify each keyword into one of four intent categories:
| Intent | Signals | Include in Clusters? |
|--------|---------|---------------------|
| Informational | how, what, why, guide, tutorial, learn | Yes |
| Commercial | best, top, review, comparison, vs, alternative | Yes |
| Transactional | buy, price, discount, coupon, order, sign up | Yes |
| Navigational | brand names, specific product names, login | No (exclude) |
Remove navigational keywords from clustering. Flag borderline cases for
manual review. Keywords can have mixed intent (e.g., "best CRM software" is
both commercial and informational) -- classify by dominant intent.
### Step 4: Hub-and-Spoke Architecture
Load `references/hub-spoke-architecture.md` for full specifications.
**Design the cluster structure:**
1. **Select the pillar keyword** — Highest volume, broadest intent, most SERP overlap with other keywords
2. **Group spokes into clusters** — Each cluster is a subtopic area (2-5 clusters per pillar)
3. **Assign posts to clusters** — Each cluster gets 2-4 spoke posts
4. **Select templates per post** — Based on intent classification:
| Intent Pattern | Template Options |
|---------------|-----------------|
| Informational (broad) | ultimate-guide |
| Informational (how) | how-to |
| Informational (list) | listicle |
| Informational (concept) | explainer |
| Commercial (compare) | comparison |
| Commercial (evaluate) | review |
| Commercial (rank) | best-of |
| Transactional | landing-page |
5. **Set word count targets:**
- Pillar page: 2500-4000 words
- Spoke posts: 1200-1800 words
6. **Cannibalization check** — No two posts share the same primary keyword. If SERP
overlap is 7+, merge those keywords into a single post targeting both.
### Step 5: Internal Link Matrix
Design the bidirectional linking structure:
| Link Type | Direction | Requirement |
|-----------|-----------|-------------|
| Spoke to pillar | spoke -> pillar | Mandatory (every spoke) |
| Pillar to spoke | pillar -> spoke | Mandatory (every spoke) |
| Spoke to spoke (within cluster) | spoke <-> spoke | 2-3 links per post |
| Cross-cluster | spoke -> spoke (other cluster) | 0-1 links per post |AI image generation for SEO assets: OG/social preview images, blog hero images, schema images, product photography, infographics. Powered by Gemini via nanobanana-mcp. Requires banana extension installed. Use when user says \"generate image\", \"OG image\", \"social preview\", \"hero image\", \"blog image\", \"product photo\", \"infographic\", \"seo image\", \"create visual\", \"image-gen\", \"favicon\", \"schema image\", \"pinterest pin\", \"generate visual\", \"banner\", or \"thumbnail\".
>
>
Full website SEO audit with parallel subagent delegation. Crawls up to 500 pages, detects business type, delegates to up to 15 specialists (8 always + 7 conditional), generates health score. Use when user says audit, full SEO check, SEO best-practice review, analyze my site, website health check, or find SEO issues.
Backlink profile analysis: referring domains, anchor text distribution, toxic link detection, competitor gap analysis. Works with free APIs (Moz, Bing Webmaster, Common Crawl) and DataForSEO extension. Use when user says backlinks, link profile, referring domains, anchor text, toxic links, link gap, link building, disavow, or backlink audit.
>
>
>