seo-drift
seo-drift captures baseline snapshots of SEO-critical page elements including title tags, meta descriptions, headings, schema markup, Core Web Vitals, and HTTP status codes, then detects regressions by comparing current page state against stored baselines using 17 rules across three severity levels. Use this skill to monitor for unintended SEO changes, track optimization progress over time, and receive alerts when modifications risk search visibility.
git clone --depth 1 https://github.com/AgriciDaniel/codex-seo /tmp/seo-drift && cp -r /tmp/seo-drift/skills/seo-drift ~/.claude/skills/seo-drift
# SEO Drift Monitor (April 2026)
## Shared Data Cache
**Step 0 -- Check shared data cache:**
Before gathering, check `.seo-cache/` for reusable context from related SEO skills.
Reference: `../seo/references/shared-data-cache.md` for schemas and dependency map.
Check these cache files when present:
- `.seo-cache/site-meta.json` for domain, business type, industry, and crawl context
- `.seo-cache/audit-scores.json` for prior full-audit priorities
- `.seo-cache/pages/{url-slug}/page-analysis.json` for page-level context when a URL is provided
- If found: parse and use clearly valid fields (note "Using cached [X] from [date]")
- If missing, corrupt, or irrelevant: continue with fresh evidence
- If the user says "refresh" or "re-run": ignore cache reads and overwrite on write
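
The cache check above can be sketched roughly as follows. The helper name `load_cached` and the assumption that each cache file holds a single JSON object are illustrative; the actual schemas are defined in `../seo/references/shared-data-cache.md`.

```python
import json
from pathlib import Path
from typing import Optional


def load_cached(path: Path, refresh: bool = False) -> Optional[dict]:
    """Return parsed cache content, or None when the cache should not be used."""
    if refresh:
        # "refresh" / "re-run": skip cache reads entirely.
        return None
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        # Missing or corrupt file: continue with fresh evidence.
        return None
    # Assumption: a usable cache file holds a JSON object.
    return data if isinstance(data, dict) else None


site_meta = load_cached(Path(".seo-cache/site-meta.json"))
if site_meta is None:
    print("No usable cache; gathering fresh evidence")
else:
    print("Using cached site-meta")
```

Returning `None` for every unusable case keeps the caller's fallback path identical whether the file is absent, unreadable, or malformed.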
Git for your SEO. Capture baselines, detect regressions, track changes over time.
---
## Commands
| Command | Purpose |
|---------|---------|
| `/seo drift baseline <url>` | Capture current SEO state as a "known good" snapshot |
| `/seo drift compare <url>` | Compare current page state to stored baseline |
| `/seo drift history <url>` | Show change history and past comparisons |
---
## What It Captures
Every baseline records these SEO-critical elements:
| Element | Field | Source |
|---------|-------|--------|
| Title tag | `title` | `parse_html.py` |
| Meta description | `meta_description` | `parse_html.py` |
| Canonical URL | `canonical` | `parse_html.py` |
| Robots directives | `meta_robots` | `parse_html.py` |
| H1 headings | `h1` (array) | `parse_html.py` |
| H2 headings | `h2` (array) | `parse_html.py` |
| H3 headings | `h3` (array) | `parse_html.py` |
| JSON-LD schema | `schema` (array) | `parse_html.py` |
| Open Graph tags | `open_graph` (dict) | `parse_html.py` |
| Core Web Vitals | `cwv` (dict) | `pagespeed_check.py` |
| HTTP status code | `status_code` | `fetch_page.py` |
| HTML content hash | `html_hash` (SHA-256) | Computed |
| Schema content hash | `schema_hash` (SHA-256) | Computed |
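
As an illustration of the two computed fields, the sketch below derives SHA-256 hashes for a snapshot. It assumes the schema hash is taken over a key-sorted JSON serialization of the `schema` array; the skill's actual serialization is not specified here, and the helper names are hypothetical.

```python
import hashlib
import json


def sha256_hex(text: str) -> str:
    """Hex-encoded SHA-256 digest of a UTF-8 string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def schema_hash(schema_blocks: list) -> str:
    # Assumption: sort keys so that key order alone never changes the hash.
    canonical = json.dumps(schema_blocks, sort_keys=True, separators=(",", ":"))
    return sha256_hex(canonical)


# Field names follow the table above; the values are made up.
snapshot = {
    "title": "Example Page",
    "h1": ["Example Page"],
    "schema": [{"@type": "Article", "headline": "Example Page"}],
    "status_code": 200,
}
snapshot["html_hash"] = sha256_hex("<html><body>Example</body></html>")
snapshot["schema_hash"] = schema_hash(snapshot["schema"])
```

Comparing hashes first gives a cheap "anything changed?" check before field-by-field rules run.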
---
## How Comparison Works
The comparison engine applies **17 rules across 3 severity levels**. Load
`references/comparison-rules.md` for the full rule set with thresholds,
recommended actions, and cross-skill references.
### Severity Levels
| Level | Meaning | Response Time |
|-------|---------|---------------|
| **CRITICAL** | SEO-breaking change, likely traffic loss | Immediate |
| **WARNING** | Potential impact, needs investigation | Within 1 week |
| **INFO** | Awareness only, may be intentional | Review at convenience |
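
One way to model the three levels so that findings can be ordered by urgency is an integer enum. This is a sketch only: the rule names and their severity assignments below are invented for illustration, and the real mapping lives in `references/comparison-rules.md`.

```python
from enum import IntEnum


class Severity(IntEnum):
    INFO = 1
    WARNING = 2
    CRITICAL = 3


# Hypothetical findings; rule names and severities are illustrative only.
findings = [
    {"rule": "og_tags_changed", "severity": Severity.INFO},
    {"rule": "noindex_added", "severity": Severity.CRITICAL},
    {"rule": "title_changed", "severity": Severity.WARNING},
]


def by_urgency(items: list) -> list:
    """Most severe findings first."""
    return sorted(items, key=lambda f: f["severity"], reverse=True)


ordered = by_urgency(findings)
worst = ordered[0]["severity"].name if ordered else "NONE"
```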
---
## Storage
All data is stored locally in SQLite:
```
~/.cache/codex-seo/drift/baselines.db
```
### Tables
- **baselines**: Captured snapshots with all SEO elements
- **comparisons**: Diff results with triggered rules and severities
URL normalization ensures consistent matching: lowercase scheme/host, strip
default ports (80/443), sort query parameters, remove UTM parameters, strip
trailing slashes.
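
The normalization steps listed above might look like the following with the standard library. This is a sketch under assumptions, not the skill's actual code: `normalize_url` is a hypothetical name, "UTM parameters" is read as any query key starting with `utm_`, and a default port is stripped only when it matches the scheme.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize_url(url: str) -> str:
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    # Keep the port only when it is not the default for the scheme.
    port = parts.port
    netloc = host if port in (None, DEFAULT_PORTS.get(scheme)) else f"{host}:{port}"
    # Drop utm_* parameters, then sort what remains.
    query = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith("utm_")
    )
    # Assumption: the root path "/" also collapses to an empty path.
    path = parts.path.rstrip("/")
    return urlunsplit((scheme, netloc, path, urlencode(query), parts.fragment))


print(normalize_url("HTTPS://Example.COM:443/Blog/?utm_source=x&b=2&a=1"))
# → https://example.com/Blog?a=1&b=2
```

The sketch ignores userinfo and IPv6 literals, which a production version would need to handle.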
---
## Command: `baseline`
Captures the current state of a page and stores it.
**Steps:**
1. Validate URL (SSRF protection via `google_auth.validate_url()`)
2. Fetch page via `scripts/fetch_page.py`
3. Parse HTML via `scripts/parse_html.py`
4. Optionally fetch CWV via `scripts/pagespeed_check.py` (use `--skip-cwv` to skip)
5. Hash HTML body and schema content (SHA-256)
6. Store snapshot in SQLite
**Execution:**
```bash
python scripts/drift_baseline.py <url>
python scripts/drift_baseline.py <url> --skip-cwv
```
**Output:** JSON with baseline ID, timestamp, URL, and summary of captured elements.
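
For orientation, the final storage step could be sketched as below. The table layout (a single JSON `snapshot` column) and the helper names are assumptions made for this example; the real schema of `baselines.db` is not documented in this file.

```python
import json
import sqlite3
from datetime import datetime, timezone
from typing import Optional, Tuple


def open_db(path: str) -> sqlite3.Connection:
    """Open the database, creating the (hypothetical) table on first use."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS baselines (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            url TEXT NOT NULL,
            captured_at TEXT NOT NULL,
            snapshot TEXT NOT NULL
        )
        """
    )
    return conn


def store_baseline(conn: sqlite3.Connection, url: str, snapshot: dict) -> int:
    """Insert a snapshot and return its baseline ID."""
    cur = conn.execute(
        "INSERT INTO baselines (url, captured_at, snapshot) VALUES (?, ?, ?)",
        (url, datetime.now(timezone.utc).isoformat(), json.dumps(snapshot)),
    )
    conn.commit()
    return cur.lastrowid


def latest_baseline(conn: sqlite3.Connection, url: str) -> Optional[Tuple[int, dict]]:
    """Return (id, snapshot) for the newest baseline of a URL, or None."""
    row = conn.execute(
        "SELECT id, snapshot FROM baselines WHERE url = ? ORDER BY id DESC LIMIT 1",
        (url,),
    ).fetchone()
    return None if row is None else (row[0], json.loads(row[1]))


conn = open_db(":memory:")  # in-memory so the sketch leaves no file behind
baseline_id = store_baseline(
    conn, "https://example.com", {"title": "Home", "status_code": 200, "cwv": None}
)
```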
---
## Command: `compare`
Fetches the current page state and diffs it against the most recent baseline.
**Steps:**
1. Validate URL
2. Load most recent baseline from SQLite (or specific `--baseline-id`)
3. Fetch and parse current page state
4. Run all 17 comparison rules
5. Classify findings by severity
6. Store comparison result
7. Output JSON diff report
**Execution:**
```bash
python scripts/drift_compare.py <url>
python scripts/drift_compare.py <url> --baseline-id 5
python scripts/drift_compare.py <url> --skip-cwv
```
**Output:** JSON with all triggered rules, old/new values, severity, and actions.
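
To make the rule-running step concrete, here is a minimal sketch with two made-up rules. The field names `title` and `meta_robots` come from the capture table; the rule names, their severities, and the assumption that `meta_robots` is a plain string are illustrative, and the authoritative definitions are in `references/comparison-rules.md`.

```python
from typing import Callable, List, Optional

Rule = Callable[[dict, dict], Optional[dict]]


def title_changed(old: dict, new: dict) -> Optional[dict]:
    if old.get("title") != new.get("title"):
        return {
            "rule": "title_changed",
            "severity": "WARNING",  # assumed severity
            "old": old.get("title"),
            "new": new.get("title"),
        }
    return None


def noindex_added(old: dict, new: dict) -> Optional[dict]:
    was = "noindex" in (old.get("meta_robots") or "").lower()
    now = "noindex" in (new.get("meta_robots") or "").lower()
    if now and not was:
        return {
            "rule": "noindex_added",
            "severity": "CRITICAL",  # assumed severity
            "old": old.get("meta_robots"),
            "new": new.get("meta_robots"),
        }
    return None


RULES: List[Rule] = [title_changed, noindex_added]


def compare(old: dict, new: dict, rules: List[Rule] = RULES) -> List[dict]:
    """Run every rule and collect the ones that trigger."""
    findings = []
    for rule in rules:
        result = rule(old, new)
        if result is not None:
            findings.append(result)
    return findings


baseline = {"title": "Pricing", "meta_robots": "index, follow"}
current = {"title": "Plans", "meta_robots": "noindex, follow"}
report = compare(baseline, current)
```

Keeping each rule as a small function that returns either a finding or `None` makes it straightforward to add, remove, or skip rules (for example, CWV rules when CWV data is `null`).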
After comparison, offer to generate an HTML report:
```bash
python scripts/drift_report.py <comparison_json_file> --output drift-report.html
```
---
## Command: `history`
Shows all baselines and comparisons for a URL.
**Execution:**
```bash
python scripts/drift_history.py <url>
python scripts/drift_history.py <url> --limit 10
```
**Output:** JSON array of baselines (newest first) with timestamps and comparison summaries.
---
## Cross-Skill Integration
When drift is detected, recommend the appropriate specialized skill:
| Finding | Recommendation |
|---------|----------------|
| Schema removed or modified | Run `/seo schema <url>` for full validation |
| CWV regression | Run `/seo technical <url>` for performance audit |
| Title or meta description changed | Run `/seo page <url>` for content analysis |
| Canonical changed or removed | Run `/seo technical <url>` for indexability check |
| Noindex added | Run `/seo technical <url>` for crawlability audit |
| H1/heading structure changed | Run `/seo content <url>` for E-E-A-T review |
| OG tags removed | Run `/seo page <url>` for social sharing analysis |
| Status code changed to error | Run `/seo technical <url>` for full diagnostics |
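
The table above can be treated as a simple lookup. In this sketch the finding keys (`schema`, `cwv`, and so on) are hypothetical labels chosen for the example; only the commands are taken from the table.

```python
from typing import List

# Keys are hypothetical finding categories; values mirror the table above.
RECOMMENDATIONS = {
    "schema": "/seo schema {url}",
    "cwv": "/seo technical {url}",
    "title": "/seo page {url}",
    "meta_description": "/seo page {url}",
    "canonical": "/seo technical {url}",
    "noindex": "/seo technical {url}",
    "headings": "/seo content {url}",
    "open_graph": "/seo page {url}",
    "status_code": "/seo technical {url}",
}


def recommend(finding_types: List[str], url: str) -> List[str]:
    """Return follow-up commands in first-seen order, without duplicates."""
    commands: List[str] = []
    for kind in finding_types:
        template = RECOMMENDATIONS.get(kind)
        if template is None:
            continue  # unknown finding type: no recommendation
        command = template.format(url=url)
        if command not in commands:
            commands.append(command)
    return commands
```

Deduplicating matters because several findings (CWV, canonical, noindex, status code) all point to the same `/seo technical` run.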
---
## Error Handling
| Scenario | Action |
|----------|--------|
| URL unreachable | Report error from `fetch_page.py`. Do not guess state. Suggest user verify URL. |
| No baseline exists for URL | Inform user and suggest running `baseline` first. |
| SSRF blocked (private IP) | Report `validate_url()` rejection. Never bypass. |
| SQLite database missing | Auto-create on first use. No error. |
| CWV fetch fails (no API key) | Store `null` for CWV fields. Skip CWV rules during comparison. |
| Page returns 4xx/5xx | Still capture as baseline (status code IS a tracked field). |