ClaudeWave
Skill · 393 repo stars · updated today

create-voice

The create-voice Claude Code skill orchestrates a seven-phase pipeline that transforms writing samples into complete voice profiles by delegating analysis to existing tools (voice-analyzer.py, voice-validator.py) and the voice-calibrator template. Use this when you need to systematize an author's distinctive linguistic patterns, thinking habits, and stylistic markers into a reusable profile that can guide content generation or writing assistance, with each phase producing persistent artifacts and passing strict validation gates before proceeding.

Install in Claude Code
git clone --depth 1 https://github.com/notque/vexjoy-agent /tmp/create-voice && cp -r /tmp/create-voice/skills/content/create-voice ~/.claude/skills/create-voice
Then open a new Claude Code session; the skill loads automatically.

SKILL.md

# Create Voice

Create a complete voice profile from writing samples through a 7-phase pipeline. This skill is the user-facing entry point for the voice system. It orchestrates existing tools (voice-analyzer.py, voice-validator.py, voice-calibrator template) into a guided, phase-gated workflow.

**Architecture**: This skill is a GUIDE and ORCHESTRATOR. It delegates all deterministic work to existing scripts and all template structure to the voice-calibrator skill. It does not duplicate or replace any existing component.

---

## Reference Loading Table

| Signal | Load These Files | Why |
|---|---|---|
| errors, error handling | `error-handling.md` | Loads detailed guidance from `error-handling.md`. |
| extraction validation, pattern verdict, triple-validation | `extraction-validation.md` | Triple-validation rubric (recurrence, generative power, exclusivity) gating which patterns survive into the profile. |
| Steps 6-7: validation procedure and authorship matching | `iteration-guide.md` | Loads detailed guidance from `iteration-guide.md`. |
| Step 3: PATTERN — phrase fingerprints, thinking patterns, wabi-sabi markers | `pattern-identification.md` | Loads detailed guidance from `pattern-identification.md`. |
| reporting progress at phase gates | `phase-banners.md` | Loads detailed guidance from `phase-banners.md`. |
| locating exemplar voice skills and components | `reference-implementations.md` | Loads detailed guidance from `reference-implementations.md`. |
| Step 1 COLLECT: finding, vetting, and formatting samples | `sample-collection.md` | Loads detailed guidance from `sample-collection.md`. |
| Step 5 GENERATE: skill files, frontmatter, sample organization | `skill-generation.md` | Loads detailed guidance from `skill-generation.md`. |
| Step 4 RULE: writing positive and contrastive identity rules | `voice-rules-template.md` | Loads detailed guidance from `voice-rules-template.md`. |

## Instructions

### Overview

Read and follow the repository CLAUDE.md before starting any work.

The pipeline has 7 phases. Each phase produces artifacts saved to files (because context is ephemeral; files persist) and has a gate that must pass before proceeding. Report progress with phase status banners at each gate (templates in `references/phase-banners.md`). Be direct about what passed or failed, not congratulatory.

| Phase | Name | Artifact | Gate |
|-------|------|----------|------|
| 1 | COLLECT | `skills/voice-{name}/references/samples/*.md` | 50+ samples exist |
| 2 | EXTRACT | `skills/voice-{name}/profile.json` | Script exits 0, metrics present |
| 3 | PATTERN | Pattern analysis document | 10+ phrase fingerprints identified |
| 4 | RULE | Voice rules document | Rules have contrastive examples |
| 5 | GENERATE | `skills/voice-{name}/SKILL.md` + `config.json` | SKILL.md has 2000+ lines, samples section has 400+ lines |
| 6 | VALIDATE | Validation report | Score >= 70, no banned pattern violations |
| 7 | ITERATE | Final validated skill | 4/5 authorship match (or the 3-iteration limit is reached) |

---

### Step 1: COLLECT -- Gather 50+ Writing Samples

**Goal**: Build a corpus of real writing that captures the full range of the person's voice.

Do not proceed past this step with fewer than 50 samples: the system was tried with 3-10 samples and FAILED. 50+ is where it starts working. LLMs are pattern matchers: rules tell the AI what to do, but samples show it what the voice looks like. V7-V9 had correct rules but failed authorship matching (0/5 roasters). V10 passed 5/5 because it had 100+ categorized samples.

See `references/sample-collection.md` for the "Where to Find Samples" table, "Sample Quality Guidelines", "Directory Setup", and "Sample File Format".

**GATE**: Count the samples. If fewer than 50 distinct writing samples exist across all files, STOP. Tell the user how many more are needed and where to find them, and resolve the shortfall before proceeding.
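
The count for this gate can be automated. The following is a minimal sketch, not part of the skill's tooling: it assumes each sample inside a file begins with a heading line that starts with `## ` (a hypothetical marker; the actual layout is defined in `references/sample-collection.md`), and the names `count_samples` and `gate_message` are illustrative.

```python
from pathlib import Path

MIN_SAMPLES = 50  # Phase 1 gate threshold stated above


def count_samples(samples_dir: str, marker: str = "## ") -> int:
    """Count sample entries across all .md files in samples_dir.

    ASSUMPTION: every sample starts with a line beginning with `marker`.
    Adjust the marker to match the real sample file format.
    """
    total = 0
    for path in sorted(Path(samples_dir).glob("*.md")):
        for line in path.read_text(encoding="utf-8").splitlines():
            if line.startswith(marker):
                total += 1
    return total


def gate_message(count: int, minimum: int = MIN_SAMPLES) -> str:
    """Report the gate result directly, including the shortfall on failure."""
    if count >= minimum:
        return f"PASS: {count} samples"
    return f"STOP: {count} samples, {minimum - count} more needed"
```

Calling `gate_message(count_samples("skills/voice-{name}/references/samples"))` with the real voice name substituted would give a one-line status to fold into the Phase 1 banner.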

See `references/phase-banners.md` for the Phase 1 status banner template.

---

### Step 2: EXTRACT -- Run Deterministic Analysis

**Goal**: Extract quantitative voice metrics from the samples using `voice-analyzer.py`.

Always run script-based analysis before AI interpretation, because scripts produce reproducible, quantitative baselines. AI interpretation without data drifts toward "sounds like a normal person" rather than capturing what makes THIS person distinctive. The numbers ground everything that follows.

#### Run the Analyzer

```bash
python3 ~/.claude/scripts/voice-analyzer.py analyze \
  --samples skills/voice-{name}/references/samples/*.md \
  --output skills/voice-{name}/profile.json
```
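
The Phase 2 gate (script exits 0, metrics present) can be checked right after this run. The sketch below is an illustration under assumptions, not the skill's own gate logic: `check_extract_gate` is a hypothetical helper, and the metric section names are supplied by the caller because the actual keys in `profile.json` are not listed here.

```python
from __future__ import annotations

import json
from pathlib import Path


def check_extract_gate(exit_code: int, profile_path: str,
                       required_keys: tuple[str, ...]) -> list[str]:
    """Return a list of gate failures; an empty list means the gate passed."""
    failures: list[str] = []
    if exit_code != 0:
        failures.append(f"analyzer exited with code {exit_code}")

    path = Path(profile_path)
    if not path.is_file():
        failures.append(f"{profile_path} does not exist")
        return failures

    try:
        profile = json.loads(path.read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        failures.append(f"profile is not valid JSON: {exc.msg}")
        return failures

    if not isinstance(profile, dict):
        failures.append("profile is not a JSON object")
        return failures

    for key in required_keys:
        # A section that is absent, null, or empty counts as missing.
        if profile.get(key) in (None, {}, []):
            failures.append(f"missing or empty metric section: {key}")
    return failures
```

The exit code would come from however the analyzer was launched (for example `subprocess.run(...).returncode`), and `required_keys` should be replaced with the section names the analyzer actually writes.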

#### Also Get the Text Report

```bash
python3 ~/.claude/scripts/voice-analyzer.py analyze \
  --samples skills/voice-{name}/references/samples/*.md \
  --format text
```

The text report gives a human-readable summary. Save it for reference during Steps 3-4.

#### What the Analyzer Extracts

| Category | Metrics | Why It Matters |
|----------|---------|---------------|
| Sentence metrics | Length distribution, average, variance | Rhythm fingerprint |
| Punctuation | Comma density, question rate, exclamation rate, em-dash count, semicolons | Punctuation signature |
| Word metrics | Contraction rate, first-person rate, second-person rate | Formality and perspective |
| Structure | Fragment rate, sentence starters by type | Structural patterns |
| Function words | Top 20 function word frequencies | Unconscious language fingerprint |
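
To make a few of these metrics concrete, here is a rough sketch of how sentence length, question rate, and contraction rate could be computed. It is not the implementation in `voice-analyzer.py`: the sentence splitter is deliberately naive, the function and key names are assumptions, and the contraction pattern also matches possessives such as "John's".

```python
import re
import statistics

# Matches words with an internal ASCII apostrophe (can't, it's, John's).
CONTRACTION = re.compile(r"\b\w+'\w+\b")


def split_sentences(text: str) -> list[str]:
    """Naive split after ., !, or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def basic_metrics(text: str) -> dict[str, float]:
    """Compute a handful of illustrative voice metrics for one text."""
    sentences = split_sentences(text)
    lengths = [len(s.split()) for s in sentences]  # words per sentence
    words = text.split()
    n = len(sentences)
    return {
        "sentence_count": float(n),
        "avg_sentence_length": statistics.mean(lengths) if lengths else 0.0,
        "sentence_length_variance": statistics.pvariance(lengths) if lengths else 0.0,
        "question_rate": sum(s.endswith("?") for s in sentences) / n if n else 0.0,
        "contraction_rate": len(CONTRACTION.findall(text)) / len(words) if words else 0.0,
    }
```

For the text `"I can't stop. Why not? It works."` this yields three sentences, one of which is a question, and one contraction among seven words. Numbers like these are what give the later interpretation steps a measurable baseline.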

#### Add Stylometry Bands and Decay Metadata

```bash
python3 scripts/voice-stylometry.py band \
  --samples skills/voice-{name}/references/samples/*.md \
  > /tmp/stylometry.json
```

Merge the output's top-level keys into `profile.json`: `stylometry` (burstiness band + punctuation classes measured from the samples), `analyzed_at`, and `refresh_after_days` (default 90). These fields are add-only; profiles without them stay valid. The voice-validator uses them for deterministic draft checks and stale-profile warnin