
# extract-source-sample

This skill reads a finished content-goose ad-run folder and assembles a `source-sample.json` file containing all components that define the ad: script, shot list, voice-over details, character references linked to the central character library, production scripts, and master video file. Use it when preparing an existing ad for remixing, as the output JSON feeds directly into script-rewriting and the `remix-ad` skill.

Install in Claude Code
git clone --depth 1 https://github.com/gooseworks-ai/goose-skills /tmp/extract-source-sample && cp -r /tmp/extract-source-sample/skills/ads/composites/extract-source-sample ~/.claude/skills/extract-source-sample
Then start a new Claude Code session; the skill loads automatically.

SKILL.md

# extract-source-sample

This is an **agent-executed** skill. There are no Python scripts. The agent
reads the run folder, builds the JSON, and stamps catalog links by hand.
content-goose run folders aren't always cleanly structured (some have empty
`production/` JSON, some carry everything in `working/`). An agent can adapt
to that; a script would be too brittle.

## When to use

- "Extract the source-sample.json for `<run>`."
- "Get the upload-sample JSON for this ad so I can remix it."
- "Prep `<run>` for remix."

Do NOT use to:
- Rewrite the script for a new brand (that's a separate agent step that
  consumes this skill's output).
- Render the remix (that's the existing `remix-ad` skill).
- Upload an ad to the library (that's `upload-ad-sample`).

## Inputs

| Input | Required | Notes |
|---|---|---|
| `run-dir` | yes | Absolute path to a content-goose ad-run folder (e.g. one ending in `clients/ladder/ad-runs/run-02-podcast-skit`). |
| `out` | no | Where to write the JSON. Default: `<run-dir>/remix/source-sample.json`. |

That's the entire interface.
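
The skill ships no script, but the two path rules it relies on can be sketched for illustration: the default `out` location, and the `brand` value that step 2 derives from a `clients/<brand>/ad-runs/...` path. The function names `resolve_out` and `brand_from_run_dir` are hypothetical, not part of the skill.

```python
from pathlib import Path
from typing import Optional


def resolve_out(run_dir: str, out: Optional[str] = None) -> Path:
    """Return the explicit `out` path, else <run-dir>/remix/source-sample.json."""
    if out:
        return Path(out)
    return Path(run_dir) / "remix" / "source-sample.json"


def brand_from_run_dir(run_dir: str) -> Optional[str]:
    """Return the <brand> segment of .../clients/<brand>/ad-runs/..., or None."""
    parts = Path(run_dir).parts
    # Stop two segments early so parts[i + 2] is always a valid index.
    for i, part in enumerate(parts[:-2]):
        if part == "clients" and parts[i + 2] == "ad-runs":
            return parts[i + 1]
    return None
```

Returning `None` for a path that does not match the `clients/<brand>/ad-runs/` layout is an assumption of this sketch; the agent can then ask for the brand instead of guessing.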

## What the agent must do

### 1. Read the run

Open each file if it exists; tolerate missing files (most production/*.json
in older runs are empty stubs — fall back to `working/`):

- `working/script.json` — **primary source of truth** for scenes, voices, set.
- `production/asset-manifest.json` — `assets[]` with role `active_master`
  points at the master mp4; per-asset `provider` + `metadata.model` produce
  the atom-skill rows.
- `HOW_TO_MAKE_THIS_VIDEO.md` — gets dumped verbatim into `how_to`.
- `video-project.json` — fallback for title / format when script.json doesn't
  carry them.
- `finals/*.mp4` — fallback for the master mp4 if asset-manifest is empty.
- `working/characters/*.png` — anchor portraits per character.
- `working/*.py` — driver scripts (`render_vo.py`, `render_variants.py`,
  `render_clips.py`, `stitch.py`, `build_end_card.py`, etc.). These are the
  source's runnable code; the remix consumer ports them. Capture in
  `production_scripts[]` (step 2 below).
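
As an illustration of the "tolerate missing files, then fall back" behaviour described above, here is a minimal sketch of what the agent does by hand. It is not a script shipped with the skill. The helper names are hypothetical, and so is the `path` key on each asset: the source names only `role`, `provider` and `metadata.model`.

```python
import json
from pathlib import Path
from typing import Any, Optional


def load_json_or_none(path: Path) -> Optional[Any]:
    """Parse a JSON file; return None if it is missing, empty, or invalid."""
    try:
        text = path.read_text(encoding="utf-8").strip()
    except (OSError, UnicodeDecodeError):
        return None
    if not text:
        return None
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        return None
    return data or None  # empty {} or [] stubs count as missing


def find_master_mp4(run_dir: Path) -> Optional[Path]:
    """Prefer the manifest's active_master asset, else the first finals/*.mp4."""
    manifest = load_json_or_none(run_dir / "production" / "asset-manifest.json")
    assets = manifest.get("assets", []) if isinstance(manifest, dict) else []
    for asset in assets:
        # "path" is an assumed key name, not confirmed by the source.
        if isinstance(asset, dict) and asset.get("role") == "active_master" and asset.get("path"):
            return run_dir / asset["path"]
    finals = sorted((run_dir / "finals").glob("*.mp4"))
    return finals[0] if finals else None
```

Picking the alphabetically first file when `finals/` holds several mp4s is an assumption of the sketch; a real run may need the agent to choose.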

**For sources with character-pose stills** (any run with a
`working/characters/` folder of `<character>-<pose>.png` files —
podcast-skit, founder-led, testimonial, recreate-ugc, etc.), audit every
PNG with `file`. Do NOT stop at the base portraits. The recipe shot list
references variant expression PNGs (e.g. `brittney-eyebrow-up.png`,
`brad-phone-up.png`) by filename; the consumer assumes they exist on disk
and will spend real money on lipsync calls before discovering they don't.

For sources without character-pose stills (music-video b-roll, abstract
animated, product-only) — skip this audit; `variant_assets[]` stays empty.

Record per file (when auditing):

```jsonc
{
  "file":        "brittney-eyebrow-up.png",
  "pose_tag":    "skeptical-eyebrow",                  // slug from filename stem
  "kind":        "real" | "lfs-pointer" | "missing",
  "size_bytes":  142336
}
```

`file <path>` says `PNG image data, …` for real binaries and `ASCII text`
for LFS pointers. A real binary is `>10KB` in practice; an LFS pointer is
`<200 bytes`.
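
The same classification can be sketched without the `file` command by checking leading bytes. This is an illustration only, assuming the standard 8-byte PNG signature and the pointer prefix quoted in this document. The name `audit_asset` is hypothetical, and the sketch leaves out `pose_tag`.

```python
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
LFS_PREFIX = b"version https://git-lfs.github.com"


def audit_asset(path: Path) -> dict:
    """Build one audit record (file, kind, size_bytes) for a character PNG."""
    record = {"file": path.name, "kind": "missing", "size_bytes": 0}
    if not path.is_file():
        return record
    record["size_bytes"] = path.stat().st_size
    with path.open("rb") as fh:
        head = fh.read(len(LFS_PREFIX))
    if head.startswith(PNG_SIGNATURE):
        record["kind"] = "real"
    elif head.startswith(LFS_PREFIX):
        record["kind"] = "lfs-pointer"
    # Anything else stays "missing": an assumption of this sketch, since the
    # source defines only three kinds and such a file is not usable anyway.
    return record
```

Checking content instead of size alone avoids misclassifying a small but real PNG; the size thresholds above remain a useful sanity check.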

**Materialize LFS pointers before reading any binary.** An LFS pointer is a
tiny (<200 byte) ASCII file beginning with `version https://git-lfs.github.com`.
If a PNG or mp4 looks like one, run:

```bash
cd <run-dir-or-repo-root>
git lfs fetch --include=<relative path>
git lfs checkout <relative path>
```

before referencing it. **If the fetch (or a full `git lfs pull`) no-ops and
the LFS endpoint returns 404** (objects were committed as pointers but never
pushed, which is common on content-goose), leave the entry as
`kind: "lfs-pointer"` in `variant_assets[]`. The consumer will regenerate or
scrape; this skill does NOT fabricate. See
[[feedback_lfs_pointer_audit_before_paid_calls]] and
[[feedback_fal_subscribe_error_envelope]] for the downstream cost when
this audit is skipped: Hume run-03 lost ~$3 + 25 min to it.
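
When a pointer cannot be materialized, it can help to read what the pointer itself claims. The sketch below is an illustration, not part of the skill. It assumes the standard Git LFS v1 pointer layout: a `version` line followed by `oid` and `size` lines. `parse_lfs_pointer` is a hypothetical name.

```python
from typing import Optional


def parse_lfs_pointer(text: str) -> Optional[dict]:
    """Return {"oid", "size_bytes"} for an LFS pointer, or None for anything else."""
    lines = text.splitlines()
    if not lines or not lines[0].startswith("version https://git-lfs.github.com"):
        return None
    # Remaining lines are "key value" pairs separated by a single space.
    fields = dict(line.split(" ", 1) for line in lines[1:] if " " in line)
    try:
        return {"oid": fields["oid"], "size_bytes": int(fields["size"])}
    except (KeyError, ValueError):
        return None
```

A pointer file can be read with `path.read_text(errors="replace")` and passed in. If the result is still not `None` after the fetch attempt, the entry stays `kind: "lfs-pointer"`.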

### 2. Build `source-sample.json`

Shape (every key always present, arrays may be empty):

```jsonc
{
  "title":              "<from script.json or video-project.json>",
  "format":             "video",
  "ratio":              "<aspect_ratio from script.json — e.g. 9:16>",
  "formatProfile":      "podcast-skit-fabricated",     // enum, see below
  "media_url":          "file://<abs path to master mp4>",
  "thumbnail_url":      null,
  "brand":              "<derive from path: clients/<brand>/ad-runs/...>",
  "tags":               [],
  "recipe":             { "shots": [...], "total_duration_sec": <int> },
  "extracted_script":   "HER: …\nHIM: …\n…",
  "skills_used":        ["generate-voiceover", "..."],     // atoms only
  "skills_source":      "measured" | "derived-from-production-scripts" | "inferred-canonical" | "guessed",
  "how_to":             "<contents of HOW_TO_MAKE_THIS_VIDEO.md or null>",
  "production_scripts": [
    { "path": "working/render_vo.py",       "role": "voiceover" },
    { "path": "working/render_variants.py", "role": "stills"    },
    { "path": "working/render_clips.py",    "role": "lipsync"   },
    { "path": "working/stitch.py",          "role": "stitch"    },
    { "path": "working/build_end_card.py",  "role": "end_card"  }
  ],
  "remix_spec": {
    "version":    1,
    "skills":     [{"slug": "...", "provider": "...", "model": "..."}],
    "worlds":     [{"key": "...", "name": "...", "set": "...", "lighting": null, "color_grade": null, "reference_image_url": null, "catalog_id": null}],
    "characters": [
      {
        "key":              "her",
        "name":             "Brittney",
        "gender":           "f",
        "soul_id":          null,
        "anchor_asset_id":  "asset-char-her-base-01",
        "anchor_image_url": "file://...png",
        "method":           "anchor-ref",
        "description":      null,
        "catalog_id":       "brittney",
        "variant_assets": [
          { "file": "brittney-base.png",          "pose_tag": "base",                "kind": "real",         "size_bytes": 1842336 },
          { "file": "brittn