image-compose
The image-compose skill generates images via CLI commands for creative projects, supporting standard and professional tiers with customizable dimensions, aspect ratios, and metadata. Use it to create character sheets, locations, storyboards, and edited visuals by calling generate_image.js or generate_image_pro.js with prompts and optional references to existing canvas images.
git clone --depth 1 https://github.com/Utopai-Research/pai-pro /tmp/image-compose && cp -r /tmp/image-compose/skills/image-compose ~/.claude/skills/image-compose
## CLI shape
Most patterns below use the standard image tier:
```
node "$PAI_REPO_ROOT/server/cli/generate_image.js" --prompt "..." [--aspect-ratio 16:9] [--image-size 2K] [--label "..."] [--subtype <character|location|edit|reference|split|storyboard>] [--name "..."] [--role "..."] [--description "..."] [--source-node-id <id>] [--ref-source-id <id> ...]
```
Storyboard mosaics and video-bound character sheets use the pro image tier:
```
node "$PAI_REPO_ROOT/server/cli/generate_image_pro.js" --prompt "..." --size 2560x1440 [--label "..."] [--subtype <character|location|edit|reference|split|storyboard>] [--name "..."] [--role "..."] [--description "..."] [--source-node-id <id>] [--ref-source-id <id> ...]
```
Pro tier accepts `--size` only. Do not pass `--aspect-ratio` or `--image-size` to `generate_image_pro.js`. Common exact sizes: `1024x1024`, `1280x720`, `720x1280`, `1920x1920`, `2560x1440`, `1440x2560`, `3840x2160`, `2160x3840`.
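As a minimal sketch of the size rule above, the helper below checks a size against the common exact sizes listed and prints the pro-tier command for inspection instead of running it. The function names `is_common_pro_size` and `pro_command` are hypothetical, and the CLI may accept sizes beyond this list.

```shell
# Hypothetical helper: succeeds if the size is one of the common exact
# sizes listed above. The CLI itself may accept other sizes.
is_common_pro_size() {
  case "$1" in
    1024x1024|1280x720|720x1280|1920x1920|2560x1440|1440x2560|3840x2160|2160x3840)
      return 0 ;;
    *)
      return 1 ;;
  esac
}

# Hypothetical helper: print the pro-tier command (dry run, nothing is executed).
# Only --size is emitted; --aspect-ratio and --image-size are never passed.
pro_command() {
  if ! is_common_pro_size "$1"; then
    echo "warning: $1 is not in the list of common exact sizes" >&2
  fi
  printf 'node "$PAI_REPO_ROOT/server/cli/generate_image_pro.js" --prompt "..." --size %s\n' "$1"
}

pro_command 2560x1440
```

Passing an aspect ratio such as `16:9` to the check fails, which is a cheap way to catch a standard-tier flag value leaking into a pro-tier call.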
`$PAI_REPO_ROOT` is exported by the viewer — see the project `PROJECT_AGENT.md` § "Media CLIs (server/cli/)".
Calls go via `--stage` — see the project `PROJECT_AGENT.md` § "Draft gate" for draft and result handling.
`--label` defaults to the truncated prompt (≤30 chars) if omitted; pass an explicit one when you have a better caption.
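To preview roughly what an omitted `--label` might look like, the sketch below truncates a prompt to 30 bytes. This is only an approximation: the CLI's actual truncation rule (ellipsis, word boundaries, multi-byte characters) is not specified here, and `default_label` is a hypothetical name.

```shell
# Hypothetical approximation of the default label: the first 30 bytes
# of the prompt. The CLI's real rule may differ.
default_label() {
  printf '%.30s\n' "$1"
}

default_label "Rain-soaked causeway at dusk, sodium lamps reflecting on wet stone"
```

If the preview cuts off mid-phrase, that is a good sign an explicit `--label` would read better.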
When references are passed, refer to them in the prompt positionally as `@Image1`, `@Image2`, … in `--ref-source-id` order. The CLI emits one `derived` edge per `--ref-source-id`.
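The positional mapping can be sketched as follows. The node ids (`node_abc`, `node_def`) and both helper names are hypothetical; the point is only that the Nth `--ref-source-id` corresponds to `@ImageN` in the prompt.

```shell
# Hypothetical helper: show which @ImageN token each ref id maps to,
# in the order the ids will be passed.
ref_map() {
  i=1
  for id in "$@"; do
    printf '@Image%d -> %s\n' "$i" "$id"
    i=$((i + 1))
  done
}

# Hypothetical helper: emit one --ref-source-id flag per id, same order.
ref_flags() {
  for id in "$@"; do
    printf '%s %s\n' --ref-source-id "$id"
  done
}

# Placeholder ids; real ids come from the canvas.
ref_map node_abc node_def
ref_flags node_abc node_def
```

With these two refs, a prompt such as `"@Image1 standing at the entrance of @Image2"` would refer to `node_abc` first and `node_def` second.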
External URLs (a pasted CDN link, a moodboard image) must be mirrored onto the canvas first via `mirror_url.js --url <URL>` — the returned `node_id` then plugs into `--ref-source-id` like any other canvas source. There is no separate URL-passthrough flag.
If a canvas note authored this image (a shot note rendered as a still, a script note designing a character / location), pass `--source-node-id <note_id>` — see the project `PROJECT_AGENT.md` § "Asset, ref, and edge rules".
Do not attempt to invent images via ASCII art or markdown embedding — call the CLI.
## First-use image mode
For the ask-once flow and per-mode prices, see the project `PROJECT_AGENT.md` § "First-use generation choices".
Mode -> flag mapping: `Standard 2K` -> `generate_image.js --image-size 2K`; `Pro 2K` -> `generate_image_pro.js --size <2K exact>`; `Max quality` -> pro 4K exact size. Common pro sizes: 16:9 `2560x1440` / `3840x2160`, 9:16 `1440x2560` / `2160x3840`, 1:1 `1920x1920` / `2880x2880`.
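The mapping above can be sketched as a lookup. This assumes the first size of each pair listed is the 2K size and the second is the 4K size; the lowercase mode keys (`standard-2k`, `pro-2k`, `max`) and both function names are hypothetical.

```shell
# Hypothetical lookup: exact pro size for a tier (2k|4k) and aspect ratio.
pro_size() {
  case "$1/$2" in
    2k/16:9) echo 2560x1440 ;;
    2k/9:16) echo 1440x2560 ;;
    2k/1:1)  echo 1920x1920 ;;
    4k/16:9) echo 3840x2160 ;;
    4k/9:16) echo 2160x3840 ;;
    4k/1:1)  echo 2880x2880 ;;
    *)       return 1 ;;
  esac
}

# Hypothetical helper: script name and flags for a first-use mode.
image_mode_flags() {
  case "$1" in
    standard-2k) echo "generate_image.js --aspect-ratio $2 --image-size 2K" ;;
    pro-2k)      size=$(pro_size 2k "$2") && echo "generate_image_pro.js --size $size" ;;
    max)         size=$(pro_size 4k "$2") && echo "generate_image_pro.js --size $size" ;;
    *)           return 1 ;;
  esac
}

image_mode_flags pro-2k 16:9
```

An unknown mode or aspect ratio makes the helper fail rather than guess a size.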
## Patterns
Pick the one that fits. For source lookup, follow the project `PROJECT_AGENT.md` § "Choosing context"; this skill only owns image-specific prompt and CLI shape.
**Character-design pre-flight — ALWAYS run this check first when the user mentions characters.** The pivotal question is *will this character appear in downstream video work?* — anything the user calls a video, clip, promo, 宣传片, 短片, 连续剧, film, scene, 拍片, shot, or short film.
1. Read `./workflow.json` to see whether uploaded reference image nodes (`data.subtype = "reference"`, `data.metadata.source = "user_upload"`, not archived) exist for each character the user named.
2. If the character WILL appear in downstream video work (regardless of ref count) → **use Pattern 7 (4-panel character reference sheet)**, NOT Pattern 1. With ≥3 actor refs, Pattern 7 triangulates from the photos; with 0-2 refs, Pattern 7 still emits the 4-panel layout from a text description alone. Either way, the video model needs multi-view anchor data to keep the character recognizable across non-front shots.
3. Briefly announce the choice in chat before firing: *"Starting with a 4-panel reference sheet for [character] — locks identity across the video shots. Tell me if you want simple single portraits via Pattern 1 instead."* Give the user one beat to redirect.
4. If the character is one-off and will NOT feed video gen (a poster, print art, a standalone illustration), use Pattern 1.
This pre-flight is non-negotiable. Pattern 1's single front portrait gives the video model an anchor that's too narrow; identity drifts shot-to-shot. Skipping straight to Pattern 1 for video work is the single most common mistake.
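The decision in the pre-flight steps can be sketched as two small helpers. Both function names are hypothetical, and the ref count is assumed to have been read from `./workflow.json` beforehand.

```shell
# Hypothetical helper: pick the pattern number from the pivotal question.
# $1 = "yes" if the character will appear in downstream video work, else "no".
choose_character_pattern() {
  case "$1" in
    yes) echo 7 ;;   # 4-panel reference sheet, regardless of ref count
    no)  echo 1 ;;   # one-off static portrait
    *)   echo "usage: choose_character_pattern yes|no" >&2; return 2 ;;
  esac
}

# Hypothetical helper: how Pattern 7 sources identity,
# given the number of uploaded actor refs for the character.
pattern7_source() {
  if [ "$1" -ge 3 ]; then
    echo "triangulate from photos"
  else
    echo "text description"
  fi
}

choose_character_pattern yes   # prints 7
pattern7_source 2              # prints "text description"
```

Note that the ref count never changes which pattern is chosen for video-bound characters; it only changes what Pattern 7 works from.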
### 1. Character portrait (one-off static stills only)
Triggers: "design / create / introduce / cast a character / protagonist / antagonist / hero / villain / lead / portrait / headshot" **AND** the output is a one-off static still (poster, print art, single illustration) — NOT character work that will feed video gen. (Video-bound character work → Pattern 7; see the pre-flight above.)
- `node "$PAI_REPO_ROOT/server/cli/generate_image.js" --prompt "..." --aspect-ratio 9:16 --image-size 2K --subtype character --name "Detective Morris" --role "..." --description "..."` — **no refs**. A character is an identity anchor, not a derivative.
- Prompt template:
> `[style] character portrait of [NAME], [role]. [age, build, wardrobe, distinguishing features]. Front-facing medium close-up, eye-level, looking directly at camera, neutral expression. Plain neutral background, soft even lighting. No dramatic shadows, no stylized lighting, no side profile, no multiple views.`
- Inherit the project's style if one is already established on the canvas; otherwise default to realistic. Name the character if the user didn't ("Detective Morris", "The Prospector").
- No edges — characters are roots, so no `--ref-source-id`.
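Filling the prompt template above can be sketched like this. The helper name `character_prompt` is hypothetical, and the role and appearance details are illustrative values, not part of the skill.

```shell
# Hypothetical helper: fill the character portrait template.
# $1 = style, $2 = name, $3 = role, $4 = age, build, wardrobe, features
character_prompt() {
  printf '%s character portrait of %s, %s. %s. Front-facing medium close-up, eye-level, looking directly at camera, neutral expression. Plain neutral background, soft even lighting. No dramatic shadows, no stylized lighting, no side profile, no multiple views.\n' \
    "$1" "$2" "$3" "$4"
}

# Illustrative values only.
character_prompt realistic "Detective Morris" "a homicide detective" \
  "late 50s, heavyset, grey trench coat"
```

The resulting string goes to `--prompt`; no `--ref-source-id` is added, since characters are roots.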
### 2. Location establishing still
Triggers: "establish / design / picture [LOCATION]", or "yes" to a `script-compose` parse offer listing locations.
- `node "$PAI_REPO_ROOT/server/cli/generate_image.js" --prompt "..." --aspect-ratio 16:9 --image-size 2K --subtype location --name "Causeway" --description "..." [--source-node-id <script_or_shot_note_id>]` — **no refs**. A location is a setting anchor, not a derivative.
- Prompt template:
> `[style] establishing still of [LOCATION NAME]. [visual brief — architecture, lighting, atmosphere]. Wide shot, eye-level, no characters present.`
- Keep the frame empty of characters.
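The location flags and template above can be assembled as in the sketch below, which prints the pieces instead of invoking the CLI. Both helper names and the note id `note_123` are hypothetical; `--source-node-id` is emitted only when a script or shot note id is supplied.

```shell
# Hypothetical helper: fill the establishing-still template.
# $1 = style, $2 = location name, $3 = visual brief
location_prompt() {
  printf '%s establishing still of %s. %s. Wide shot, eye-level, no characters present.\n' \
    "$1" "$2" "$3"
}

# Hypothetical helper: print the location flags one per line.
# $1 = location name, $2 = optional authoring note id
location_flags() {
  printf '%s\n' --aspect-ratio 16:9 --image-size 2K --subtype location --name "$1"
  if [ -n "$2" ]; then
    printf '%s\n' --source-node-id "$2"
  fi
}

# Illustrative values only; note_123 is a placeholder id.
location_prompt realistic "Causeway" "weathered stone, low fog, cold dawn light"
location_flags "Causeway" note_123
```

No `--ref-source-id` appears, matching the rule that a location is a setting anchor rather than a derivative.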