Skill · 3.9k repo stars · updated yesterday

muapi-ai-fight-scene

This Claude Code skill generates dynamic fight scenes by composing a 16-cell storyboard image and converting it to video with an image-to-video model. Use it to produce high-cut-density action sequences in which visual momentum comes from rapid shot changes rather than from a single cinematic shot. The skill accepts character descriptions, environment details, action beats, and style parameters, then orchestrates multiple AI models to generate a character reference sheet, environment concept art, the storyboard, and finally a Seedance 2.0 video that follows the storyboard choreography.

Repository: Generative-Media-Skills

Install in Claude Code

mkdir -p ~/.claude/skills && git clone --depth 1 https://github.com/SamurAIGPT/Generative-Media-Skills /tmp/muapi-ai-fight-scene && cp -r /tmp/muapi-ai-fight-scene/library/motion/ai-fight-scene ~/.claude/skills/muapi-ai-fight-scene

Then start a new Claude Code session; the skill loads automatically.

Definition

SKILL.md

# AI Fight Scene Generator

**Generate a high-cut-density action / fight scene by first composing a 16-cell storyboard image, then driving Seedance 2.0 image-to-video off that storyboard.**

The core idea: **action tension comes from cut density, not single-shot quality.** Forcing the video model to follow a pre-drawn 4×4 storyboard grid gives you 16 distinct shots in a 15-second clip — landing punches, reverse angles, ECUs, whip-pans — that no t2v prompt could choreograph on its own.

## Inputs

| Name | Type | Required | Default | Description |
|:---|:---|:---|:---|:---|
| `character_description` | text | yes | — | Full physical description of the fighter(s). Asymmetric details (eye colour, scar side, holster on left hip) help the model preserve identity across panels. |
| `environment_description` | text | yes | — | The scene setting — e.g. "cyberpunk wet back-alley, neon kanji signage, Stray-game aesthetic, rain on chrome." |
| `action_script` | text | yes | — | The action beat — prose or numbered beats. E.g. "Hero is cornered → blocks first punch → counter-elbow → throw opponent into trash cans → finisher." |
| `style_direction` | text | no | cinematic action film, anamorphic lens, high contrast, motion blur on hits | Aesthetic / look tags applied to every frame. |
| `duration` | int | no | 15 | Final video length in seconds. At the default duration, the storyboard's 16 cells map to roughly one shot per second. |
| `aspect_ratio` | text | no | 16:9 | Output aspect — `16:9` cinematic, `9:16` vertical, `1:1` square. |
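
The table above can be mirrored as a small validated container. This is a minimal sketch: the `FightSceneInputs` class is hypothetical and not part of the skill, and treating the three listed aspect ratios as the only accepted values is an assumption.

```python
from dataclasses import dataclass

# Assumption: only the three ratios named in the Inputs table are accepted.
ALLOWED_ASPECT_RATIOS = {"16:9", "9:16", "1:1"}
DEFAULT_STYLE = (
    "cinematic action film, anamorphic lens, high contrast, motion blur on hits"
)


@dataclass(frozen=True)
class FightSceneInputs:
    """Hypothetical container mirroring the Inputs table (illustration only)."""

    character_description: str
    environment_description: str
    action_script: str
    style_direction: str = DEFAULT_STYLE
    duration: int = 15
    aspect_ratio: str = "16:9"

    def __post_init__(self) -> None:
        # The three required text fields must be non-empty.
        required = ("character_description", "environment_description", "action_script")
        for name in required:
            if not getattr(self, name).strip():
                raise ValueError(f"{name} is required")
        if self.duration <= 0:
            raise ValueError("duration must be a positive number of seconds")
        if self.aspect_ratio not in ALLOWED_ASPECT_RATIOS:
            raise ValueError(
                f"aspect_ratio must be one of {sorted(ALLOWED_ASPECT_RATIOS)}"
            )
```

Validating up front catches a missing description or a mistyped ratio before any image or video generation is attempted.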


## Steps

### Phase A — Character Sheet

Generate a clean turnaround-style character sheet using `muapi image generate` (model=`gpt-image-2-text-to-image`):

- Prompt: `Character reference sheet of {{character_description}}. Three views — front, 3/4, profile — on a neutral grey backdrop. Studio lighting, full body, no text overlays, photoreal. Asymmetric identifying details preserved on the correct side. {{style_direction}}.`
- Aspect ratio: `3:2`

Present the character sheet and confirm identity details look right before proceeding. **This image becomes reference #1 for later phases.**
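
The `{{name}}` placeholders in the prompts are filled from the inputs before each call. How the skill performs that substitution is not specified here; the sketch below shows one way to do it, with a hypothetical `render_prompt` helper that fails loudly when a placeholder has no value.

```python
import re

# Matches {{name}} with optional inner whitespace.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_prompt(template: str, values: dict) -> str:
    """Replace every {{name}} in the template with values[name]."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value supplied for placeholder {name!r}")
        return values[name]

    return PLACEHOLDER.sub(substitute, template)


template = (
    "Character reference sheet of {{character_description}}. "
    "Three views. {{style_direction}}."
)
prompt = render_prompt(
    template,
    {
        "character_description": "a boxer with a scar over the right eyebrow",
        "style_direction": "high contrast",
    },
)
```

Raising on a missing key avoids sending a prompt that still contains a literal `{{...}}` token to the image model.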

### Phase B — Environment Concept

Use `muapi image generate` (model=`nano-banana-2`) to design the scene/world:

- Prompt: `Wide establishing shot of {{environment_description}}. No characters in frame — environment only. Strong perspective lines, depth, atmospheric haze. {{style_direction}}. Production-design concept art.`
- Aspect ratio: `{{aspect_ratio}}`

Nano-Banana-2 is chosen here for its reasoning-driven composition — it's better than text-to-image-only models at producing locations with believable spatial logic (chokepoints, cover, sightlines) that an action scene can use. Present for approval. **This becomes reference #2.**

### Phase C — 16-Cell Storyboard

Compose the action onto a single 4×4 storyboard image using `muapi image edit` (model=`gpt-image-2-image-to-image`):

- Reference Images: the character sheet from Phase A **and** the environment plate from Phase B.
- Prompt:
  ```
  Compose a 4×4 storyboard grid (16 numbered cells) for the following action sequence:
  {{action_script}}

  CHARACTER (use reference image 1 identity throughout, asymmetric details preserved):
  {{character_description}}

  LOCATION (use reference image 2 spatial layout):
  {{environment_description}}

  Each cell labels: SHOT # (1–16) · SIZE (WIDE / MS / CU / ECU) · CAMERA-MOVE arrow (push, pull, whip, dolly, crash-zoom, handheld) · 1-word RHYTHM note (BEAT / IMPACT / RECOVERY / RESET).

  Vary shot size aggressively — never two WIDEs in a row. Land every IMPACT on a CU or ECU.
  Hand-drawn comic-book ink-and-wash style, monochrome with selective red accents on hits.
  Numbered cells, clear gutters between panels.

  Aesthetic: {{style_direction}}.
  ```
- Aspect ratio: `1:1` (square works best for a 4×4 grid)

Present the storyboard to the user. Confirm:
- The 16 shots read clearly
- Identity stays consistent cell-to-cell
- Cut density / shot-size variation looks aggressive enough

If a panel reads poorly, regenerate only the storyboard (Phases A and B stay as they are) with that cell's requirement called out explicitly in the prompt, for example "CELL 7 must be an ECU on the right fist".
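
The two composition rules in the storyboard prompt (never two WIDEs in a row, every IMPACT on a CU or ECU) can also be checked against a planned shot list before any image is generated. The sketch below assumes you draft the 16 shots as data first; the `Shot` type and `check_shot_list` function are hypothetical and not part of the skill.

```python
from dataclasses import dataclass

SIZES = {"WIDE", "MS", "CU", "ECU"}
RHYTHMS = {"BEAT", "IMPACT", "RECOVERY", "RESET"}


@dataclass(frozen=True)
class Shot:
    number: int       # 1-16, reading order
    size: str         # WIDE / MS / CU / ECU
    camera_move: str  # free text such as "push", "whip", "crash-zoom"
    rhythm: str       # BEAT / IMPACT / RECOVERY / RESET


def check_shot_list(shots: list) -> list:
    """Return a list of human-readable problems; empty means the plan passes."""
    problems = []
    if len(shots) != 16:
        problems.append(f"expected 16 shots, got {len(shots)}")
    for index, shot in enumerate(shots):
        if shot.number != index + 1:
            problems.append(f"position {index + 1}: shot is numbered {shot.number}")
        if shot.size not in SIZES:
            problems.append(f"cell {shot.number}: unknown size {shot.size!r}")
        if shot.rhythm not in RHYTHMS:
            problems.append(f"cell {shot.number}: unknown rhythm {shot.rhythm!r}")
        # Rule: land every IMPACT on a CU or ECU.
        if shot.rhythm == "IMPACT" and shot.size not in {"CU", "ECU"}:
            problems.append(f"cell {shot.number}: IMPACT must land on a CU or ECU")
        # Rule: never two WIDEs in a row.
        if index > 0 and shot.size == "WIDE" and shots[index - 1].size == "WIDE":
            problems.append(
                f"cells {shots[index - 1].number} and {shot.number}: two WIDEs in a row"
            )
    return problems
```

A plan that passes can be pasted into `action_script` as numbered beats, which gives the image model less room to break the rules on its own.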

### Phase D — Storyboard → Video (Seedance 2.0)

Hand the storyboard to `muapi video from-image` (model=`seedance-v2.0-i2v`):

- Reference Image: the 16-cell storyboard from Phase C.
- Prompt:
  ```
  Generate a {{duration}}-second action sequence that strictly follows the 16-cell storyboard reference image, cell-by-cell, top-left to bottom-right.

  - Honour each cell's labelled SHOT SIZE and CAMERA-MOVE — match cuts to the storyboard's rhythm notes.
  - Strong cinematic feel and shot language. Exaggerated dynamics. Hits land hard with motion blur and impact frames.
  - Camera language: anamorphic, handheld where the storyboard calls for it, locked-off where it doesn't.
  - Native audio: impact sfx on every IMPACT cell, footsteps, fabric/Foley, restrained low score under the action.

  Action being rendered: {{action_script}}.
  Aesthetic: {{style_direction}}.
  ```
- Duration: `{{duration}}` (default 15)
- Aspect ratio: `{{aspect_ratio}}`

After generation, present the final video. If the cut density feels too low or the shots don't match the storyboard, rerun Phase D first, with the prompt stressing "strict cell-by-cell adherence" more aggressively. That is cheaper than rebuilding the storyboard.
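
When reviewing the result against the storyboard, it helps to know roughly where each cell should fall in the clip. The sketch below maps cells to grid positions in reading order (top-left to bottom-right) and to time windows. The even split of the duration across cells is an assumption for review purposes only; the video model is not guaranteed to cut on these boundaries, and `cell_schedule` is a hypothetical helper.

```python
def cell_schedule(duration: float = 15, rows: int = 4, cols: int = 4) -> list:
    """Map each storyboard cell to its grid position and an even time window."""
    cells = rows * cols
    per_cell = duration / cells  # 15 s over 16 cells is 0.9375 s per shot
    schedule = []
    for index in range(cells):
        # Reading order: fill each row left to right, then move down a row.
        row, col = divmod(index, cols)
        schedule.append(
            {
                "cell": index + 1,
                "row": row + 1,
                "col": col + 1,
                "start": index * per_cell,
                "end": (index + 1) * per_cell,
            }
        )
    return schedule


for entry in cell_schedule():
    print(
        f"cell {entry['cell']:>2} (row {entry['row']}, col {entry['col']}): "
        f"{entry['start']:.2f}s to {entry['end']:.2f}s"
    )
```

With the defaults, cell 5 (row 2, column 1) is expected around 3.75 s, so a missing or merged shot is easy to locate when scrubbing through the clip.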

## Notes

- **Why the storyboard image and not a text storyboard?** Seedance 2.0 i2v anchors its motion plan to the visual reference. A grid of 16 drawn cells gives it 16 visual targets to hit — text descriptions of shots get averaged into mush.
- **Asymmetric character details matter.** Without something like "scar over the right eyebrow" or "leather glove on the left hand only", identity drift between cells is the #1 failure mode.
- **Use `seedance-2.0-i2v-480p` to draft.** Cheaper preview pass before committing to the full-res `seedance-v2.0-i2v` run.
- **For longer fights**, chain two runs: first run uses