Skill · 3.9k repo stars · updated 2d ago

muapi-nano-banana

This Claude Code skill transforms raw user requests into structured creative briefs for image generation via muapi.ai, applying reasoning-driven prompting based on Google's Gemini 3 architecture. It's designed for AI agents needing to generate high-fidelity images through logic-based prompts that emphasize physical consistency, spatial relationships, and precise specifications rather than keyword-heavy descriptions. Use this when image generation requires detailed control over composition, lighting, text integration, and photorealistic output quality.

Repository: Generative-Media-Skills

Install in Claude Code

git clone --depth 1 https://github.com/SamurAIGPT/Generative-Media-Skills /tmp/muapi-nano-banana && cp -r /tmp/muapi-nano-banana/library/visual/nano-banana ~/.claude/skills/muapi-nano-banana

Then open a new Claude Code session; the skill loads automatically.

Definition

SKILL.md

# 🍌 Nano-Banana Expert Skill (Gemini 3 Style)

**A specialized skill for AI Agents to leverage "Reasoning-Driven" image generation.**
Based on the advanced prompting architecture of Google's Gemini 3 (Nano Banana Pro), this skill moves beyond keyword stuffing to structured, logic-based creative briefs.

## Core Competencies

1. **Reasoning-Driven Prompting**: Using natural language logic to define physics, lighting, and spatial relationships.
2. **Structured Creative Briefs**: Implementing the "Perfect Prompt" formula: `Subject + Action + Context + Composition + Lighting + Style`.
3. **Text Rendering Precision**: Explicitly defining typography and signifiers for legible text integration.
4. **Contextual Grounding**: Using "Search Grounding" logic (simulated) to anchor generations in real-world accuracy.

---

## 🏗️ Technical Specification

### 1. The "Perfect Prompt" Formula

| Component | Description | Example |
| :--- | :--- | :--- |
| **Subject** | Detailed entity description | "A stoic robot barista with exposed copper wiring" |
| **Action** | Dynamic interaction | "Pouring a latte art leaf with mechanical precision" |
| **Context** | Environment & Atmosphere | "Inside a neon-lit cyberpunk cafe at midnight" |
| **Composition** | Camera & Lens choice | "Close-up, 85mm lens, f/1.8 aperture" |
| **Lighting** | Mood & Direction | "Volumetric blue rim light, warm cafe glow" |
| **Style** | Aesthetic anchor | "Cinematic, photorealistic, 4K production value" |
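
The six components in the table can be joined into a single sentence-based brief. The following is a minimal sketch under stated assumptions, not the skill's actual script: the function name `compose_brief` and the sentence template are hypothetical and chosen only for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical helper: joins the six table components into full sentences.
# The name compose_brief and the wording of the template are assumptions.
compose_brief() {
  local subject="$1" action="$2" context="$3"
  local composition="$4" lighting="$5" style="$6"
  printf '%s is %s %s. ' "$subject" "$action" "$context"
  printf 'The shot is framed as %s. ' "$composition"
  printf 'The scene is lit by %s. ' "$lighting"
  printf 'The overall style is %s.\n' "$style"
}

# Example call using values adapted from the table above.
compose_brief \
  "A stoic robot barista with exposed copper wiring" \
  "pouring a latte art leaf with mechanical precision" \
  "inside a neon-lit cyberpunk cafe at midnight" \
  "a close-up with an 85mm lens at f/1.8" \
  "a volumetric blue rim light and a warm cafe glow" \
  "cinematic and photorealistic"
```

Writing each component as a clause of a sentence, rather than a comma-separated tag, keeps the relationships between subject, camera, and light explicit.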

### 2. Advanced Features
- **Negative Constraint Logic**: Instead of "no blurry," use "Ensure sharp focus on the subject's eyes."
- **Identity Consistency**: (Simulated) "Maintain consistent facial structure across variations."
- **Text Integration**: Use double quotes for specific text: `The sign reads "OPEN 24/7"`.

---

## 🧠 Prompt Optimization Protocol (Agent Instruction)

**Before calling the script, the Agent MUST rewrite the user's prompt into a logic-driven Reasoning Brief:**

1. **NO KEYWORD SOUP**: Remove "8k, masterpiece, ultra-detailed." Use full, descriptive sentences.
2. **PHYSICAL CONSISTENCY**: Describe how elements interact (e.g., "The light from the crystal shards casts caustic patterns across the obsidian floor").
3. **TEXT PRECISION**: If the user wants text, define it precisely: `featuring a sign that says "STORE NAME" in a weathered serif font`.
4. **OPTICAL DIRECTIVES**: Specify lens behavior: *Shallow Depth of Field (f/1.8)*, *Macro Lens*, *Anamorphic Flare*.
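
Step 1 of the protocol above can be partly mechanized. The sketch below only strips filler tags from a raw prompt; rewriting the remainder into descriptive sentences is still left to the agent. The function name `strip_keyword_soup` and the tag list are assumptions drawn from the examples in this document, and matching is case-sensitive for simplicity.

```bash
#!/usr/bin/env bash
# Hypothetical pre-processing step: removes known filler tags from a prompt.
# The tag list is an assumption based on the examples in this document.
strip_keyword_soup() {
  local prompt="$1" tag
  for tag in "trending on artstation" "ultra-detailed" "masterpiece" "8k"; do
    # Quoting "$tag" makes the pattern literal; every occurrence is removed.
    prompt="${prompt//"$tag"/}"
  done
  # Tidy the separators left behind: merge runs of commas, trim leading and
  # trailing commas or spaces, and squeeze repeated whitespace.
  printf '%s\n' "$prompt" | sed -E \
    -e 's/[[:space:]]*,([[:space:]]*,)+/,/g' \
    -e 's/^[[:space:],]+//' \
    -e 's/[[:space:],]+$//' \
    -e 's/[[:space:]]+/ /g'
}

strip_keyword_soup "a red fox in snow, 8k, masterpiece, ultra-detailed"
```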

---

## 🚀 Protocol: Using Nano-Banana

### Step 1: Define the Creative Logic
Provide the agent with a subject and a specific scenario.

### Step 2: Invoke the Script
The `generate-nano-art.sh` script translates the logic into a structured Gemini 3-style prompt.

```bash
# Generating a reasoning-driven image
bash scripts/generate-nano-art.sh \
  --subject "a glass chess piece" \
  --action "shattering into liquid shards" \
  --context "on an obsidian table" \
  --style "macro photography"
```

---

## ⚠️ Constraints & Guardrails

- **No Keyword Soup**: **MANDATORY** - Do not use "trending on artstation, masterpiece, 8k". Use natural language descriptions.
- **Physics Logic**: Ensure the prompt describes *physically possible* lighting and reflection interactions.
- **Full Sentences**: The model parses relationships; use "light reflecting off the water" instead of "water, reflection".
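
The mandatory "No Keyword Soup" rule could be enforced as a pre-flight check that rejects a brief instead of silently editing it. This is a sketch only: `check_brief` is a hypothetical name, the banned list is taken from the examples in this document, and the substring match is deliberately simple (it would also flag "8k" inside a longer token).

```bash
#!/usr/bin/env bash
# Hypothetical guardrail: fails when a brief still contains filler tags.
# The banned list is an assumption based on the examples in this document.
check_brief() {
  local brief="$1" tag
  for tag in "trending on artstation" "masterpiece" "8k"; do
    # -i: case-insensitive, -F: fixed string, -q: quiet (exit status only).
    if printf '%s\n' "$brief" | grep -qiF -- "$tag"; then
      echo "keyword soup detected: $tag" >&2
      return 1
    fi
  done
  return 0
}

if check_brief "Light reflecting off the water at dawn."; then
  echo "brief accepted"
fi
```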

---

## ⚙️ Implementation Details
This skill applies a "Logic Wrapper" around the `core/media/generate-image.sh` primitive, converting fragmented inputs into a coherent, reasoning-ready narrative prompt.
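
To make the "Logic Wrapper" idea concrete, here is a minimal sketch of how fragmented flags might be merged into one narrative prompt. It is not the repository's `generate-nano-art.sh`: the function name `nano_wrapper`, the joining template, and the `--prompt` flag passed to `core/media/generate-image.sh` are all assumptions, and the sketch prints the command as a dry run instead of executing the primitive.

```bash
#!/usr/bin/env bash
# Sketch of a "logic wrapper"; NOT the repository's generate-nano-art.sh.
# Assumption: the primitive takes the prompt via a --prompt flag (hypothetical).
nano_wrapper() {
  local subject="" action="" context="" style="" key value
  while [ "$#" -gt 0 ]; do
    key="$1"
    # Every option takes a value; bail out rather than loop on a dangling flag.
    [ "$#" -ge 2 ] || { echo "missing value for $key" >&2; return 2; }
    value="$2"
    case "$key" in
      --subject) subject="$value" ;;
      --action)  action="$value" ;;
      --context) context="$value" ;;
      --style)   style="$value" ;;
      *) echo "unknown option: $key" >&2; return 2 ;;
    esac
    shift 2
  done
  if [ -z "$subject" ]; then echo "--subject is required" >&2; return 2; fi

  # Join the fragments into a single sentence, skipping any empty parts.
  local prompt="$subject"
  if [ -n "$action" ];  then prompt="$prompt $action"; fi
  if [ -n "$context" ]; then prompt="$prompt $context"; fi
  if [ -n "$style" ];   then prompt="$prompt, rendered as $style"; fi
  prompt="$prompt."

  # Dry run: print the command instead of calling the primitive.
  printf 'bash core/media/generate-image.sh --prompt "%s"\n' "$prompt"
}

nano_wrapper \
  --subject "a glass chess piece" \
  --action "shattering into liquid shards" \
  --context "on an obsidian table" \
  --style "macro photography"
```

Printing the command first makes it easy to inspect the assembled prompt before any generation request is sent.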

From the same repository

muapi-media-editing (Skill)

Edit and enhance images and videos with AI via muapi.ai — prompt-based editing, upscaling, background removal, face swap, lipsync, video effects, and more

muapi-media-generation (Skill)

Generate AI images, videos, music, and audio from the terminal via muapi.ai — supports 100+ models including Flux, Midjourney v7, Kling 3.0, Veo3, and Suno V5

muapi-platform (Skill)

Setup and utility scripts for muapi.ai — configure API keys, test connectivity, and poll for async generation results

muapi-ai-clipping (Skill)

Turn a long video into N viral-ready short clips with a single managed API call. Wraps muapi.ai's `/ai-clipping` endpoint, which handles transcription, highlight ranking through a virality framework (hook / emotional peak / opinion bomb / revelation / conflict / quotable / story peak / practical value), overlap dedupe, and vertical face-tracking auto-crop server-side. No local Whisper, no local LLM, no GPU.

muapi-3d-logo-animation (Skill)

Transform a 2D logo into a premium 3D version and animate it with professional cinematic effects.

muapi-ai-fight-scene (Skill)

Generate a high-cut-density action / fight scene by first composing a 16-cell storyboard image, then driving Seedance 2.0 image-to-video off that storyboard. Stacks GPT-Image-2 (character sheet + storyboard), Nano-Banana-2 (environment concept), and Seedance 2.0 i2v.

muapi-animal-video-generator (Skill)

Create a hilarious and ultra-realistic video of an anthropomorphic animal acting like a human vlogger in a real-world setting.

muapi-award-ceremony-video (Skill)

Generate a 15-second cinematic awards-ceremony video — a host announces a winner from the stage, a spotlight finds them in the crowd, they walk up to the podium, receive the award, and the LED display reveals their name and "THE BEST ACTOR".