fal-ai-media
The fal-ai-media Claude Code skill enables AI-powered generation of images, videos, and audio files through fal.ai's models via MCP integration. It provides access to specialized generators including text-to-image (Nano Banana), text/image-to-video (Seedance, Kling, Veo 3), text-to-speech (CSM-1B), and video-to-audio (ThinkSound). Activate this skill when users request image creation, video generation, speech synthesis, or other media production tasks.
git clone --depth 1 https://github.com/affaan-m/ECC /tmp/fal-ai-media && cp -r /tmp/fal-ai-media/.agents/skills/fal-ai-media ~/.claude/skills/fal-ai-mediaSKILL.md
# fal.ai Media Generation
Generate images, videos, and audio using fal.ai models via MCP.
## When to Activate
- User wants to generate images from text prompts
- Creating videos from text or images
- Generating speech, music, or sound effects
- Any media generation task
- User says "generate image", "create video", "text to speech", "make a thumbnail", or similar
## MCP Requirement
fal.ai MCP server must be configured. Add to `~/.claude.json`:
```json
"fal-ai": {
"command": "npx",
"args": ["-y", "fal-ai-mcp-server"],
"env": { "FAL_KEY": "YOUR_FAL_KEY_HERE" }
}
```
Get an API key at [fal.ai](https://fal.ai).
## MCP Tools
The fal.ai MCP provides these tools:
- `search` — Find available models by keyword
- `find` — Get model details and parameters
- `generate` — Run a model with parameters
- `result` — Check async generation status
- `status` — Check job status
- `cancel` — Cancel a running job
- `estimate_cost` — Estimate generation cost
- `models` — List popular models
- `upload` — Upload files for use as inputs
---
## Image Generation
### Nano Banana 2 (Fast)
Best for: quick iterations, drafts, text-to-image, image editing.
```
generate(
model_name: "fal-ai/nano-banana-2",
input: {
"prompt": "a futuristic cityscape at sunset, cyberpunk style",
"image_size": "landscape_16_9",
"num_images": 1,
"seed": 42
}
)
```
### Nano Banana Pro (High Fidelity)
Best for: production images, realism, typography, detailed prompts.
```
generate(
model_name: "fal-ai/nano-banana-pro",
input: {
"prompt": "professional product photo of wireless headphones on marble surface, studio lighting",
"image_size": "square",
"num_images": 1,
"guidance_scale": 7.5
}
)
```
### Common Image Parameters
| Param | Type | Options | Notes |
|-------|------|---------|-------|
| `prompt` | string | required | Describe what you want |
| `image_size` | string | `square`, `portrait_4_3`, `landscape_16_9`, `portrait_16_9`, `landscape_4_3` | Aspect ratio |
| `num_images` | number | 1-4 | How many to generate |
| `seed` | number | any integer | Reproducibility |
| `guidance_scale` | number | 1-20 | How closely to follow the prompt (higher = more literal) |
### Image Editing
Use Nano Banana 2 with an input image for inpainting, outpainting, or style transfer:
```
# First upload the source image
upload(file_path: "/path/to/image.png")
# Then generate with image input
generate(
model_name: "fal-ai/nano-banana-2",
input: {
"prompt": "same scene but in watercolor style",
"image_url": "<uploaded_url>",
"image_size": "landscape_16_9"
}
)
```
---
## Video Generation
### Seedance 1.0 Pro (ByteDance)
Best for: text-to-video, image-to-video with high motion quality.
```
generate(
model_name: "fal-ai/seedance-1-0-pro",
input: {
"prompt": "a drone flyover of a mountain lake at golden hour, cinematic",
"duration": "5s",
"aspect_ratio": "16:9",
"seed": 42
}
)
```
### Kling Video v3 Pro
Best for: text/image-to-video with native audio generation.
```
generate(
model_name: "fal-ai/kling-video/v3/pro",
input: {
"prompt": "ocean waves crashing on a rocky coast, dramatic clouds",
"duration": "5s",
"aspect_ratio": "16:9"
}
)
```
### Veo 3 (Google DeepMind)
Best for: video with generated sound, high visual quality.
```
generate(
model_name: "fal-ai/veo-3",
input: {
"prompt": "a bustling Tokyo street market at night, neon signs, crowd noise",
"aspect_ratio": "16:9"
}
)
```
### Image-to-Video
Start from an existing image:
```
generate(
model_name: "fal-ai/seedance-1-0-pro",
input: {
"prompt": "camera slowly zooms out, gentle wind moves the trees",
"image_url": "<uploaded_image_url>",
"duration": "5s"
}
)
```
### Video Parameters
| Param | Type | Options | Notes |
|-------|------|---------|-------|
| `prompt` | string | required | Describe the video |
| `duration` | string | `"5s"`, `"10s"` | Video length |
| `aspect_ratio` | string | `"16:9"`, `"9:16"`, `"1:1"` | Frame ratio |
| `seed` | number | any integer | Reproducibility |
| `image_url` | string | URL | Source image for image-to-video |
---
## Audio Generation
### CSM-1B (Conversational Speech)
Text-to-speech with natural, conversational quality.
```
generate(
model_name: "fal-ai/csm-1b",
input: {
"text": "Hello, welcome to the demo. Let me show you how this works.",
"speaker_id": 0
}
)
```
### ThinkSound (Video-to-Audio)
Generate matching audio from video content.
```
generate(
model_name: "fal-ai/thinksound",
input: {
"video_url": "<video_url>",
"prompt": "ambient forest sounds with birds chirping"
}
)
```
### ElevenLabs (via API, no MCP)
For professional voice synthesis, use ElevenLabs directly:
```python
import os
import requests
resp = requests.post(
"https://api.elevenlabs.io/v1/text-to-speech/<voice_id>",
headers={
"xi-api-key": os.environ["ELEVENLABS_API_KEY"],
"Content-Type": "application/json"
},
json={
"text": "Your text here",
"model_id": "eleven_turbo_v2_5",
"voice_settings": {"stability": 0.5, "similarity_boost": 0.75}
}
)
with open("output.mp3", "wb") as f:
f.write(resp.content)
```
### VideoDB Generative Audio
If VideoDB is configured, use its generative audio:
```python
# Voice generation
audio = coll.generate_voice(text="Your narration here", voice="alloy")
# Music generation
music = coll.generate_music(prompt="upbeat electronic background music", duration=30)
# Sound effects
sfx = coll.generate_sound_effect(prompt="thunder crack followed by rain")
```
---
## Cost Estimation
Before generating, check estimated cost:
```
estimate_cost(model_name: "fal-ai/nano-banana-pro", input: {...})
```
## Model Discovery
Find models for specific tasks:
```
search(query: "text to video")
find(model_name: "fal-ai/seedance-1-0-pro")
models()
```
## Tips
- Use `seed` for reproducible results when iterating on prompts
- StartStructured self-debugging workflow for AI agent failures using capture, diagnosis, contained recovery, and introspection reports.
Build an evidence-backed ECC install plan for a specific repo by sorting skills, commands, rules, hooks, and extras into DAILY vs LIBRARY buckets using parallel repo-aware review passes. Use when ECC should be trimmed to what a project actually needs instead of loading the full bundle.
>
Write articles, guides, blog posts, tutorials, newsletter issues, and other long-form content in a distinctive voice derived from supplied examples or brand guidance. Use when the user wants polished written content longer than a paragraph, especially when voice consistency, structure, and credibility matter.
>
Build a source-derived writing style profile from real posts, essays, launch notes, docs, or site copy, then reuse that profile across content, outreach, and social workflows. Use when the user wants voice consistency without generic AI writing tropes.
Bun as runtime, package manager, bundler, and test runner. When to choose Bun vs Node, migration notes, and Vercel support.
>