Multi-modal Generative Media Skills for AI Agents (Claude Code, Cursor, Gemini CLI). High-quality image, video, and audio generation powered by muapi.ai.
Generative Media Skills is a schema-driven collection of shell scripts, `SKILL.md` instruction files, and an MCP server that lets AI agents running inside Claude Code, Claude Desktop, or Cursor generate and edit images, videos, and audio through the `muapi-cli` npm package, which proxies requests to over 100 hosted models including Midjourney v7, Flux Kontext, Kling 3.0, Seedance 2.0, and Veo3. The repository has two parts: core primitives covering file upload, prompt-based image editing, and auth polling; and an expert library of domain-specific skills such as Cinema Director, UI Designer, Logo Creator, and AI Clipping, which converts long videos into ranked vertical short clips with server-side transcription and face-tracked auto-crop. Running `muapi mcp serve` exposes all 19 tools directly to any MCP-compatible agent. Forty-one named workflow recipes, each a `SKILL.md` the agent reads and executes, cover end-to-end pipelines such as product photo to cinematic ad. The primary audience is developers and creative professionals building multimodal agentic workflows who want structured, LLM-readable tooling without managing raw API calls or local media processing.
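The overview lists polling among the core primitives, and the utility scripts are described as polling for async generation results. A minimal Python sketch of that pattern follows. It is not the repository's code: `fetch_status` is a hypothetical callable standing in for whatever call returns the job state, and the status strings `"completed"` and `"failed"` are assumptions chosen for illustration.

```python
import time
from typing import Callable, Dict


def poll_until_done(
    fetch_status: Callable[[], Dict],      # hypothetical: returns the job state as a dict
    interval: float = 2.0,                 # seconds to wait between checks
    timeout: float = 300.0,                # give up after this many seconds
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict:
    """Call fetch_status repeatedly until the job finishes, fails, or times out."""
    deadline = clock() + timeout
    while True:
        result = fetch_status()
        status = result.get("status")      # assumed field name
        if status == "completed":          # assumed terminal success value
            return result
        if status == "failed":             # assumed terminal failure value
            raise RuntimeError(f"generation failed: {result}")
        if clock() >= deadline:
            raise TimeoutError("generation did not finish before the timeout")
        sleep(interval)


if __name__ == "__main__":
    # Simulated job that finishes on the third check.
    states = iter([
        {"status": "processing"},
        {"status": "processing"},
        {"status": "completed", "url": "https://example.com/result.mp4"},
    ])
    print(poll_until_done(lambda: next(states), interval=0.0))
```

Injecting `sleep` and `clock` keeps the loop testable without real delays; a real integration would replace the simulated iterator with the actual status call.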
- ✓ Open-source license (MIT)
- ✓ Actively maintained (<30d)
- ✓ Healthy fork ratio
- ✓ Clear description
- ✓ Topics declared
- ✓ Mature repo (>1y old)
git clone https://github.com/SamurAIGPT/Generative-Media-Skills ~/.claude/skills/generative-media-skills
24 items in this repository
Edit and enhance images and videos with AI via muapi.ai — prompt-based editing, upscaling, background removal, face swap, lipsync, video effects, and more
Generate AI images, videos, music, and audio from the terminal via muapi.ai — supports 100+ models including Flux, Midjourney v7, Kling 3.0, Veo3, and Suno V5
Setup and utility scripts for muapi.ai — configure API keys, test connectivity, and poll for async generation results
Turn a long video into N viral-ready short clips with a single managed API call. Wraps muapi.ai's `/ai-clipping` endpoint, which handles transcription, highlight ranking through a virality framework (hook / emotional peak / opinion bomb / revelation / conflict / quotable / story peak / practical value), overlap dedupe, and vertical face-tracking auto-crop server-side. No local Whisper, no local LLM, no GPU.
Transform a 2D logo into a premium 3D version and animate it with professional cinematic effects.
Generate a high-cut-density action / fight scene by first composing a 16-cell storyboard image, then driving Seedance 2.0 image-to-video off that storyboard. Stacks GPT-Image-2 (character sheet + storyboard), Nano-Banana-2 (environment concept), and Seedance 2.0 i2v.
Create a hilarious and ultra-realistic video of an anthropomorphic animal acting like a human vlogger in a real-world setting.
Generate a 15-second cinematic awards-ceremony video — a host announces a winner from the stage, a spotlight finds them in the crowd, they walk up to the podium, receive the award, and the LED display reveals their name and "THE BEST ACTOR".
Convert a photo of a person into a Pixar-style 3D cartoon character, then animate it using a reference dance or motion video.
Create a multi-part animated story video by first establishing a consistent character and then generating sequential scenes and animating them.
Direct high-fidelity cinematic video with AI — translates creative intent into technical cinematographic directives for Veo3, Kling, and Luma video models via muapi.ai
Generate aerial drone-perspective footage — sweeping bird's-eye views, orbit shots, and flyover sequences for landscapes, architecture, and events.
Generate a cinematic "freeze effect" video where time stops mid-scene, the subject walks through the frozen world, then time resumes with a snap.
Create a dramatic "Giant Product" visual where a regular item is showcased as a massive, building-sized object next to a person, then optionally animate the scene.
Create a luxury jewelry advertisement with high-end commercial cinematography and detailed macro animation.
Build a short music video from a song theme — N keyframes, animate each, generate matching music.
Generate a single continuous cinematic shot video — no cuts, one seamless flowing scene with dramatic lighting and motion.
Create a cinematic 5–10 second product ad from a product photo and a brand brief.
Create a dynamic product showcase with explosive ingredient arrangements, followed by a realistic motion animation.
Create a high-end cinematic product video advertisement starting from a simple product photo.
Expert Cinema Director skill for Seedance 2.0 (ByteDance) — high-fidelity video generation across Chinese, Global, and VIP tiers. Supports text-to-video, image-to-video, first-last-frame, omni reference, character training, omni-reference training, video editing, and watermark removal.
Turn a single photo of a person into a 15-second cinematic pasta-making (or other cuisine) tutorial video. First builds a composite reference sheet (character + kitchen + 9-step action board), then animates the full cooking sequence with audio in a single continuous shot.
Create a viral-style video of a talking baby with custom costumes and scripts.
Generate UGC-style (User Generated Content) lifestyle photos of a person wearing or using your product — authentic, relatable, social-media-native imagery.
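The AI Clipping item above says the managed endpoint ranks highlights and dedupes overlapping clips server-side. How the service does this is not documented here, so the following Python sketch only illustrates the general idea with a common greedy approach. The `Clip` type, the `score` field, and the `max_overlap` threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Clip:
    start: float   # seconds from the start of the source video
    end: float     # seconds from the start of the source video
    score: float   # hypothetical highlight score, higher is better


def overlap_ratio(a: Clip, b: Clip) -> float:
    """Shared duration divided by the shorter clip's duration (0.0 if disjoint)."""
    shared = min(a.end, b.end) - max(a.start, b.start)
    if shared <= 0:
        return 0.0
    return shared / min(a.end - a.start, b.end - b.start)


def pick_top_clips(candidates: List[Clip], n: int, max_overlap: float = 0.5) -> List[Clip]:
    """Greedy selection: best score first, skipping clips that overlap a kept one."""
    chosen: List[Clip] = []
    for clip in sorted(candidates, key=lambda c: c.score, reverse=True):
        if len(chosen) == n:
            break
        if all(overlap_ratio(clip, kept) <= max_overlap for kept in chosen):
            chosen.append(clip)
    return chosen


if __name__ == "__main__":
    candidates = [
        Clip(10.0, 40.0, 0.91),
        Clip(20.0, 50.0, 0.88),    # shares 20 of 30 seconds with the first clip
        Clip(120.0, 150.0, 0.75),
        Clip(300.0, 330.0, 0.60),
    ]
    for clip in pick_top_clips(candidates, n=2):
        print(clip)
```

With these sample values the second candidate is dropped because it overlaps the top-ranked clip by more than half, so the two clips kept are the ones starting at 10 s and 120 s.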
What people ask about Generative-Media-Skills
What is SamurAIGPT/Generative-Media-Skills?
SamurAIGPT/Generative-Media-Skills is a collection of skills for the Claude AI ecosystem: multi-modal generative media skills for AI agents (Claude Code, Cursor, Gemini CLI) that provide high-quality image, video, and audio generation powered by muapi.ai. It has 3.5k GitHub stars and was last updated today.
How do I install Generative-Media-Skills?
You can install Generative-Media-Skills by cloning the repository (https://github.com/SamurAIGPT/Generative-Media-Skills) or following the README instructions on GitHub. ClaudeWave also provides quick install blocks on this page.
Is SamurAIGPT/Generative-Media-Skills safe to use?
Our security agent has analyzed SamurAIGPT/Generative-Media-Skills and assigned a Trust Score of 100/100 (tier: Verified). See the full breakdown of passed checks and flags on this page.
Who maintains SamurAIGPT/Generative-Media-Skills?
SamurAIGPT/Generative-Media-Skills is maintained by SamurAIGPT. The last recorded GitHub activity is from today, with 4 open issues.
Are there alternatives to Generative-Media-Skills?
Yes. On ClaudeWave you can browse similar skills at /categories/skills, sorted by popularity or recent activity.