music-and-singing-studio
Music and Singing Studio generates instrumental background music, jingles, and sung vocal tracks using OpenSquilla audio tools. Use this skill when a user requests BGM, background beds, original music composition, sung songs with lyrics, or playable audio artifacts in any language.
git clone --depth 1 https://github.com/opensquilla/opensquilla /tmp/music-and-singing-studio && cp -r /tmp/music-and-singing-studio/src/opensquilla/skills/bundled/music-and-singing-studio ~/.claude/skills/music-and-singing-studio

SKILL.md
# music-and-singing-studio

Generates instrumental music or songs with sung vocals. OpenRouter can help draft original lyrics, structure prompts, or translate style notes, but audio generation must use `music_generate` or `song_generate`.

## Request triage

Before calling tools, extract these fields from the user request:

- task type: instrumental BGM, jingle, short demo, loop, full song, or sung lyrics
- whether lyrics are user-provided, newly original, or a prohibited copyrighted cover request
- target language, target locale/accent, vocal traits, backing style, mood, tempo, and desired duration
- quota posture: short demo first, full generation, or user-specified duration
- output expectation: one playable take, multiple variants, or background bed

OpenRouter can draft original lyrics and prompts, but it is not an audio provider. Do not imply OpenRouter created the audio.

## Choose the tool

- Use `music_generate` for BGM, loopable beds, intro/outro, ads, transitions, and instrumental moods.
- Use `song_generate` when the user provides lyrics or asks for singing, vocals, chorus, verse, jingle with words, or 唱歌.

## Required workflow

1. Confirm whether the user wants instrumental music or sung vocals.
2. Create only original lyrics unless the user provides rights to existing lyrics.
3. Avoid "in the style of" living artists, bands, copyrighted songs, game themes, film scores, or franchise music.
4. Call `audio_provider_capabilities` if music/singing availability is uncertain.
5. For `music_generate`, pass a concise prompt, optional style, duration, and output path.
6. For `song_generate`, pass original `lyrics`, vocal style, backing style, duration, and output path.
7. Return the result as a playable audio artifact.

## Preview-first

For unspecified song length, generate a short demo first instead of a full song. Use 8-15 seconds for sung demos and 10-20 seconds for instrumental BGM unless the user explicitly asks for a longer duration.
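The tool choice and preview defaults described above can be sketched in a few lines of Python. This is an illustration only: `choose_tool`, `preview_duration`, `VOCAL_CUES`, and `PREVIEW_RANGE` are hypothetical names, not part of OpenSquilla. The keyword matching is deliberately naive (it would misread "no vocals", for example), and defaulting to the lower bound of the preview window is an assumption made here to conserve quota.

```python
from typing import Optional

# Hypothetical cue list, loosely based on the triggers named in "Choose the tool".
VOCAL_CUES = ("lyrics", "sing", "vocal", "chorus", "verse", "唱歌")

# Preview windows in seconds, taken from the Preview-first guidance.
PREVIEW_RANGE = {"song_generate": (8, 15), "music_generate": (10, 20)}


def choose_tool(request: str, has_lyrics: bool = False) -> str:
    """Return 'song_generate' for sung requests, 'music_generate' otherwise."""
    text = request.lower()
    if has_lyrics or any(cue in text for cue in VOCAL_CUES):
        return "song_generate"
    return "music_generate"


def preview_duration(tool: str, requested_seconds: Optional[int] = None) -> int:
    """Honor an explicit duration; otherwise use the lower bound of the preview window."""
    if requested_seconds is not None:
        return requested_seconds
    low, _high = PREVIEW_RANGE[tool]
    return low  # assumption: the shortest preview is the safest first attempt


tool = choose_tool("Please sing a short birthday jingle")
print(tool, preview_duration(tool))
```

An agent following the skill would still confirm the instrumental-versus-vocal choice with the user; the sketch only shows how the defaults fit together.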
For singing, keep first-pass lyrics compact: one hook plus one short verse is usually enough to test vocal language, accent, melody feel, and quota behavior. If the user asks for a complete song, generate or present the full lyrics, but send a short demo to `song_generate` first and then scale up after approval or after confirming key quota. If the provider returns `quota_retry.strategy=short_preview`, treat that as a successful short demo, not as a failed generation.

## Tool-result handling

- If `music_generate` or `song_generate` returns `status=ok`, put the playable artifact/path first, then duration, style, and rights summary.
- If `song_generate` returns `quota_retry.strategy=short_preview`, say a short demo was generated because the full request exceeded the API key quota.
- If a provider error occurs, quote the `note` and distinguish account credits, API key quota, feature gating, duration, format, content policy, and network failures.
- If no tool was called, do not speculate about credits or availability. Call the relevant tool or say generation was not attempted.

## Availability and credits handling

- Do not claim credits are insufficient unless `music_generate`, `song_generate`, or `audio_provider_capabilities(probe_live=true)` returned that exact provider error.
- If a provider error occurs, quote the tool's `note` / provider message and distinguish account credits from feature gating, duration limits, output format restrictions, API key quota, and content-policy rejection.
- If `song_generate` returns `status=ok` with `quota_retry.strategy=short_preview`, the song was generated as a shorter playable demo after an API key quota retry. Do not say generation failed; explain that a short version was produced and include the playable artifact.
- If no tool was called, do not speculate about credits. Call the relevant audio tool first or say the generation was not attempted.
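The handling rules above amount to a small decision tree. The sketch below assumes the tool result arrives as a dictionary. The `status`, `quota_retry.strategy`, and `note` fields come from the text above, while the `output_path` key and the `summarize_result` helper are hypothetical names used for illustration.

```python
from typing import Any, Dict, Optional


def summarize_result(result: Optional[Dict[str, Any]]) -> str:
    """Turn a tool result into a user-facing line following the handling rules."""
    if result is None:
        # No tool was called: do not speculate about credits or availability.
        return "Generation was not attempted."

    if result.get("status") == "ok":
        quota_retry = result.get("quota_retry") or {}
        path = result.get("output_path", "<unknown path>")  # assumed key name
        if quota_retry.get("strategy") == "short_preview":
            # A short preview counts as success, not as a failed generation.
            return (f"Playable short demo: {path} "
                    "(shortened because the full request exceeded the API key quota).")
        return f"Playable audio: {path}"

    # Provider error: quote the note verbatim instead of guessing the cause.
    note = result.get("note", "no note returned")
    return f'Provider error, quoted note: "{note}"'


print(summarize_result({"status": "ok", "output_path": "/tmp/demo.mp3"}))
```

Quoting `note` verbatim keeps the reply from attributing an error to account credits when the real cause may be feature gating, duration limits, or content policy.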
## Copyright and authorization guard

- Copyright / 版权: do not reproduce protected lyrics, melodies, arrangements, backing tracks, or recognizable artist styles.
- Authorization / 授权: if the user gives existing lyrics, ask for or record that they own or have permission to use them.
- Public figure policy: do not request a singer, actor, public figure, or band imitation. Use generic traits like "clear warm Mandarin pop vocal".
- If the user asks for a cover, explain that this skill can create an original song inspired by non-identifying mood/tempo/instrumentation instead.

## Locale and accent singing notes

For singing, first identify the target language and desired accent. The final vocal should use a locale-appropriate accent. Lyrics, vocal style, and pronunciation notes should match that target.

- Chinese lyrics: keep lyrics in natural Chinese and avoid unnecessary translation through OpenRouter. Use standard Mandarin (普通话) phrasing unless the user asks for a dialect.
- English lyrics: preserve requested accent or locale, such as en-US, en-GB, en-AU, en-IN, or en-SG.
- Spanish/Portuguese/French and other languages: keep regional variants explicit when requested.
- If the vocal sounds like the wrong accent, simplify lyrics, reduce code-switching, add punctuation at phrase boundaries, and try a shorter sample before generating the full song.

## Output contract

Return:

- provider
- tool used
- duration
- output path
- playable audio artifact status
- copyright/rights summary
- target language / locale assumption
- whether this is a preview or full asset
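One way to keep the output contract complete is to model it as a record with one field per item. The dataclass below is a sketch under that assumption: `AudioResultReport` and its field names are hypothetical, and every value shown is a placeholder, not real tool output.

```python
from dataclasses import asdict, dataclass


@dataclass
class AudioResultReport:
    """One field per item in the output contract (field names are illustrative)."""
    provider: str
    tool_used: str
    duration_seconds: float
    output_path: str
    playable: bool
    rights_summary: str
    target_locale: str
    is_preview: bool


report = AudioResultReport(
    provider="example-provider",  # placeholder, not a real provider name
    tool_used="song_generate",
    duration_seconds=12.0,
    output_path="/tmp/demo.mp3",
    playable=True,
    rights_summary="original lyrics written for this request",
    target_locale="zh-CN",
    is_preview=True,
)
print(asdict(report))
```

Because every field is required, a missing contract item surfaces as a `TypeError` when the record is built.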
Submit audio or video for multilingual dubbing, poll status, and download dubbed audio. Use when the user asks for dubbing, 多语言配音, 视频翻译配音, 译制片, or wants a source clip dubbed into another language.
Generate a structured short-video shooting script from a topic. Emits a strict, machine-parseable shot list (3 shots by default) with image prompt + video prompt + voiceover + on-screen text per shot. Trigger when the user asks for a video script, 分镜, 短视频文案, AI视频, 短剧脚本, or wants visual prompts ready for image/video generation.
Use when the user asks to schedule recurring tasks, one-off reminders, timers, or cron-style jobs through the OpenSquilla cron tool.
Multi-round research with explicit methodology, evidence tracking, and citation-tagged synthesis. Trigger on 'deep dive', 'research report', 'literature review', 'investigate X across sources', 'multi-round investigation'. Distinct from the `summarize` skill, which is a single-pass condensation; this skill maintains a state file across iterations, tracks coverage, and produces a long-form report with per-claim citations. Three execution stages: plan (scope into sub-questions), iterate (record evidence per round), compile (synthesize report). The skill itself does not fetch the web — it tells the host agent which fetches to perform via OpenSquilla's existing web tools, and records what comes back.
Read, edit, or create Microsoft Word `.docx` files. Trigger this skill whenever the user mentions a Word document, .docx file, contract, report, brief, memo, or asks to extract text, modify an existing doc, generate one from a brief, or audit tracked changes. Three execution paths: text-and-structure extraction, in-place edit-by-run (preserves styles), and create-from-scratch with python-docx. Falls back to OOXML unzip-and-patch for layout work python-docx cannot reach.
Capture the current git diff (staged, working-tree, or staged file list) as text. Direct shell call for workflows that need repository diffs without an LLM agent loop.
GitHub operations via `gh` CLI: issues, PRs, CI runs, code review, API queries. Use when: (1) checking PR status or CI, (2) creating/commenting on issues, (3) listing/filtering PRs or issues, (4) viewing run logs. NOT for: complex web UI interactions requiring manual browser flows (use browser tooling when available), bulk operations across many repos (script with gh api), or when gh auth is not configured.
Query the per-turn DecisionEntry log for skill co-occurrence patterns, meta-skill usage stats, and the router fixture corpus. Returns a JSON summary suitable for downstream LLM consumption. Used by meta-skill-creator's harvest step but also useful standalone for 'which skills did I use most this week?'