advanced-dubbing-studio
The advanced-dubbing-studio skill submits audio or video files for multilingual dubbing through provider backends, polls job status, and downloads the resulting dubbed audio. Use it when a user asks to dub media into another language, whether as a quick preview or as full dubbed output with a locale-specific accent and speaker preservation.
git clone --depth 1 https://github.com/opensquilla/opensquilla /tmp/advanced-dubbing-studio && cp -r /tmp/advanced-dubbing-studio/src/opensquilla/skills/bundled/advanced-dubbing-studio ~/.claude/skills/advanced-dubbing-studio
SKILL.md
# advanced-dubbing-studio

Runs provider-backed dubbing for local audio/video assets. OpenRouter can help translate, review style, or summarize job status, but the dubbing job itself must use `dubbing_generate`, `dubbing_status`, and `dubbing_download`.

## Request triage

Before calling tools, extract these fields from the user request:

- source media path and whether the file is local, intentional, and user-provided
- source rights, speaker consent, and whether the clip contains third-party copyrighted material
- source language, target language, target locale, desired accent, speaker count, and translation style
- output expectation: quick preview, full dub, audio-only result, or follow-up video muxing outside this skill
- whether the user needs polling now or only a submitted job ID

OpenRouter can help translate or adapt lines, but it is not an audio provider and cannot perform the dubbing job itself.

## Required workflow

1. Verify the source file is local and intentionally provided.
2. Confirm the user has rights to dub the source media.
3. Identify source language, target language, target locale, and desired locale-appropriate accent. For Chinese target output, choose Mandarin/普通话 unless the user explicitly requests another dialect.
4. Call `audio_provider_capabilities` if dubbing availability is uncertain.
5. Submit with `dubbing_generate`.
6. Poll with `dubbing_status` or use `dubbing_download` with polling when appropriate.
7. Return the downloaded dubbed audio as a playable audio artifact.

## Preview-first

For long videos, uncertain accents, or high-value assets, submit or prepare a short preview clip first when the workflow permits it. Use the preview to check translation style, target locale, speaker count, pacing, and whether the provider preserves speaker separation. If only the full source file is available, explain that the first run may need one retry for locale/accent tuning and keep the target locale explicit in the job notes.
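Steps 5 to 7 of the required workflow (submit, poll, download) can be sketched as a small polling loop. This is a minimal illustration, not the skill's implementation: the three tools are passed in as plain callables, and the keyword arguments (`source_path`, `target_language`, `target_locale`, `job_id`), the `job_id` and `path` result keys, and the `ready` status value are all assumptions. Only `status=ok`, `not_available`, and the `note` field come from the skill text.

```python
import time
from typing import Any, Callable, Dict

Tool = Callable[..., Dict[str, Any]]


def run_dubbing_job(
    tools: Dict[str, Tool],
    source_path: str,
    target_language: str,
    target_locale: str,
    poll_interval_s: float = 5.0,
    max_polls: int = 60,
) -> Dict[str, Any]:
    """Submit a dubbing job, poll until it is ready, then download the audio.

    `tools` maps tool names to callables; their signatures are hypothetical.
    """
    submitted = tools["dubbing_generate"](
        source_path=source_path,
        target_language=target_language,
        target_locale=target_locale,
    )
    if submitted.get("status") != "ok":
        # Pass the provider's note through instead of guessing at the cause.
        return {
            "stage": "generate",
            "status": submitted.get("status"),
            "note": submitted.get("note"),
        }

    job_id = submitted["job_id"]  # assumed result key
    for _ in range(max_polls):
        state = tools["dubbing_status"](job_id=job_id)
        status = state.get("status")
        if status == "ready":  # assumed status value
            download = tools["dubbing_download"](job_id=job_id)
            return {"stage": "download", "job_id": job_id, **download}
        if status in ("error", "not_available"):
            return {
                "stage": "status",
                "job_id": job_id,
                "status": status,
                "note": state.get("note"),
            }
        time.sleep(poll_interval_s)

    # Not ready yet: report pending rather than claiming failure.
    return {"stage": "status", "job_id": job_id, "status": "pending", "note": None}
```

Returning `pending` when the poll budget runs out keeps the job ID available to the user, so a later status check can pick up where this run stopped.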
## Tool-result handling

- If `dubbing_generate` returns `status=ok`, return the job ID and tell the user whether download is pending or already being polled.
- If `dubbing_status` is not ready, report the current status without claiming failure.
- If `dubbing_download` returns audio, put the playable artifact/path first.
- If any dubbing tool returns `not_available` or an error, quote the `note` and distinguish provider setup, feature gating, key/quota limits, source format, language support, and provider processing delay.

## Locale and accent constraints

When dubbing, the target language is not enough; choose the target locale and accent as well:

- Chinese: prefer 普通话 / Mainland Mandarin target settings unless the user asks for Cantonese, Taiwanese Mandarin, Sichuan dialect, etc.
- English: preserve en-US, en-GB, en-AU, en-IN, en-SG, or any locale named by the user.
- Spanish: distinguish es-ES and Latin American variants such as es-MX.
- Portuguese: distinguish pt-BR and pt-PT.
- French: distinguish fr-FR and fr-CA when requested.
- Japanese/Korean/German/Italian/etc.: use native target-language voices rather than English-accented fallback voices.
- Keep translated lines natural in Chinese, not word-for-word English order.
- Avoid unnecessary English names or romanization unless the original requires it.
- If the result sounds like the wrong accent, retry with shorter translated lines, clearer punctuation, and a voice native to the target locale.

## Rights and copyright guard

- Copyright / 版权: do not dub movies, TV, anime, games, songs, audiobooks, paid courses, podcasts, or third-party videos unless the user states they have permission or the source is licensed for this use.
- Authorization / 授权: if the source contains identifiable private speakers, require consent for voice processing and translation.
- Public figure policy: do not preserve, clone, or imitate public figure voices without provider-supported rights and explicit authorization.
- For user-owned marketing/demo/training clips, keep the rights summary in the final response.

## Output contract

Return:

- dubbing job ID
- final status
- target language
- target locale / accent assumption
- output path
- playable audio artifact status
- rights/authorization summary
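The locale rule and the output contract can be made concrete with a small record type. The sketch below is illustrative only: `resolve_target_locale`, `DubbingReport`, the field names, and the rendered labels are hypothetical, and the `zh-CN` tag for Mainland Mandarin assumes the provider accepts BCP 47 style tags. The only default taken from the skill text is Mandarin for Chinese output; every other language returns `None` so the caller asks the user for a locale.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: BCP 47 style tags. Only the Chinese default comes from the skill text.
DEFAULT_LOCALES = {"zh": "zh-CN"}


def resolve_target_locale(language: str, requested: Optional[str] = None) -> Optional[str]:
    """Return the locale for the job, or None when the user must be asked."""
    if requested:
        return requested  # a locale named by the user always wins
    return DEFAULT_LOCALES.get(language.lower())


@dataclass
class DubbingReport:
    """One field per item in the output contract (field names are hypothetical)."""

    job_id: str
    final_status: str
    target_language: str
    target_locale: str
    output_path: Optional[str]
    artifact_playable: bool
    rights_summary: str

    def render(self) -> str:
        lines = []
        if self.output_path:
            # The playable artifact/path goes first when audio was downloaded.
            lines.append(f"Dubbed audio: {self.output_path}")
        lines.extend([
            f"Job ID: {self.job_id}",
            f"Final status: {self.final_status}",
            f"Target language: {self.target_language}",
            f"Target locale / accent: {self.target_locale}",
            f"Playable artifact: {'yes' if self.artifact_playable else 'no'}",
            f"Rights / authorization: {self.rights_summary}",
        ])
        return "\n".join(lines)
```

Keeping the rights summary as a required field means a report cannot be built without it, which matches the rule that the rights summary stays in the final response.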
Generate a structured short-video shooting script from a topic. Emits a strict, machine-parseable shot list (3 shots by default) with image prompt + video prompt + voiceover + on-screen text per shot. Trigger when the user asks for a video script, 分镜, 短视频文案, AI视频, 短剧脚本, or wants visual prompts ready for image/video generation.
Use when the user asks to schedule recurring tasks, one-off reminders, timers, or cron-style jobs through the OpenSquilla cron tool.
Multi-round research with explicit methodology, evidence tracking, and citation-tagged synthesis. Trigger on 'deep dive', 'research report', 'literature review', 'investigate X across sources', 'multi-round investigation'. Distinct from the `summarize` skill, which is a single-pass condensation; this skill maintains a state file across iterations, tracks coverage, and produces a long-form report with per-claim citations. Three execution stages: plan (scope into sub-questions), iterate (record evidence per round), compile (synthesize report). The skill itself does not fetch the web — it tells the host agent which fetches to perform via OpenSquilla's existing web tools, and records what comes back.
Read, edit, or create Microsoft Word `.docx` files. Trigger this skill whenever the user mentions a Word document, .docx file, contract, report, brief, memo, or asks to extract text, modify an existing doc, generate one from a brief, or audit tracked changes. Three execution paths: text-and-structure extraction, in-place edit-by-run (preserves styles), and create-from-scratch with python-docx. Falls back to OOXML unzip-and-patch for layout work python-docx cannot reach.
Capture the current git diff (staged, working-tree, or staged file list) as text. Direct shell call for workflows that need repository diffs without an LLM agent loop.
GitHub operations via `gh` CLI: issues, PRs, CI runs, code review, API queries. Use when: (1) checking PR status or CI, (2) creating/commenting on issues, (3) listing/filtering PRs or issues, (4) viewing run logs. NOT for: complex web UI interactions requiring manual browser flows (use browser tooling when available), bulk operations across many repos (script with gh api), or when gh auth is not configured.
Query the per-turn DecisionEntry log for skill co-occurrence patterns, meta-skill usage stats, and the router fixture corpus. Returns a JSON summary suitable for downstream LLM consumption. Used by meta-skill-creator's harvest step but also useful standalone for 'which skills did I use most this week?'
Render HTML (with CSS) to a PDF file. Trigger when the user wants to export a styled report, invoice, label, or any HTML/Jinja-rendered page to PDF. Uses WeasyPrint, which supports a meaningful subset of CSS Paged Media (page size, margins, headers/footers, page-break-before/after). Optional dependency — install via `pip install opensquilla[document-extras]` or `uv add weasyprint` because WeasyPrint pulls in native libraries (Pango, Cairo, fontconfig) that need OS-level packages.