ClaudeWave
Skill · 82 repo stars · updated 2d ago

wjs-translating-subtitles

Use when the user has an SRT (or transcript text) in one language and wants it translated to another, with punctuation-bounded re-segmentation so cues end at real sentence breaks. Simplified Chinese (zh-CN) and English (en) are first-class targets; other targets follow the same rules. Outputs a target-language SRT or bilingual SRT — no audio, no burn-in. Triggers — "翻译字幕", "翻成中文", "translate this SRT", "中英双语字幕", "把这个 SRT 翻译成 X", "bilingual subtitles".

Install in Claude Code
git clone --depth 1 https://github.com/jianshuo/claude-skills /tmp/wjs-translating-subtitles && cp -r /tmp/wjs-translating-subtitles/wjs-translating-subtitles ~/.claude/skills/wjs-translating-subtitles
Then start a new Claude Code session; the skill loads automatically.

SKILL.md

# wjs-translating-subtitles

Source-language SRT in → target-language (or bilingual) SRT out. **This skill is text-only.** Burn-in lives in `/wjs-burning-subtitles`; voice dub in `/wjs-dubbing-video`.

## When to use

- User has an SRT in language A and wants it in language B.
- User pasted a transcript (with or without timestamps) and wants a translation that becomes an SRT.
- User has an SRT but cues end mid-sentence — this skill's re-segmentation step fixes that.

## When NOT to use

- No source-language SRT yet → run `/wjs-transcribing-audio` first.
- User wants burned-in subtitles → finish translation here, then `/wjs-burning-subtitles`.
- User wants voice dub → finish translation here, then `/wjs-dubbing-video`.

## Pick the target

Resolve the target from the user's phrasing once; don't re-ask:

- "翻成中文 / 中文字幕 / 中文配音" → `zh-CN`.
- "translate to English / English subs / English dub" → `en`.
- "bilingual" / "双语" → produce both `.<source>.srt` and `.<target>.srt` (and optionally a combined `.<source>-<target>.srt`).
- Ambiguous → default to whichever the user has historically chosen in the project.

Simplified Chinese and English are fully validated. Other targets (Japanese, Korean, French, etc.) work via the same rules; the bottleneck is TTS-voice availability if dubbing follows — see `/wjs-dubbing-video` before promising.
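
The phrase-to-target mapping above can be sketched in a few lines of Python. This is only an illustration: `resolve_target`, the hint tuples, and the case-insensitive substring matching are assumptions, not part of the skill, and `project_default` stands in for whatever the user has historically chosen in the project.

```python
from typing import Optional

# Hypothetical hint tables built from the phrases listed above.
ZH_HINTS = ("翻成中文", "中文字幕", "中文配音")
EN_HINTS = ("translate to english", "english subs", "english dub")
BILINGUAL_HINTS = ("bilingual", "双语")


def resolve_target(request: str, project_default: Optional[str] = None) -> dict:
    """Return {'target': code or None, 'bilingual': bool} for a user request."""
    text = request.lower()
    bilingual = any(hint in text for hint in BILINGUAL_HINTS)
    if any(hint in text for hint in ZH_HINTS):
        target = "zh-CN"
    elif any(hint in text for hint in EN_HINTS):
        target = "en"
    else:
        # Ambiguous phrasing: fall back to the project's earlier choice.
        target = project_default
    return {"target": target, "bilingual": bilingual}
```

A request that only says "bilingual" carries no target of its own, so the sketch falls back to `project_default` in that case.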

## Shared translation principles

- Prioritize meaning over literal wording.
- Use concise subtitle-style language: viewers read at ~3 words per second (wps) for Chinese and ~3–4 wps for English, so lines that exceed that rate go off-screen before they can be read.
- Preserve the tone of the speaker. Casual source → casual target; formal source → formal target.
- Do not over-translate names, brands, cultural references, or technical terms.
- Keep numbers, dates, names, and places accurate.
- If a phrase has no exact equivalent, translate the meaning naturally. No literal/word-for-word constructions.
- Avoid stiff, machine-translated output.

## Translating into Simplified Chinese (zh-CN)

- Use natural spoken Mandarin for casual speech, formal Mandarin for formal speech.
- Use Simplified characters only (do NOT use Traditional Hanzi unless the user explicitly asks).
- Subtitle lines should be roughly **15 Chinese characters** or fewer per line, max 2 lines per cue (3 only when unavoidable for very long cues).
- Use Chinese (full-width) punctuation: 「,」「。」「;」「:」「、」「——」. Never mix English commas/periods into Chinese subtitles.
- **Minimize filler demonstratives 「这」「那」「这个」「那个」「那份」「那种」「那里」「那样」.** Spanish-to-Chinese (and English-to-Chinese) MT routinely inserts these because the source has overt demonstratives that Chinese usually drops. Examples:
  - "这把我们带入二元世界的载体" → "把我们带入二元的载体"
  - "运用那份能量" → "运用这股能量" if needed, or just "运用能量"
  - "正是在这合一里" → "正是在合一中"
  - "像罪人那样翻滚" → "像罪人翻滚" / "像罪人般翻滚"
  - "那份精微的觉知" → "精微的觉知"
  Keep them only when they carry real meaning (deixis, contrast, or fixed phrase like spiritual "我就是那" / "tat tvam asi"). Default is to delete; add back only if the sentence becomes ambiguous.

Examples (Spanish → Chinese):

```text
Spanish: No pasa nada.            → Chinese: 没关系。
Spanish: Vamos a ver qué pasa.    → Chinese: 我们看看会发生什么。
Spanish: Me parece una locura.    → Chinese: 我觉得这太疯狂了。
Spanish: ¿Qué quieres decir?      → Chinese: 你是什么意思?
Spanish: La verdad es que no lo esperaba.
                                  → Chinese: 说实话,我没想到会这样。
```
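
The zh-CN layout rules (about 15 characters per line, 2 lines per cue, no ASCII punctuation) lend themselves to an automated check. The sketch below is an assumption-laden illustration: `check_zh_cue` is a hypothetical helper, it counts every character (including punctuation and Latin letters) toward the limit, and it will also flag legitimate ASCII periods such as the one in "3.5".

```python
ASCII_PUNCT = set(",.;:?!")
MAX_CHARS = 15  # per-line limit from the guidance above
MAX_LINES = 2   # a third line is tolerated only for very long cues


def check_zh_cue(text: str) -> list[str]:
    """Return a list of human-readable problems for one zh-CN cue."""
    problems = []
    lines = text.split("\n")
    if len(lines) > MAX_LINES:
        problems.append(f"{len(lines)} lines (max {MAX_LINES})")
    for i, line in enumerate(lines, start=1):
        if len(line) > MAX_CHARS:
            problems.append(f"line {i}: {len(line)} chars (max {MAX_CHARS})")
        bad = sorted(set(line) & ASCII_PUNCT)
        if bad:
            problems.append(f"line {i}: ASCII punctuation {''.join(bad)}")
    return problems
```

An empty list means the cue passes; anything else is a hint for a human or a follow-up rewrite, not a hard failure.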

## Translating into English (en)

- Use natural conversational English. Avoid translationese ("It is precisely through entering the body…" → "It's by entering the body…").
- Lines should be roughly **40–42 characters** or fewer (about 7–9 words), max 2 lines per cue. Hard cap 50 chars per line.
- Use ASCII punctuation: `,` `.` `;` `:`. The em-dash `—` is the one non-ASCII mark allowed. Avoid Unicode curly quotes; straight quotes keep the `.srt` portable.
- For contemplative/spiritual content, prefer plain words over Latinate jargon: "presence" over "manifestation," "wholeness" over "totality," "wake up" over "awaken to consciousness."

Examples (Spanish → English):

```text
Spanish: No pasa nada.            → English: It's nothing.
Spanish: Vamos a ver qué pasa.    → English: Let's see what happens.
Spanish: Me parece una locura.    → English: This feels crazy to me.
Spanish: ¿Qué quieres decir?      → English: What do you mean?
Spanish: La verdad es que no lo esperaba.
                                  → English: Honestly, I wasn't expecting this.
```
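
The English line limits (roughly 42 characters preferred, 50 as a hard cap, 2 lines per cue) can be applied with the standard-library `textwrap` module. This is a minimal sketch under stated assumptions: `wrap_en_cue` is a hypothetical helper, and retrying at the hard cap before giving up is one possible policy, not something the skill prescribes.

```python
import textwrap

SOFT_LIMIT = 42  # preferred maximum characters per line
HARD_LIMIT = 50  # absolute cap per line
MAX_LINES = 2


def wrap_en_cue(text: str) -> tuple[list[str], bool]:
    """Wrap an English cue; return (lines, fits), where fits means <= 2 lines."""
    lines = textwrap.wrap(text, width=SOFT_LIMIT)
    if len(lines) > MAX_LINES:
        # Retry at the hard cap before declaring the cue too long.
        lines = textwrap.wrap(text, width=HARD_LIMIT)
    return lines, len(lines) <= MAX_LINES
```

When `fits` comes back `False`, the cue is a candidate for splitting at a punctuation boundary rather than for squeezing onto a third line.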

## Re-segment at punctuation boundaries (mandatory)

Whisper segments by silence/breath, not grammar. The result almost always has cues that **end mid-sentence** (e.g., "...es una forma de aterrizar," next cue starts "el espíritu en el cuerpo..."). Any TTS that processes one cue at a time will then insert an unnatural pause exactly where the original speaker did not. The fix is mandatory before dubbing — and improves on-screen reading too.

The punctuation set differs by language:

- Chinese cues must end at `,` `。` `;` `:` `——` or `、`.
- English cues must end at `,` `.` `;` `:` `—` (em-dash) or, in practice for subtitles, occasionally a single dash. Never end an English cue on a comma-less clause break, and never split inside a phrase like "kind of" or "in order to".
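
A cue-ending check for the two punctuation sets might look like the sketch below. The names are hypothetical, the Chinese marks are written as Unicode escapes so the full-width forms stay unambiguous, and treating `?` and `!` (and their full-width forms) as valid endings is an assumption, since the lists above do not name them.

```python
# Allowed cue endings per language, taken from the lists above.
ZH_ENDINGS = (
    "\uff0c",        # full-width comma
    "\u3002",        # ideographic full stop
    "\uff1b",        # full-width semicolon
    "\uff1a",        # full-width colon
    "\u2014\u2014",  # double em-dash
    "\u3001",        # enumeration comma
)
EN_ENDINGS = (",", ".", ";", ":", "\u2014", "-")
# Assumption: sentence-final question and exclamation marks also count.
EXTRA = {"zh-CN": ("\uff1f", "\uff01"), "en": ("?", "!")}
CLOSERS = "\"'\u201d\u2019\u300d\u300f"  # trailing quotes/brackets to ignore


def ends_at_boundary(text: str, lang: str) -> bool:
    """True if the cue text ends on an allowed punctuation mark."""
    base = ZH_ENDINGS if lang == "zh-CN" else EN_ENDINGS
    endings = base + EXTRA.get(lang, ("?", "!"))
    return text.rstrip().rstrip(CLOSERS).endswith(endings)
```

The check only looks at the final mark; it cannot tell whether a comma sits inside a phrase such as "kind of", so that rule still needs a human or model pass.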

Rules:

- **Every cue must end at a real punctuation mark.** Never let a cue end on a noun, verb, conjunction, or article that flows into the next cue.
- It is fine (and often necessary) to **split** a single source cue into 2–4 shorter cues, with timestamps interpolated by character position within the original cue's duration.
- It is fine to **merge** the tail of one source cue with the head of the next when they form one clause — the merged cue inherits the start of the first and the end of the second.
- Target 3–8 seconds per cue. Cues shorter than ~1.5s feel choppy on screen; cues longer than ~10s usually contain a missed punctuation break.
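
The split and merge rules can be sketched as follows. This is an illustration only: the `Cue` dataclass, `split_cue`, and `merge_cues` are hypothetical names, times are plain seconds rather than SRT timestamps, and the caller is assumed to have already cut the text into punctuation-bounded pieces.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    start: float  # seconds
    end: float    # seconds
    text: str


def split_cue(cue: Cue, pieces: list[str]) -> list[Cue]:
    """Split one cue, giving each piece a share of the original duration
    proportional to its character count."""
    total = sum(len(p) for p in pieces)
    if total == 0:
        raise ValueError("pieces must contain text")
    duration = cue.end - cue.start
    out, consumed = [], 0
    for piece in pieces:
        start = cue.start + duration * consumed / total
        consumed += len(piece)
        end = cue.start + duration * consumed / total
        out.append(Cue(start, end, piece))
    return out


def merge_cues(first: Cue, second: Cue, sep: str = " ") -> Cue:
    """Merge two adjacent cues: start of the first, end of the second."""
    return Cue(first.start, second.end, first.text + sep + second.text)
```

For Chinese text, pass `sep=""` to `merge_cues`, since Chinese does not put spaces between clauses.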

A typical 2–3 minute talk yields roughly 25–40 punctuation-bounded cues from 12–18 raw source cues. Don't try to keep the original cue count.

When TTS dubbing follows: the punctua
skill-quality-reviewer · Subagent

Repo-wide drift detector for the wjs-* Claude Code skills in this marketplace. Sweeps every SKILL.md, scores it against the repo's own conventions (V-ing naming, trigger-phrase density, companion files, description shape), and returns a grouped punch list ordered by severity. Read-only — never edits files. Use before pushing a batch of skill changes, or whenever you wonder "are these skills still internally consistent?"

wangjianshuo-perspective · Skill


wjs-auditing-project · Skill

Use when the user asks to audit what's wrong with a project, "make it right", "看看项目出了什么问题", "为什么用户的需求还没上线", "为什么没提交App Store", "为什么没新build", or wants a holistic state-of-the-project check covering unmerged branches, stalled PRs, failed GitHub Actions, stale builds, plan drift (TODOS.md / ROADMAP), unreleased commits, and log errors. Runs read-only investigation, presents a grouped checklist, fixes only after explicit user confirmation. Aware of the Cathier iOS app workflow (Xcode + fastlane + auto-merge @claude PRs from in-app feedback).

wjs-burning-subtitles · Skill

Use when the user has a video + an SRT and wants the subtitles either burned into the pixels (libass, always-visible) or soft-muxed as a togglable track. Also handles the final composite step for the localization pipeline — burn subs, mix a dub track, and keep the original audio as a low-volume bed, all in ONE ffmpeg encode (no cascade). Verifies libass availability and auto-downloads a static evermeet ffmpeg build when Homebrew's stripped binary lacks it. Triggers — "烧字幕", "硬字幕", "burn subtitles", "burn-in subs", "embed subtitle", "soft mux SRT", "把字幕烧进视频", "做最终合成".

wjs-cleaning-spam · Skill

Use when the user complains about spam on his X/Twitter posts — 同城面付 / 寻固炮 / 线下上门 / 免费破处 这类引流号在他推文下刷的 emoji 垃圾回复 — and wants them removed. Covers the last 7 days (X recent-search window). Triggers — "把这些spam删掉", "清理X垃圾回复", "推文下面好多引流号", "clean spam replies", "/wjs-cleaning-spam".

wjs-converting-text-to-video · Skill

Use when the user wants a 王建硕-style WeChat article (article.md) turned into a narrated short MP4 video — TTS voiceover via 火山引擎 Volcano TTS, HyperFrames CSS/GSAP animation per scene, subtle SFX, abstract watercolor background, full pipeline rendering to 1080×1920 portrait MP4 (30-90s). Triggers — "把这篇文章做成视频", "做一个解说视频", "讲解视频", "/wjs-converting-text-to-video".

wjs-converting-wp-to-hugo · Skill

Use when migrating a WordPress site to a Hugo static site on GitHub Pages from a WXR export (.xml) plus the wp-content/uploads folder — preserving /archives/<id>/ URLs, localizing images, and deploying via GitHub Actions. Triggers — "把 WordPress 迁成 Hugo", "wordpress 转静态站", "migrate WordPress to Hugo", "WXR to Hugo", "publish WordPress to GitHub Pages", "/wjs-converting-wp-to-hugo".

wjs-dubbing-video · Skill

Use when the user has a video + a target-language SRT and wants the video to actually speak that language — generates a time-aligned TTS voice dub. Routes by voice ID — Volcano (豆包) TTS for Chinese, edge-tts neural for any language. Defaults to one voice (single-speaker); opt-in multi-speaker via visual diarization. Outputs `*_<lang>_dub.mp4` with the dub audio in place of the original. Final mixing (audio bed + burn-in) is handed off to `/wjs-burning-subtitles`. Triggers — "配音", "中文配音", "Chinese dub", "voice over this", "dub the video", "TTS this SRT", "different voice for each speaker".