supertone-mcp

Name: supertone-inc/supertone-mcp
Author: supertone-inc

Ver en GitHub

MCP ServersRegistry oficial3 estrellas1 forks● PythonActualizado today

ClaudeWave Trust Score

44/100

! Caution

Passed

✓Actively maintained (<30d)

Flags

!No standard license detected
!No description

Last scanned: 6/11/2026

Install in Claude Code / Claude Desktop

Method: UVX (Python) · supertone-mcp

Claude Code CLI

claude mcp add supertone-mcp -- uvx supertone-mcp

claude_desktop_config.json (Claude Desktop)

{
  "mcpServers": {
    "supertone-mcp": {
      "command": "uvx",
      "args": ["supertone-mcp"]
    }
  }
}

1. Run the command above in your terminal (Claude Code), or paste the JSON config into claude_desktop_config.json (Claude Desktop).

2. Replace any <placeholder> values with your API keys or paths.

3. Restart Claude. The MCP server and its tools appear automatically.

Casos de uso

Media Dev Tools AI / ML

Sobre el repo

Resumen de MCP Servers

# supertone-mcp

<!-- mcp-name: io.github.supertone-inc/supertone-mcp -->

A **composable MCP toolkit** for the [Supertone](https://supertone.ai) TTS API. Rather than a single "speak this text" command, it exposes Supertone's SDK as a set of building-block tools — synthesis, voice discovery, preview, duration/credit prediction, usage tracking, and full voice-cloning CRUD — that an LLM assembles to fulfill a request. Works in Claude Desktop, Cursor, or any MCP-compatible client.

[![supertone-inc/supertone-mcp MCP server](https://glama.ai/mcp/servers/supertone-inc/supertone-mcp/badges/score.svg)](https://glama.ai/mcp/servers/supertone-inc/supertone-mcp)

Covers Korean, English, Japanese, and **31 languages** total. Speed (0.5x–2.0x), pitch shift (-24 to +24 semitones), emotion styles, per-call output mode, streaming, and model selection.

## Features

**Synthesis**
- **`text_to_speech`** — Convert text to audio. Per-call control of `output_mode` (files / resources / both), `autoplay`, `streaming`, `model`, plus `include_phonemes` / `normalized_text`. Long text is auto-chunked by the SDK.
- **`predict_duration`** — Estimate audio length (and credit cost) without synthesizing.

**Voice discovery (preset)**
- **`search_voice`** — Filter the catalog by language, gender, age, use_case, style, model, name, or description.
- **`get_voice`** — Full detail for one voice.
- **`preview_voice`** — Sample audio URLs for a voice (filterable by language/style/model).

**Custom voice cloning**
- **`clone_voice`** — Create a cloned voice from a local WAV/MP3 (≤3MB).
- **`search_custom_voice`** — List/filter cloned voices.
- **`get_custom_voice`** — Full detail for one cloned voice.
- **`edit_custom_voice`** — Update name and/or description.
- **`delete_custom_voice`** — Permanently delete (irreversible).

**Audio assembly**
- **`merge_audio_files`** — Concatenate two or more local audio files (mp3/wav) into one via a bundled ffmpeg. Supports plain concat, silence gaps between clips (`gap_ms`), or crossfade blending (`crossfade_ms`). Output format auto-detected (mixed → mp3) or forced via `output_format`. No system ffmpeg required.

**Usage & credits**
- **`get_credit_balance`** — Remaining credits.
- **`get_usage_history`** — Usage over a time window.
- **`get_voice_usage`** — Usage for a specific voice.

## Breaking changes & migration (0.2.0)

0.2.0 moves behavior control **out of environment variables and into per-call tool parameters** — so the LLM decides per request, not the server config.

| Before (env var) | After (per-call parameter) | Note |
|------------------|----------------------------|------|
| `SUPERTONE_MCP_OUTPUT_MODE=files\|resources\|both` | `text_to_speech(output_mode=...)` | Default still `files` |
| `SUPERTONE_MCP_AUTOPLAY=true` | `text_to_speech(autoplay=...)` | **Default changed `true` → `false`** (playback is now explicit) |
| *(always streamed)* | `text_to_speech(streaming=...)` | **New, default `false`** (one-shot). `streaming=true` requires `model="sona_speech_1"` |

Other changes:
- **Default model** changed `sona_speech_1` → **`sona_speech_2_flash`**.
- **`list_voices` was removed** (since the discovery release) and replaced by `search_voice` — call it with no arguments to reproduce the old "list everything" behavior.
- No more hard 300-character limit — longer text is auto-chunked by the SDK (credit/latency scale with length).

If you previously set `SUPERTONE_MCP_OUTPUT_MODE` or `SUPERTONE_MCP_AUTOPLAY`, remove them from your client config and pass `output_mode` / `autoplay` per call instead. (The server prints a one-time stderr notice if it sees the removed vars.)

## Installation

```bash
# Using uvx (recommended)
uvx supertone-mcp

# Using pip
pip install supertone-mcp
```

## Configuration

### Claude Desktop

Add to `claude_desktop_config.json`:

```json
{
  "mcpServers": {
    "supertone-tts": {
      "command": "uvx",
      "args": ["supertone-mcp"],
      "env": {
        "SUPERTONE_API_KEY": "your-api-key-here"
      }
    }
  }
}
```

### Cursor

Add to your Cursor MCP settings (same JSON shape as above).

## Environment Variables

Only authentication and stable defaults are configured via the environment — all behavior is controlled per call.

| Variable | Required | Default | Description |
|----------|----------|---------|-------------|
| `SUPERTONE_API_KEY` | Yes | — | Your Supertone API key |
| `SUPERTONE_MCP_VOICE_ID` | No | preset voice (Aiden, multilingual) | Default `voice_id` for `text_to_speech` / `predict_duration` (override per call) |
| `SUPERTONE_OUTPUT_DIR` | No | `~/supertone-tts-output/` | Directory where audio files are saved (used by `output_mode=files`/`both`) |

> Removed in 0.2.0: `SUPERTONE_MCP_OUTPUT_MODE` and `SUPERTONE_MCP_AUTOPLAY` — see [Migration](#breaking-changes--migration-020).

### Output modes (`text_to_speech` `output_mode`)

| Mode | Returns | Use when |
|------|---------|----------|
| `files` *(default)* | Plain text with the saved file path + metadata | You want the file on disk |
| `resources` | MCP `AudioContent` + `TextContent` (no file written) | The client renders audio inline (e.g., Claude.ai chat) |
| `both` | File on disk **and** `AudioContent`/`TextContent` | You want both — preview inline, keep the file |

## Usage Examples

The MCP client routes natural-language requests across these tools — the value of the toolkit is **composition**: the LLM chains several tools to satisfy one request.

### Example 1 — Discover → preview → estimate cost → synthesize

> "Find a calm Korean female voice, let me hear a sample, check the cost, then make this announcement as an mp3."

The LLM assembles:
```
search_voice(language="ko", gender="female", style="neutral")   # find candidates
  → preview_voice(voice_id)                                       # sample URLs to confirm the voice
  → predict_duration(text, voice_id) + get_credit_balance()       # gauge cost before spending
  → text_to_speech(text, voice_id, output_format="mp3",
                   output_mode="files")                           # synthesize
```

### Example 2 — Clone my voice → use it right away

> "Make a cloned voice from ~/recordings/sample.wav named MyVoice, then read this greeting with it and play it for me."

The LLM assembles:
```
clone_voice(name="MyVoice", audio_path="~/recordings/sample.wav")   # create the cloned voice
  → get_custom_voice(voice_id)                                       # confirm it was created
  → text_to_speech(text, voice_id=<cloned>, autoplay=true)           # synthesize, then play immediately
```

> `autoplay` is a per-call parameter (default `false`), so playback happens only when explicitly requested.

## Tool Parameters

### `text_to_speech`

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| `text` | string | Yes | — | Text to convert (long text is auto-chunked by the SDK) |
| `voice_id` | string | No | env or preset | Voice identifier (browse via `search_voice`) |
| `language` | string | No | `ko` | Language code — one of 31 (`ko`, `en`, `ja`, …) |
| `output_format` | string | No | `mp3` | `mp3` or `wav` |
| `model` | string | No | `sona_speech_2_flash` | `sona_speech_1`, `sona_speech_2`, `sona_speech_2_flash`, `sona_speech_2t`, `sona_speech_3t`, `supertonic_api_1`, `supertonic_api_3` |
| `speed` | float | No | `1.0` | 0.5–2.0 |
| `pitch_shift` | int | No | `0` | -24 to +24 semitones |
| `style` | string | No | — | Emotion style (varies by voice) |
| `output_mode` | string | No | `files` | `files`, `resources`, or `both` (see [Output modes](#output-modes-text_to_speech-output_mode)) |
| `autoplay` | bool | No | `false` | Play the audio locally after synthesis (macOS `afplay`) |
| `streaming` | bool | No | `false` | Stream synthesis. Only supported by `model="sona_speech_1"` |
| `include_phonemes` | bool | No | `false` | Return phoneme timing data alongside the audio |
| `normalized_text` | string | No | — | Pre-normalized text (only used by `sona_speech_2` / `sona_speech_2_flash`) |

### `predict_duration`

Same core parameter schema as `text_to_speech` (long text auto-chunked). Returns `"Predicted duration: 2.34s (credit usage is proportional to duration)."`.

### `search_voice`

All parameters optional. With no filters → full catalog. With any filter → first response line is `Filters applied: ...`.

| Parameter | Type | Description |
|-----------|------|-------------|
| `language` | string | e.g., `ko`, `en`, `ja` |
| `gender` | string | e.g., `male`, `female` |
| `age` | string | e.g., `young_adult`, `child` |
| `use_case` | string | e.g., `narration`, `advertisement` |
| `style` | string | e.g., `neutral`, `happy` |
| `model` | string | e.g., `sona_speech_2_flash` |
| `name` | string | partial match |
| `description` | string | partial match |

### `get_voice` / `preview_voice`

| Tool | Required | Optional |
|------|----------|----------|
| `get_voice` | `voice_id` | — |
| `preview_voice` | `voice_id` | `language`, `style`, `model` (filter samples) |

### `clone_voice`

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `name` | string | Yes | Display name (non-empty) |
| `audio_path` | string | Yes | Local WAV or MP3 path (≤3MB). Supports `~` expansion |
| `description` | string | No | Optional note |

### Custom voice CRUD

| Tool | Required | Optional |
|------|----------|----------|
| `search_custom_voice` | — | `name`, `description` (partial match) |
| `get_custom_voice` | `voice_id` | — |
| `edit_custom_voice` | `voice_id` | `name`, `description` (at least one required) |
| `delete_custom_voice` | `voice_id` | — *(IRREVERSIBLE)* |

### Usage & credits

| Tool | Required | Optional |
|------|----------|----------|
| `get_credit_balance` | — | — |
| `get_usage_history` | — | — (reports a recent default window) |
| `get_voice_usage` | `voice_id` | — |

### `merge_audio_files`

| Parameter | Type | Required | Description |
|----------

Preguntas frecuentes

Lo que la gente pregunta sobre supertone-mcp

¿Qué es supertone-inc/supertone-mcp?

supertone-inc/supertone-mcp es mcp servers para el ecosistema de Claude AI con 3 estrellas en GitHub.

¿Cómo se instala supertone-mcp?

Puedes instalar supertone-mcp clonando el repositorio (https://github.com/supertone-inc/supertone-mcp) o siguiendo las instrucciones del README en GitHub. ClaudeWave también te ofrece bloques de instalación rápida en esta misma página.

¿Es seguro usar supertone-inc/supertone-mcp?

Nuestro agente de seguridad ha analizado supertone-inc/supertone-mcp y le ha asignado un Trust Score de 44/100 (tier: Caution). Revisa el desglose completo de comprobaciones superadas y flags en esta página.

¿Quién mantiene supertone-inc/supertone-mcp?

supertone-inc/supertone-mcp es mantenido por supertone-inc. La última actividad registrada en GitHub es de today, con 2 issues abiertos.

¿Hay alternativas a supertone-mcp?

Sí. En ClaudeWave puedes explorar mcp servers similares en /categories/mcp, ordenados por popularidad o actividad reciente.

Deploy en 1 click

Despliega supertone-mcp en tu cloud

Lleva este repo a producción en minutos. Cada plataforma genera su propio entorno con variables de entorno editables.

Vercel Railway Render

Badge embebible

¿Mantienes este repo? Añade un badge a tu README

Pega el badge en tu README de GitHub para mostrar que está auditado por ClaudeWave. Cada badge enlaza de vuelta a esta página y muestra el Trust Score actual.

Markdown (README)

[![Featured on ClaudeWave](https://claudewave.com/api/badge/supertone-inc-supertone-mcp)](https://claudewave.com/repo/supertone-inc-supertone-mcp)

HTML

<a href="https://claudewave.com/repo/supertone-inc-supertone-mcp"><img src="https://claudewave.com/api/badge/supertone-inc-supertone-mcp" alt="Featured on ClaudeWave: supertone-inc/supertone-mcp" width="320" height="64" /></a>

Relacionados

Más MCP Servers

Alternativas a supertone-mcp

n8n-io

n8n

today

Fair-code workflow automation platform with native AI capabilities. Combine visual building with custom code, self-host or cloud, 400+ integrations.

192.8k58.6kTypeScript

MCP ServersaiapisInstall

open-webui

today

User-friendly AI Interface (Supports Ollama, OpenAI API, ...)

141.9k20.4kPython

MCP ServersaillmInstall

google-gemini

gemini-cli

today

An open-source AI agent that brings the power of Gemini directly into your terminal.

105.3k14.1kTypeScript

MCP Serversaiai-agentsInstall

netdata

today

The fastest path to AI-powered full stack observability, even for lean teams.

79.2k6.5kC

MCP ServersaialertingInstall

D4Vinci

Scrapling

9d ago

🕷️ An adaptive Web Scraping framework that handles everything from a single request to a full-scale crawl!

64.4k6.3kPython

MCP Serversaiai-scrapingInstall

sansan0

TrendRadar

4d ago

⭐AI-driven public opinion & trend monitor with multi-platform aggregation, RSS, and smart alerts.🎯 告别信息过载，你的 AI 舆情监控助手与热点筛选工具！聚合多平台热点 + RSS 订阅，支持关键词精准筛选。AI 智能筛选新闻 + AI 翻译 + AI 分析简报直推手机，也支持接入 MCP 架构，赋能 AI 自然语言对话分析、情感洞察与趋势预测等。支持 Docker ，数据本地/云端自持。集成微信/飞书/钉钉/Telegram/邮件/ntfy/bark/slack 等渠道智能推送。

59.5k24.7kPython

MCP ServersaibarkInstall