elevenlabs
This ElevenLabs skill converts text and documents into high-quality audio through two modes: single-voice narration for documents and two-host conversational podcast generation. Use it when users request text-to-speech conversion, document narration, or podcast creation from written content, with automatic text chunking, voice continuity, and audio concatenation handled through the ElevenLabs API.
Install in Claude Code
git clone --depth 1 https://github.com/sanjay3290/ai-skills /tmp/elevenlabs && cp -r /tmp/elevenlabs/skills/elevenlabs ~/.claude/skills/elevenlabs
Then start a new Claude Code session; the skill loads automatically.
Definition
SKILL.md
# ElevenLabs - Text-to-Speech & Podcast Skill
## Overview
This skill converts text and documents into high-quality audio using the ElevenLabs TTS API. It supports two modes: single-voice narration and two-host conversational podcast generation.
## When to Use This Skill
Activate when the user mentions:
- "create podcast", "generate podcast", "podcast from document"
- "narrate document", "narrate this file", "read aloud"
- "text to speech", "TTS", "convert to audio"
- "audio from document", "audio version of"
## Setup
Config at `skills/elevenlabs/config.json`:
```json
{
"api_key": "your-elevenlabs-api-key",
"default_voice": "JBFqnCBsd6RMkjVDRZzb",
"default_model": "eleven_multilingual_v2",
"podcast_voice1": "JBFqnCBsd6RMkjVDRZzb",
"podcast_voice2": "EXAVITQu4vr4xnSDxMaL"
}
```
Only `api_key` is required. Alternatively, set the `ELEVENLABS_API_KEY` environment variable instead of storing the key in the config file.
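The key lookup described here can be sketched roughly as follows. This is a minimal illustration, not the skill's actual code: `load_config` is a hypothetical helper, and the precedence (config file first, then the environment variable) is an assumption, since the source does not say which one wins.

```python
import json
import os
from pathlib import Path

# Optional keys mirror the config.json example above.
OPTIONAL_KEYS = ("default_voice", "default_model", "podcast_voice1", "podcast_voice2")


def load_config(path="skills/elevenlabs/config.json", env=None):
    """Return a settings dict; raise if no API key can be found.

    Assumption: config.json wins over ELEVENLABS_API_KEY when both exist.
    """
    env = os.environ if env is None else env
    config = {}
    config_path = Path(path)
    if config_path.is_file():
        config = json.loads(config_path.read_text(encoding="utf-8"))

    api_key = config.get("api_key") or env.get("ELEVENLABS_API_KEY")
    if not api_key:
        raise RuntimeError("Set api_key in config.json or ELEVENLABS_API_KEY")

    # Copy only the optional keys that are actually present.
    settings = {"api_key": api_key}
    for key in OPTIONAL_KEYS:
        if key in config:
            settings[key] = config[key]
    return settings
```

Passing `env` explicitly keeps the function easy to test without touching the real environment.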
Dependencies: `pip install PyPDF2 python-docx` (only needed for PDF/DOCX files).
Requires `ffmpeg` for multi-chunk narration and podcasts.
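Before starting a long job, it may help to check that these optional pieces are present. The sketch below is an assumption about how such a check could look; `missing_dependencies` is a hypothetical helper and is not part of the skill's scripts.

```python
import importlib.util
import shutil


def missing_dependencies(needs_pdf=False, needs_docx=False, needs_ffmpeg=False):
    """Return human-readable names for anything required but not installed."""
    missing = []
    # PyPDF2 and python-docx are imported as "PyPDF2" and "docx" respectively.
    if needs_pdf and importlib.util.find_spec("PyPDF2") is None:
        missing.append("PyPDF2 (pip install PyPDF2)")
    if needs_docx and importlib.util.find_spec("docx") is None:
        missing.append("python-docx (pip install python-docx)")
    # ffmpeg is an external binary, so look for it on PATH.
    if needs_ffmpeg and shutil.which("ffmpeg") is None:
        missing.append("ffmpeg")
    return missing
```

For example, `missing_dependencies(needs_pdf=True, needs_ffmpeg=True)` would be a reasonable check before narrating a long PDF.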
## Commands
### List Voices
```bash
python skills/elevenlabs/scripts/elevenlabs.py voices
python skills/elevenlabs/scripts/elevenlabs.py voices --json
```
Use this to find voice IDs for the user.
### Single-Voice TTS
```bash
# From text
python skills/elevenlabs/scripts/elevenlabs.py tts --text "Hello world" --output ~/Downloads/hello.mp3
# From document
python skills/elevenlabs/scripts/elevenlabs.py tts --file /path/to/doc.pdf --output ~/Downloads/narration.mp3
# With specific voice
python skills/elevenlabs/scripts/elevenlabs.py tts --file doc.md --voice VOICE_ID --output out.mp3
```
The script handles text extraction, chunking at sentence boundaries (~4000 chars), TTS per chunk with voice continuity, and ffmpeg concatenation automatically.
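To illustrate what chunking at sentence boundaries means, here is a minimal sketch. It is not the script's implementation: `chunk_text` is a hypothetical name, the regex is a rough sentence heuristic, and the hard split for over-long sentences is an assumption about how such an edge case might be handled.

```python
import re

# Split after ., ! or ? followed by whitespace. This is a rough heuristic,
# not a full sentence tokenizer.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def chunk_text(text, max_chars=4000):
    """Group whole sentences into chunks of at most max_chars characters."""
    chunks, current = [], ""
    for sentence in _SENTENCE_END.split(text.strip()):
        if not sentence:
            continue
        # Hard-split a single sentence that is longer than the limit.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            # The sentence does not fit; close the chunk and start a new one.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk would then be sent to the TTS endpoint separately, which is why multi-chunk narration needs `ffmpeg` to join the resulting audio files.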
### Podcast Generation
Podcast mode requires a JSON script file with conversation segments:
```json
[
{"speaker": "host1", "text": "Welcome to our podcast! Today we're diving into..."},
{"speaker": "host2", "text": "That's right! I found the section on..."},
{"speaker": "host1", "text": "Let's break that down..."}
]
```
```bash
python skills/elevenlabs/scripts/elevenlabs.py podcast --script /tmp/script.json --voice1 ID1 --voice2 ID2 --output ~/Downloads/podcast.mp3
```
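A script file in this shape can be sanity-checked before any characters are spent on the API. The following is a sketch under stated assumptions: `validate_script` is a hypothetical helper, the only accepted speaker labels are assumed to be `host1` and `host2` as in the example, and the 3000-character limit is taken from the workflow guideline on turn length rather than from any check the script is known to perform.

```python
ALLOWED_SPEAKERS = {"host1", "host2"}


def validate_script(segments, max_turn_chars=3000):
    """Return a list of problems; an empty list means the script looks usable."""
    if not isinstance(segments, list) or not segments:
        return ["script must be a non-empty JSON array"]
    problems = []
    for i, seg in enumerate(segments):
        if not isinstance(seg, dict):
            problems.append(f"segment {i}: expected an object")
            continue
        speaker = seg.get("speaker")
        if not isinstance(speaker, str) or speaker not in ALLOWED_SPEAKERS:
            problems.append(f"segment {i}: speaker must be host1 or host2")
        text = seg.get("text")
        if not isinstance(text, str) or not text.strip():
            problems.append(f"segment {i}: text must be a non-empty string")
        elif len(text) >= max_turn_chars:
            problems.append(f"segment {i}: text must be under {max_turn_chars} characters")
    return problems
```

Typical use would be to load the file with `json.load` and only run the `podcast` command when the returned list is empty.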
## Podcast Workflow (for Claude)
When the user asks to create a podcast from a document:
1. **Extract the document text**:
```bash
python skills/elevenlabs/scripts/extract.py /path/to/document.pdf
```
2. **Generate a two-host conversation script** from the extracted text. Follow these guidelines:
- Write as a natural, engaging discussion between two hosts
- Host 1 typically leads/introduces topics, Host 2 adds analysis and reactions
- Start with a brief intro welcoming listeners and stating the topic
- End with a summary/outro
- Keep each turn under 3000 characters
- Vary turn lengths - mix short reactions with longer explanations
- Use conversational language: "That's a great point", "What I found interesting was..."
- Reference specific details from the source document
- Avoid reading the document verbatim - discuss and interpret it
3. **Write the script** as a JSON array to a temp file:
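```python
import json

script = [
    {"speaker": "host1", "text": "Welcome to today's episode..."},
    {"speaker": "host2", "text": "Thanks for having me..."},
    # ... remaining segments
]

# Write to /tmp/podcast_script.json
with open("/tmp/podcast_script.json", "w", encoding="utf-8") as f:
    json.dump(script, f, ensure_ascii=False, indent=2)
```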
4. **Generate the podcast**:
```bash
python skills/elevenlabs/scripts/elevenlabs.py podcast --script /tmp/podcast_script.json --output ~/Downloads/podcast.mp3
```
5. **Clean up** the temp script file.
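Steps 1, 4 and 5 above can be sketched as a small wrapper around the documented commands. This is an illustration only: the function names are hypothetical, `sys.executable` stands in for the `python` used in the examples, and keeping the script file when generation fails (so the run can be retried) is a design assumption rather than documented behavior.

```python
import os
import subprocess
import sys

SCRIPTS = "skills/elevenlabs/scripts"


def extract_command(document):
    """Build the argv for step 1 (text extraction)."""
    return [sys.executable, f"{SCRIPTS}/extract.py", document]


def podcast_command(script_path, output, voice1=None, voice2=None):
    """Build the argv for step 4, using only flags shown in this document."""
    cmd = [sys.executable, f"{SCRIPTS}/elevenlabs.py", "podcast",
           "--script", script_path, "--output", os.path.expanduser(output)]
    if voice1:
        cmd += ["--voice1", voice1]
    if voice2:
        cmd += ["--voice2", voice2]
    return cmd


def run_podcast(script_path, output, voice1=None, voice2=None):
    """Generate the podcast, then remove the temp script (step 5)."""
    # check=True raises CalledProcessError on failure, leaving the script
    # file in place so the run can be retried.
    subprocess.run(podcast_command(script_path, output, voice1, voice2), check=True)
    os.remove(script_path)
```

Building the argument list separately from running it makes the command easy to inspect or log before any API characters are used.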
## Tips
- Run `voices` first to let the user pick voices they like
- For podcasts, suggest voice pairs with contrasting qualities (e.g., one deep, one bright)
- Default output to `~/Downloads/` unless the user specifies otherwise
- For large documents, warn the user about character usage on their ElevenLabs plan
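For the last tip, a rough estimate of character usage can be computed before generating anything. The sketch below assumes that usage roughly tracks the length of the text sent for synthesis; actual billing depends on the ElevenLabs plan and model, and `estimate_usage` and its `remaining_characters` parameter are hypothetical names for illustration.

```python
def estimate_usage(text, remaining_characters=None):
    """Summarise how many characters a narration would send to the API."""
    needed = len(text)
    report = {"characters": needed, "exceeds_quota": None}
    # Only compare against the quota when the caller knows it.
    if remaining_characters is not None:
        report["exceeds_quota"] = needed > remaining_characters
    return report
```

For podcasts, the relevant text is the generated conversation script rather than the source document, so the estimate should be run on the script's combined turn text.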