
assemblyai

Install in Claude Code
git clone --depth 1 https://github.com/TerminalSkills/skills /tmp/assemblyai && cp -r /tmp/assemblyai/skills/assemblyai ~/.claude/skills/assemblyai
Then open a new Claude Code session; the skill loads automatically.

SKILL.md

# AssemblyAI

## Overview

AssemblyAI provides speech recognition plus an audio intelligence layer: speaker diarization, sentiment analysis, auto chapters, content moderation, and LeMUR (LLM-powered Q&A over audio). Use it to turn audio and video files into structured, queryable data.

## Setup

```bash
pip install assemblyai python-dotenv
export ASSEMBLYAI_API_KEY="your_api_key_here"
```
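
Because the key is read from the environment, a missing or empty variable tends to surface later as a bare `KeyError` or an authentication failure. A minimal sketch of a fail-fast check, using only the standard library; `require_api_key` is a hypothetical helper name and is not part of the AssemblyAI SDK:

```python
import os


def require_api_key(var_name: str = "ASSEMBLYAI_API_KEY") -> str:
    """Return the API key from the environment or raise a descriptive error.

    Hypothetical helper for illustration; not part of the AssemblyAI SDK.
    """
    # Treat an unset variable and a whitespace-only value the same way.
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"{var_name} is not set. Export it before running, e.g. "
            f'export {var_name}="your_api_key_here"'
        )
    return key
```

The returned value could be assigned to `aai.settings.api_key` in place of a direct `os.environ[...]` lookup.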

## Core Concepts

- **Transcript**: The async job that converts audio → text. Submit a URL or file, poll for completion.
- **Audio Intelligence**: Optional enrichments added to the transcript request (diarization, sentiment, chapters, etc.).
- **LeMUR**: Apply LLMs to your transcript — summarize, answer questions, extract structured data.
- **Real-time**: Stream audio via WebSocket for live transcription.
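
The submit-then-poll lifecycle described for a Transcript can be sketched generically. The example below does not call AssemblyAI: `poll_until_done` is a hypothetical helper, `fetch_status` stands in for whatever call returns the job's current state, and the status strings are assumed to follow the queued / processing / completed / error pattern. In the Python SDK, `Transcriber.transcribe` already waits for the job to finish, so a loop like this is only relevant when working with a job API directly.

```python
import time
from typing import Callable


def poll_until_done(
    fetch_status: Callable[[], str],
    interval_s: float = 3.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch_status() repeatedly until it reports a terminal state.

    Returns the terminal status ("completed" or "error").
    Raises TimeoutError if no terminal state is seen within timeout_s.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        status = fetch_status()
        if status in ("completed", "error"):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still '{status}' after {timeout_s}s")
        sleep(interval_s)


# Simulated job: queued, then processing twice, then completed.
states = iter(["queued", "processing", "processing", "completed"])
final = poll_until_done(lambda: next(states), sleep=lambda _: None)
print(final)  # completed
```

Injecting `sleep` keeps the loop testable without real delays; in real use the default `time.sleep` applies.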

## Instructions

### Step 1: Initialize the client

```python
import assemblyai as aai
import os

aai.settings.api_key = os.environ["ASSEMBLYAI_API_KEY"]
```

### Step 2: Transcribe a file (basic)

```python
def transcribe(audio_source: str) -> aai.Transcript:
    """
    audio_source: URL (https://...) or local file path.
    Returns the completed Transcript object.
    """
    transcriber = aai.Transcriber()
    transcript = transcriber.transcribe(audio_source)

    if transcript.status == aai.TranscriptStatus.error:
        raise RuntimeError(f"Transcription error: {transcript.error}")

    print(f"Transcript ID: {transcript.id}")
    print(f"Text (first 300 chars): {transcript.text[:300]}...")
    return transcript

t = transcribe("https://assembly.ai/sports_injuries.mp3")
print(t.text)
```
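
Since `audio_source` may be either a URL or a local path, a small pre-flight check can catch a mistyped path before anything is uploaded. This is a sketch using only the standard library; `check_audio_source` is a hypothetical helper, not an SDK function, and it only checks the shape of the input, not whether the URL is reachable or the file is valid audio.

```python
from pathlib import Path
from urllib.parse import urlparse


def check_audio_source(audio_source: str) -> str:
    """Return "url" or "file", raising early if the source is unusable."""
    parsed = urlparse(audio_source)
    # Accept only http(s) URLs that actually include a host.
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return "url"
    # Anything else is treated as a local path and must exist.
    if Path(audio_source).is_file():
        return "file"
    raise FileNotFoundError(
        f"Not an http(s) URL or an existing file: {audio_source}"
    )
```

Calling it at the top of `transcribe` would turn a missing file into an immediate `FileNotFoundError` with the offending path in the message.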

### Step 3: Transcribe with full audio intelligence

```python
def transcribe_rich(audio_source: str) -> aai.Transcript:
    """Transcribe with speaker labels, sentiment, chapters, and content safety."""
    config = aai.TranscriptionConfig(
        speaker_labels=True,         # Who said what
        sentiment_analysis=True,     # Positive/negative/neutral per sentence
        auto_chapters=True,          # Generate chapter markers
        content_safety=True,         # Detect profanity, hate speech, etc.
        auto_highlights=True,        # Key phrases and topics
        entity_detection=True,       # People, places, organizations
        iab_categories=True,         # Topic taxonomy
        language_detection=True      # Detect language automatically
    )
    transcriber = aai.Transcriber()
    transcript = transcriber.transcribe(audio_source, config=config)

    if transcript.status == aai.TranscriptStatus.error:
        raise RuntimeError(transcript.error)
    return transcript

t = transcribe_rich("https://your-audio.com/podcast.mp3")

# Speaker diarization
print("\n--- Speakers ---")
for utt in t.utterances:
    print(f"[{utt.speaker}] {utt.text}")

# Chapters
print("\n--- Chapters ---")
for ch in t.chapters:
    start_min = ch.start // 60000
    print(f"[{start_min}m] {ch.headline}: {ch.summary}")

# Sentiment
print("\n--- Sentiment ---")
for s in t.sentiment_analysis[:5]:
    print(f"{s.sentiment.value}: {s.text[:80]}")

# Content safety: results is a list of flagged segments, each with its own labels
print("\n--- Content Safety ---")
for result in t.content_safety.results:
    for label in result.labels:
        print(f"Flagged: {label.label} (confidence: {label.confidence:.2f})")
```
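
Diarized output is often more useful once aggregated. The sketch below totals talk time per speaker and formats millisecond offsets for display. It assumes each utterance exposes `speaker`, `start`, and `end`, with times in milliseconds like the chapter `start` used above; `Utt` is a stand-in dataclass so the example runs without the SDK, and `format_ms` and `talk_time_ms` are hypothetical helper names.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Utt:
    """Stand-in for an utterance: speaker label plus start/end in milliseconds."""
    speaker: str
    start: int
    end: int


def format_ms(ms: int) -> str:
    """Format a millisecond offset as M:SS (or H:MM:SS past one hour)."""
    total_s = ms // 1000
    h, rem = divmod(total_s, 3600)
    m, s = divmod(rem, 60)
    return f"{h}:{m:02d}:{s:02d}" if h else f"{m}:{s:02d}"


def talk_time_ms(utterances) -> dict:
    """Sum (end - start) per speaker, in milliseconds."""
    totals = defaultdict(int)
    for u in utterances:
        totals[u.speaker] += u.end - u.start
    return dict(totals)


# Made-up sample data for illustration only.
sample = [Utt("A", 0, 65_000), Utt("B", 65_000, 80_000), Utt("A", 80_000, 95_500)]
for speaker, ms in talk_time_ms(sample).items():
    print(f"Speaker {speaker}: {format_ms(ms)}")
# Speaker A: 1:20
# Speaker B: 0:15
```

With a real transcript, `talk_time_ms(t.utterances)` would be the equivalent call, provided the utterance objects carry those three attributes.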

### Step 4: Real-time streaming transcription

```python
import assemblyai as aai
import pyaudio  # pip install pyaudio

def on_open(session_opened: aai.RealtimeSessionOpened):
    print(f"Session opened: {session_opened.session_id}")

def on_data(transcript: aai.RealtimeTranscript):
    if not transcript.text:
        return
    if isinstance(transcript, aai.RealtimeFinalTranscript):
        print(f"\n[FINAL] {transcript.text}")
    else:
        print(f"\r[partial] {transcript.text}", end="")

def on_error(error: aai.RealtimeError):
    print(f"Error: {error}")

def on_close():
    print("Session closed.")

def stream_microphone():
    """Stream microphone input to AssemblyAI for real-time transcription."""
    transcriber = aai.RealtimeTranscriber(
        sample_rate=16_000,
        on_data=on_data,
        on_error=on_error,
        on_open=on_open,
        on_close=on_close,
        end_utterance_silence_threshold=700
    )
    transcriber.connect()

    FRAMES_PER_BUFFER = 3200
    FORMAT = pyaudio.paInt16
    CHANNELS = 1
    RATE = 16_000

    p = pyaudio.PyAudio()
    stream = p.open(format=FORMAT, channels=CHANNELS, rate=RATE,
                    input=True, frames_per_buffer=FRAMES_PER_BUFFER)
    try:
        print("Recording... Press Ctrl+C to stop.")
        while True:
            data = stream.read(FRAMES_PER_BUFFER)
            transcriber.stream(data)
    except KeyboardInterrupt:
        pass
    finally:
        stream.stop_stream()
        stream.close()
        p.terminate()
        transcriber.close()

stream_microphone()
```
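
The buffer size in the streaming example determines how much audio each `transcriber.stream(data)` call carries. The arithmetic, using the constants from that example (16-bit mono PCM at 16 kHz), can be checked as follows; `chunk_stats` is a hypothetical helper name.

```python
def chunk_stats(frames_per_buffer: int, sample_rate: int,
                bytes_per_sample: int = 2, channels: int = 1):
    """Return (duration in ms, size in bytes) of one audio chunk."""
    duration_ms = frames_per_buffer * 1000 // sample_rate
    size_bytes = frames_per_buffer * bytes_per_sample * channels
    return duration_ms, size_bytes


# paInt16 is 16-bit audio, so 2 bytes per sample.
chunk_ms, chunk_bytes = chunk_stats(frames_per_buffer=3200, sample_rate=16_000)
print(f"{chunk_ms} ms per chunk, {chunk_bytes} bytes per chunk")
# 200 ms per chunk, 6400 bytes per chunk
```

Smaller buffers send audio more often at the cost of more messages; larger buffers do the opposite.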

### Step 5: LeMUR — ask questions about audio

```python
def lemur_qa(transcript_id: str, questions: list[str]) -> list[dict]:
    """
    Ask LeMUR questions about a transcript.
    Returns list of {question, answer} dicts.
    """
    transcript = aai.Transcript.get_by_id(transcript_id)
    questions_answers = transcript.lemur.question_answer(
        questions=[
            aai.LemurQuestion(question=q, answer_format="concise")
            for q in questions
        ],
        final_model=aai.LemurModel.claude3_5_sonnet
    )
    results = []
    for qa in questions_answers.response:
        print(f"Q: {qa.question}\nA: {qa.answer}\n")
        results.append({"question": qa.question, "answer": qa.answer})
    return results

# Use LeMUR to extract structured insights
lemur_qa(t.id, [
    "What are the main topics discussed?",
    "List any action items or decisions made.",
    "What is the overall sentiment of the conversation?"
])
```
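
Because `lemur_qa` returns plain `{question, answer}` dicts, the results can be turned into a report without any SDK types. A minimal sketch, assuming that dict shape; `qa_to_markdown` is a hypothetical helper and the sample data is made up for illustration.

```python
def qa_to_markdown(results: list[dict], title: str = "Transcript Q&A") -> str:
    """Render a list of {question, answer} dicts as a Markdown document."""
    lines = [f"# {title}", ""]
    for i, item in enumerate(results, start=1):
        lines.append(f"## {i}. {item['question']}")
        lines.append("")
        lines.append(item["answer"].strip())
        lines.append("")
    # Collapse trailing blank lines into a single final newline.
    return "\n".join(lines).rstrip() + "\n"


# Made-up sample in the same shape lemur_qa returns.
report = qa_to_markdown([
    {"question": "What are the main topics discussed?", "answer": "Placeholder answer."},
])
print(report)
```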

### Step 6: LeMUR summarization

```python
def lemur_summarize(transcript_id: str, context: str = "") -> str:
    """Generate a concise summary of a transcript."""
    trans