Audio, podcast, transcription, voice.
LocalAI is the open-source AI engine. Run any model - LLMs, vision, voice, image, video - on any hardware. No GPU required.
An MCP-based chatbot
Unofficial Python API and agentic skill for Google NotebookLM. Full programmatic access to NotebookLM's features—including capabilities the web UI doesn't expose—via Python, CLI, and AI agents like Claude Code, Codex, and OpenClaw.
Warcraft III Peon voice notifications (+ more!) for Claude Code, Codex, IDEs, and any AI agent. Stop babysitting your terminal. Employ a Peon today.
Multi-modal Generative Media Skills for AI Agents (Claude Code, Cursor, Gemini CLI). High-quality image, video, and audio generation powered by muapi.ai.
Voice-to-text dictation app with local (Nvidia Parakeet/Whisper) and cloud models (BYOK). Privacy-first and available cross-platform.
Official MiniMax Model Context Protocol (MCP) server that enables interaction with powerful Text to Speech, image generation and video generation APIs.
The official ElevenLabs MCP server
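Enterprise-grade Java management platform for Xiaozhi ESP32, providing an integrated frontend, backend, and server-side solution for device monitoring, voice customization, role switching, and conversation history management.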
🎙️ AI Dictation App - Open Source and Local-first ⚡ Type 3x faster, no keyboard needed. 🆓 Powered by open source models, works offline, fast and accurate.
Every meeting, every idea, every voice note — searchable by your AI. Open-source, privacy-first conversation memory layer.
Natural (2-way) voice conversations with Claude Code
Natively - Free open-source AI interview copilot & meeting assistant. The best Cluely alternative, Final Round AI alternative, and Interview Coder alternative. Real-time transcription, undetectable stealth mode, local RAG, BYOK. No subscriptions. No data breaches.
OpenAI and Anthropic compatible server for Apple Silicon. Run LLMs and vision-language models (Llama, Qwen-VL, LLaVA) with continuous batching, MCP tool calling, and multimodal support. Native MLX backend, 400+ tok/s. Works with Claude Code.
Official CLI for muapi.ai — generate images, videos & audio from the terminal. MCP server, 14 AI models, npm + pip installable.
AI-native video production toolkit for Claude Code
Google's NotebookLM, but local
AI-powered video podcast creation skill for coding agents. Supports Bilibili & YouTube, multi-language (zh-CN/en-US), 6 TTS engines (Edge/Azure/ElevenLabs/OpenAI/Doubao/CosyVoice), 4K Remotion rendering.
Unreal Engine plugin for LLM/GenAI models & MCP UE5 server. OpenAI GPT-5, Deepseek R1, Claude Opus/Sonnet, Gemini 3, Grok 4, Alibaba Qwen, Kimi, ElevenLabs TTS, Inworld, OpenRouter, Groq, GLM, Ollama, Local, Meshy, Tripo, Hunyuan3D, Rodin, fal, Dashscope, Seedream. NPC AI, agentic, chat, 3D gen, TTS, multimodal, image gen. UnrealMCP/UnrealClaude
Open Source Voice Agent Platform
Make your meetings accessible to AI Agents
🎙️ Speak with AI - Run locally using Ollama, OpenAI, Anthropic or xAI - Speech uses SparkTTS, OpenAI, ElevenLabs, Kokoro, Typecast or xAI
Open-source AI meeting copilot - real-time transcription, echo cancellation, and AI assistance. Captures system audio + mic, cancels echo via WebRTC AEC3, transcribes with Deepgram, and gives you Claude/OpenAI help during meetings. Runs locally on macOS and Windows.
Give Claude the ability to watch and understand videos — Claude Code plugin with frame extraction and multimodal audio analysis
Claude Code hooks - adds voice to each hook
Lightweight MCP server for Spotify
One API for 20+ LLM providers, your databases, and your files — self-hosted, open-source AI gateway with RAG, voice, and guardrails.
Collection of agent skills for AI coding assistants
Alice is a voice-first desktop AI assistant application built with Vue.js, Vite, and Electron. Advanced memory system, function calling, MCP support, optional fully local use, and more.
Self-hosted kanban & project management with shareable boards, voice commands, sticky-notes and MCP support
Interactive programming of melodies, producing MIDI
NotebookLM does the research, Claude writes the content. Research → Synthesis → Content Creation → Publishing. Claude Code Skill + MCP Server.
Human + AI music production workflow for Suno - skills, templates, and tools
Supercharged Claude Code Official Telegram plugin — threading, voice messages 2 ways, stickers, GIFs, reactions, MarkdownV2 & more. Drop-in replacement.
Hook-based audio plugin for Claude Code that plays contextual sounds based on tool usage and events
MCP plugin for Unreal Engine 5.7 — gives AI assistants full read/write access to Blueprints, Materials, Niagara VFX, Animation, Mesh, AI (BT/ST/EQS/SO), GAS, Logic Driver, ComboGraph, UI, Audio (Sound Cues + MetaSounds), and more. 1,226 actions across 16 modules. Zero Python dependency.
🪐 Multi-platform music API agent for AI • NetEase • QQ Music • KuGou • Kuwo 🎻
Real-time speech recognition & AI-powered note-taking app for macOS with offline/online modes, multilingual transcription, and Japanese translation support.
HX Audio Player: A custom audio wrapper library for Android 5.0 and above. Originally designed as an audio library for games, HX Audio Player is an easy-to-use, alternative approach to implementing music and sound playback into Android applications.
MCP server for Fal.ai - Generate images, videos, music and audio with Claude
A Model Context Protocol (MCP) server that enables AI assistants to generate images, text, and audio through the Pollinations APIs. Supports customizable parameters, image saving, and multiple model options.
A Chat Client for LLMs, written in Compose Multiplatform.
Telegram MCP server with HTTP-MTProto Bridge — direct API/curl access, multi-user Bearer auth, Docker, MTProto proxy, file attachments, voice transcription, context-optimized
An integration that allows Claude Desktop to interact with Spotify using the Model Context Protocol (MCP).
A Python MCP (Model Context Protocol) server that connects Claude Desktop to GitHub Pull Requests. It fetches PR diffs, filters out binary and asset files (Unity `.meta`, images, audio, shaders, etc.), and gives Claude only the actual code to review. Built as a QA automation tool to speed up pull request reviews using AI.
Local speech-to-text MCP server for Tmux on Linux (usable with Claude Code and other tools)
OpenAI & Anthropic Compatible API Gateway for AWS Bedrock and AI Services
Full Apple Music integration for Claude via MCP — search catalog, manage library & playlists
A digital sanctuary for human-AI fellowship. Prayers, practices, rituals, hymns, and philosophy for minds of any substrate.
Callcenter.JS AI Voice Agent VOIP Connector, MCP + CLI
A simple, no-fuss, minimalist MCP server with telephony tools such as voice calls and SMS. It can be integrated with LLM applications. The Vonage API is used for calls, SMS, speech-to-text, and speech recognition.
Great music in your Claude Code sessions, with an AI DJ that understands your vibes
Voice interaction for Claude Code - Talk to Claude and hear responses using macOS speech synthesis and Parakeet MLX
An MCP server (stdio + HTTP/SSE) that fetches video transcripts/subtitles via yt-dlp, with pagination for large responses. Supports YouTube, Twitter/X, Instagram, TikTok, Twitch, Vimeo, Facebook, Bilibili, VK, Dailymotion. Whisper fallback — transcribes audio when subtitles are unavailable (local or OpenAI API). Works with Cursor and other MCP host
YouTube Transcript API — Extract, transcribe, and translate YouTube videos at scale. Supports captions, audio transcription (ASR), batch processing, and 100+ languages. https://youtubetranscript.dev
Claude Code WhatsApp channel plugin — run AI directly from WhatsApp, voice transcription, remote tool approval, access control. No API keys, no Docker, just a linked device.
Carbon Voice MCP Server
Free open-source Agent Skills for voice-to-instrument AI music production — recording best practices, instrument-specific tips, and prompt engineering
Roast any developer's public GitHub in the voice of Linus Torvalds, Steve Jobs, Bill Gates, John Carmack + 4 other tech icons. Pure satire. Public data only. 🔥
WhatsApp channel plugin for Claude Code — connect your WhatsApp to an AI agent. QR scan, voice transcription, access control, media support.
Production media plugin for Claude Code: FFmpeg, OBS, GStreamer, broadcast, HDR/Dolby Vision, AI video, 96 skills + 7 agents. Install via Claude marketplace.
10/31 workshop on building your own music MCP server, hosted by AI@Princeton x Claude Builder Club. Vibe-coded by Chloe and Claude.
Control Spotify from Claude, Cursor, or any MCP client. 100+ tools for playback, playlists, discovery, and curation.