LocalAI is the open-source AI engine. Run any model - LLMs, vision, voice, image, video - on any hardware. No GPU required.
LocalAI is a self-hosted, open-source AI inference engine written in Go that exposes OpenAI, Anthropic, and ElevenLabs compatible REST APIs, letting developers run LLMs, vision models, speech recognition, text-to-speech, image generation, and video models entirely on local hardware without mandatory GPU. Each modality delegates to a specialist backend such as llama.cpp, vLLM, MLX, whisper.cpp, stable-diffusion, or kokoro, and those backends are pulled as separate container images only when a matching model is loaded, so unused capabilities consume no disk space. Models can be pulled from the built-in gallery, Hugging Face, Ollama's OCI registry, or standard Docker registries via a single CLI command. The project connects to the Claude ecosystem through its Anthropic API compatibility layer and built-in MCP support, meaning Claude Code or any MCP-aware client can route requests through a LocalAI instance instead of Anthropic's cloud. Built-in autonomous agents support tool use and RAG. Multi-user deployments get API key authentication, per-user quotas, and role-based access controls, making the project relevant to privacy-conscious teams, on-premises enterprise deployments, and developers building Claude-compatible applications without sending data off-site.
Self-hosted OpenAI-compatible inference engine that runs LLMs, vision, voice and image models locally without a GPU.
- ✓Open-source license (MIT)
- ✓Actively maintained (<30d)
- ✓Healthy fork ratio
- ✓Clear description
- ✓Topics declared
- ✓Mature repo (>1y old)
git clone https://github.com/mudler/LocalAI && cp LocalAI/*.md ~/.claude/agents/Resumen de Subagents
<h1 align="center"> <br> <img width="300" src="./core/http/static/logo.png"> <br> <br> </h1> <p align="center"> <a href="https://github.com/go-skynet/LocalAI/stargazers" target="blank"> <img src="https://img.shields.io/github/stars/go-skynet/LocalAI?style=for-the-badge" alt="LocalAI stars"/> </a> <a href='https://github.com/go-skynet/LocalAI/releases'> <img src='https://img.shields.io/github/release/go-skynet/LocalAI?&label=Latest&style=for-the-badge'> </a> <a href="LICENSE" target="blank"> <img src="https://img.shields.io/badge/License-MIT-yellow.svg?style=for-the-badge" alt="LocalAI License"/> </a> </p> <p align="center"> <a href="https://twitter.com/LocalAI_API" target="blank"> <img src="https://img.shields.io/badge/X-%23000000.svg?style=for-the-badge&logo=X&logoColor=white&label=LocalAI_API" alt="Follow LocalAI_API"/> </a> <a href="https://discord.gg/uJAeKSAGDy" target="blank"> <img src="https://img.shields.io/badge/dynamic/json?color=blue&label=Discord&style=for-the-badge&query=approximate_member_count&url=https%3A%2F%2Fdiscordapp.com%2Fapi%2Finvites%2FuJAeKSAGDy%3Fwith_counts%3Dtrue&logo=discord" alt="Join LocalAI Discord Community"/> </a> </p> <p align="center"> <a href="https://trendshift.io/repositories/5539" target="_blank"><img src="https://trendshift.io/api/badge/repositories/5539" alt="mudler%2FLocalAI | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a> </p> **LocalAI** is the open-source AI engine. Run any model - LLMs, vision, voice, image, video - on any hardware. No GPU required. **A small core, not a bundle.** Each backend wraps a best-in-class engine (llama.cpp, vLLM, whisper.cpp, stable-diffusion, MLX...) in its own image, pulled only when a model needs it. You install nothing you don't use. - **Composable by design**: backends are separate and pulled on demand, so you install only what your model needs - **Open and extensible**: load any model, or build your own backend in any language against an open interface - **Drop-in API compatibility**: OpenAI, Anthropic, and ElevenLabs APIs across every backend - **Any model, any modality**: LLMs, vision, voice, image, and video behind one API - **Any hardware**: NVIDIA, AMD, Intel, Apple Silicon, Vulkan, or CPU-only - **Multi-user ready**: API key auth, user quotas, role-based access - **Built-in AI agents**: autonomous agents with tool use, RAG, MCP, and skills - **Privacy-first**: your data never leaves your infrastructure  Created by [Ettore Di Giacinto](https://github.com/mudler) and maintained by the [LocalAI team](#team). > [:book: Documentation](https://localai.io/) | [:speech_balloon: Discord](https://discord.gg/uJAeKSAGDy) | [💻 Quickstart](https://localai.io/basics/getting_started/) | [🖼️ Models](https://models.localai.io/) | [❓FAQ](https://localai.io/faq/) ## Guided tour https://github.com/user-attachments/assets/08cbb692-57da-48f7-963d-2e7b43883c18 <details> <summary> Click to see more! </summary> #### User and auth https://github.com/user-attachments/assets/228fa9ad-81a3-4d43-bfb9-31557e14a36c #### Agents https://github.com/user-attachments/assets/6270b331-e21d-4087-a540-6290006b381a #### Usage metrics per user https://github.com/user-attachments/assets/cbb03379-23b4-4e3d-bd26-d152f057007f #### Fine-tuning and Quantization https://github.com/user-attachments/assets/5ba4ace9-d3df-4795-b7d4-b0b404ea71ee #### WebRTC https://github.com/user-attachments/assets/ed88e34c-fed3-4b83-8a67-4716a9feeb7b </details> ## Quickstart ### macOS <a href="https://github.com/mudler/LocalAI/releases/latest/download/LocalAI.dmg"> <img src="https://img.shields.io/badge/Download-macOS-blue?style=for-the-badge&logo=apple&logoColor=white" alt="Download LocalAI for macOS"/> </a> > **Note:** The DMG is not signed by Apple. After installing, run: `sudo xattr -d com.apple.quarantine /Applications/LocalAI.app`. See [#6268](https://github.com/mudler/LocalAI/issues/6268) for details. ### Containers (Docker, podman, ...) > Already ran LocalAI before? Use `docker start -i local-ai` to restart an existing container. #### CPU only: ```bash docker run -ti --name local-ai -p 8080:8080 localai/localai:latest ``` #### NVIDIA GPU: ```bash # CUDA 13 docker run -ti --name local-ai -p 8080:8080 --gpus all localai/localai:latest-gpu-nvidia-cuda-13 # CUDA 12 docker run -ti --name local-ai -p 8080:8080 --gpus all localai/localai:latest-gpu-nvidia-cuda-12 # NVIDIA Jetson ARM64 (CUDA 12, for AGX Orin and similar) docker run -ti --name local-ai -p 8080:8080 --gpus all localai/localai:latest-nvidia-l4t-arm64 # NVIDIA Jetson ARM64 (CUDA 13, for DGX Spark) docker run -ti --name local-ai -p 8080:8080 --gpus all localai/localai:latest-nvidia-l4t-arm64-cuda-13 ``` #### AMD GPU (ROCm): ```bash docker run -ti --name local-ai -p 8080:8080 --device=/dev/kfd --device=/dev/dri --group-add=video localai/localai:latest-gpu-hipblas ``` #### Intel GPU (oneAPI): ```bash docker run -ti --name local-ai -p 8080:8080 --device=/dev/dri/card1 --device=/dev/dri/renderD128 localai/localai:latest-gpu-intel ``` #### Vulkan GPU: ```bash docker run -ti --name local-ai -p 8080:8080 localai/localai:latest-gpu-vulkan ``` ### Loading models ```bash # From the model gallery (see available models with `local-ai models list` or at https://models.localai.io) local-ai run llama-3.2-1b-instruct:q4_k_m # From Huggingface local-ai run huggingface://TheBloke/phi-2-GGUF/phi-2.Q8_0.gguf # From the Ollama OCI registry local-ai run ollama://gemma:2b # From a YAML config local-ai run https://gist.githubusercontent.com/.../phi-2.yaml # From a standard OCI registry (e.g., Docker Hub) local-ai run oci://localai/phi-2:latest ``` To test a running LocalAI server from the terminal, open an interactive chat session from another shell. Inside the prompt, `/models` lists installed models and `/model <name>` switches between them. ```bash # Terminal 1 local-ai run llama-3.2-1b-instruct:q4_k_m # Terminal 2 local-ai chat --model llama-3.2-1b-instruct:q4_k_m ``` > **Automatic Backend Detection**: LocalAI automatically detects your GPU capabilities and downloads the appropriate backend. For advanced options, see [GPU Acceleration](https://localai.io/features/gpu-acceleration/). For more details, see the [Getting Started guide](https://localai.io/basics/getting_started/). ## Latest News - **May 2026**: **LocalAI 4.3.0** - `llama.cpp` [prompt cache on by default](https://github.com/mudler/LocalAI/pull/9925) (repeated system prompts collapse from minutes to seconds), [keyless cosign signing of backend OCI images](https://github.com/mudler/LocalAI/pull/9823), [per-API-key + per-user usage attribution](https://github.com/mudler/LocalAI/pull/9920), Distributed v3 with [per-request replica routing](https://github.com/mudler/LocalAI/pull/9968). [Release notes](https://github.com/mudler/LocalAI/releases/tag/v4.3.0) - **May 2026**: **LocalAI 4.2.0** - LocalAI sees and hears: [voice recognition](https://github.com/mudler/LocalAI/pull/9500), [face recognition + antispoofing liveness](https://github.com/mudler/LocalAI/pull/9480), speaker diarization. Plus [drop-in Ollama API](https://github.com/mudler/LocalAI/pull/9284), [video generation](https://github.com/mudler/LocalAI/pull/9420), redesigned UI with i18n + admin-configurable branding, vLLM at feature parity with llama.cpp, and 11 new backends. [Release notes](https://github.com/mudler/LocalAI/releases/tag/v4.2.0) - **April 2026**: **LocalAI 4.1.0** - LocalAI becomes a control tower: distributed cluster mode with VRAM-aware smart routing + autoscaling, multi-user platform with OIDC and API keys, per-user quotas with predictive analytics, in-UI fine-tuning with TRL (auto-export to GGUF), on-the-fly quantization backend, visual pipeline editor. [Release notes](https://github.com/mudler/LocalAI/releases/tag/v4.1.0) - **March 2026**: **LocalAI 4.0.0** - native agentic orchestration with the new [Agenthub](https://agenthub.localai.io) community hub, full React UI rewrite with Canvas mode, [MCP Apps + client-side](https://github.com/mudler/LocalAI/pull/8947) with tool streaming, [WebRTC realtime audio](https://github.com/mudler/LocalAI/pull/8790), [MLX-distributed](https://github.com/mudler/LocalAI/pull/8801). [Release notes](https://github.com/mudler/LocalAI/releases/tag/v4.0.0) - **February 2026**: [Realtime API for audio-to-audio with tool calling](https://github.com/mudler/LocalAI/pull/6245), [ACE-Step 1.5 support](https://github.com/mudler/LocalAI/pull/8396) - **January 2026**: **LocalAI 3.10.0** — Anthropic API support, Open Responses API, video & image generation (LTX-2), unified GPU backends, tool streaming, Moonshine, Pocket-TTS. [Release notes](https://github.com/mudler/LocalAI/releases/tag/v3.10.0) - **December 2025**: [Dynamic Memory Resource reclaimer](https://github.com/mudler/LocalAI/pull/7583), [Automatic multi-GPU model fitting (llama.cpp)](https://github.com/mudler/LocalAI/pull/7584), [Vibevoice backend](https://github.com/mudler/LocalAI/pull/7494) - **November 2025**: [Import models via URL](https://github.com/mudler/LocalAI/pull/7245), [Multiple chats and history](https://github.com/mudler/LocalAI/pull/7325) - **October 2025**: [Model Context Protocol (MCP)](https://localai.io/docs/features/mcp/) support for agentic capabilities - **September 2025**: New Launcher for macOS and Linux, extended backend support for Mac and Nvidia L4T, MLX-Audio, WAN 2.2 - **August 2025**: MLX, MLX-VLM, Diffusers, llama.cpp now supported on Apple Silicon - **July 2025**: All backends migrated outside the main binary — [lightweight, modular architecture](https://github.com/mudler/LocalAI/releases/tag/v3.2.0) For older news and full release notes, see [GitHub Releases](https://github.com/mudler/LocalAI/releases) and the [News
Lo que la gente pregunta sobre LocalAI
¿Qué es mudler/LocalAI?
+
mudler/LocalAI es subagents para el ecosistema de Claude AI. LocalAI is the open-source AI engine. Run any model - LLMs, vision, voice, image, video - on any hardware. No GPU required. Tiene 46.8k estrellas en GitHub y se actualizó por última vez today.
¿Cómo se instala LocalAI?
+
Puedes instalar LocalAI clonando el repositorio (https://github.com/mudler/LocalAI) o siguiendo las instrucciones del README en GitHub. ClaudeWave también te ofrece bloques de instalación rápida en esta misma página.
¿Es seguro usar mudler/LocalAI?
+
Nuestro agente de seguridad ha analizado mudler/LocalAI y le ha asignado un Trust Score de 93/100 (tier: Verified). Revisa el desglose completo de comprobaciones superadas y flags en esta página.
¿Quién mantiene mudler/LocalAI?
+
mudler/LocalAI es mantenido por mudler. La última actividad registrada en GitHub es de today, con 196 issues abiertos.
¿Hay alternativas a LocalAI?
+
Sí. En ClaudeWave puedes explorar subagents similares en /categories/agents, ordenados por popularidad o actividad reciente.
Despliega LocalAI en tu cloud
Lleva este repo a producción en minutos. Cada plataforma genera su propio entorno con variables de entorno editables.
¿Mantienes este repo? Añade un badge a tu README
Pega el badge en tu README de GitHub para mostrar que está auditado por ClaudeWave. Cada badge enlaza de vuelta a esta página y muestra el Trust Score actual.
[](https://claudewave.com/repo/mudler-localai)<a href="https://claudewave.com/repo/mudler-localai"><img src="https://claudewave.com/api/badge/mudler-localai" alt="Featured on ClaudeWave: mudler/LocalAI" width="320" height="64" /></a>Más Subagents
The agent harness performance optimization system. Skills, instincts, memory, security, and research-first development for Claude Code, Codex, Opencode, Cursor and beyond.
The agent that grows with you
Java 面试 & 后端通用面试指南,覆盖计算机基础、数据库、分布式、高并发、系统设计与 AI 应用开发
Production-ready platform for agentic workflow development.
The agent engineering platform.
🤯 LobeHub is your Chief Agent Operator, organizing your agents into 7×24 operations by hiring, scheduling, and reporting on your entire AI team.