LocalAI is the open-source AI engine. Run any model - LLMs, vision, voice, image, video - on any hardware. No GPU required.
LocalAI is a self-hosted, open-source AI inference engine written in Go that exposes OpenAI, Anthropic, and ElevenLabs compatible REST APIs, letting developers run LLMs, vision models, speech recognition, text-to-speech, image generation, and video models entirely on local hardware without mandatory GPU. Each modality delegates to a specialist backend such as llama.cpp, vLLM, MLX, whisper.cpp, stable-diffusion, or kokoro, and those backends are pulled as separate container images only when a matching model is loaded, so unused capabilities consume no disk space. Models can be pulled from the built-in gallery, Hugging Face, Ollama's OCI registry, or standard Docker registries via a single CLI command. The project connects to the Claude ecosystem through its Anthropic API compatibility layer and built-in MCP support, meaning Claude Code or any MCP-aware client can route requests through a LocalAI instance instead of Anthropic's cloud. Built-in autonomous agents support tool use and RAG. Multi-user deployments get API key authentication, per-user quotas, and role-based access controls, making the project relevant to privacy-conscious teams, on-premises enterprise deployments, and developers building Claude-compatible applications without sending data off-site.
Self-hosted OpenAI-compatible inference engine that runs LLMs, vision, voice and image models locally without a GPU.
- ✓Open-source license (MIT)
- ✓Actively maintained (<30d)
- ✓Healthy fork ratio
- ✓Clear description
- ✓Topics declared
- ✓Mature repo (>1y old)
git clone https://github.com/mudler/LocalAI && cp LocalAI/*.md ~/.claude/agents/Subagents overview
<h1 align="center"> <br> <img width="300" src="./core/http/static/logo.png"> <br> <br> </h1> <p align="center"> <a href="https://github.com/go-skynet/LocalAI/stargazers" target="blank"> <img src="https://img.shields.io/github/stars/go-skynet/LocalAI?style=for-the-badge" alt="LocalAI stars"/> </a> <a href='https://github.com/go-skynet/LocalAI/releases'> <img src='https://img.shields.io/github/release/go-skynet/LocalAI?&label=Latest&style=for-the-badge'> </a> <a href="LICENSE" target="blank"> <img src="https://img.shields.io/badge/License-MIT-yellow.svg?style=for-the-badge" alt="LocalAI License"/> </a> </p> <p align="center"> <a href="https://twitter.com/LocalAI_API" target="blank"> <img src="https://img.shields.io/badge/X-%23000000.svg?style=for-the-badge&logo=X&logoColor=white&label=LocalAI_API" alt="Follow LocalAI_API"/> </a> <a href="https://discord.gg/uJAeKSAGDy" target="blank"> <img src="https://img.shields.io/badge/dynamic/json?color=blue&label=Discord&style=for-the-badge&query=approximate_member_count&url=https%3A%2F%2Fdiscordapp.com%2Fapi%2Finvites%2FuJAeKSAGDy%3Fwith_counts%3Dtrue&logo=discord" alt="Join LocalAI Discord Community"/> </a> </p> <p align="center"> <a href="https://trendshift.io/repositories/5539" target="_blank"><img src="https://trendshift.io/api/badge/repositories/5539" alt="mudler%2FLocalAI | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a> </p> **LocalAI** is the open-source AI engine. Run any model - LLMs, vision, voice, image, video - on any hardware. No GPU required. **A small core, not a bundle.** Each backend wraps a best-in-class engine (llama.cpp, vLLM, whisper.cpp, stable-diffusion, MLX...) in its own image, pulled only when a model needs it. You install nothing you don't use. - **Composable by design**: backends are separate and pulled on demand, so you install only what your model needs - **Open and extensible**: load any model, or build your own backend in any language against an open interface - **Drop-in API compatibility**: OpenAI, Anthropic, and ElevenLabs APIs across every backend - **Any model, any modality**: LLMs, vision, voice, image, and video behind one API - **Any hardware**: NVIDIA, AMD, Intel, Apple Silicon, Vulkan, or CPU-only - **Multi-user ready**: API key auth, user quotas, role-based access - **Built-in AI agents**: autonomous agents with tool use, RAG, MCP, and skills - **Privacy-first**: your data never leaves your infrastructure  Created by [Ettore Di Giacinto](https://github.com/mudler) and maintained by the [LocalAI team](#team). > [:book: Documentation](https://localai.io/) | [:speech_balloon: Discord](https://discord.gg/uJAeKSAGDy) | [💻 Quickstart](https://localai.io/basics/getting_started/) | [🖼️ Models](https://models.localai.io/) | [❓FAQ](https://localai.io/faq/) ## Guided tour https://github.com/user-attachments/assets/08cbb692-57da-48f7-963d-2e7b43883c18 <details> <summary> Click to see more! </summary> #### User and auth https://github.com/user-attachments/assets/228fa9ad-81a3-4d43-bfb9-31557e14a36c #### Agents https://github.com/user-attachments/assets/6270b331-e21d-4087-a540-6290006b381a #### Usage metrics per user https://github.com/user-attachments/assets/cbb03379-23b4-4e3d-bd26-d152f057007f #### Fine-tuning and Quantization https://github.com/user-attachments/assets/5ba4ace9-d3df-4795-b7d4-b0b404ea71ee #### WebRTC https://github.com/user-attachments/assets/ed88e34c-fed3-4b83-8a67-4716a9feeb7b </details> ## Quickstart ### macOS <a href="https://github.com/mudler/LocalAI/releases/latest/download/LocalAI.dmg"> <img src="https://img.shields.io/badge/Download-macOS-blue?style=for-the-badge&logo=apple&logoColor=white" alt="Download LocalAI for macOS"/> </a> > **Note:** The DMG is not signed by Apple. After installing, run: `sudo xattr -d com.apple.quarantine /Applications/LocalAI.app`. See [#6268](https://github.com/mudler/LocalAI/issues/6268) for details. ### Containers (Docker, podman, ...) > Already ran LocalAI before? Use `docker start -i local-ai` to restart an existing container. #### CPU only: ```bash docker run -ti --name local-ai -p 8080:8080 localai/localai:latest ``` #### NVIDIA GPU: ```bash # CUDA 13 docker run -ti --name local-ai -p 8080:8080 --gpus all localai/localai:latest-gpu-nvidia-cuda-13 # CUDA 12 docker run -ti --name local-ai -p 8080:8080 --gpus all localai/localai:latest-gpu-nvidia-cuda-12 # NVIDIA Jetson ARM64 (CUDA 12, for AGX Orin and similar) docker run -ti --name local-ai -p 8080:8080 --gpus all localai/localai:latest-nvidia-l4t-arm64 # NVIDIA Jetson ARM64 (CUDA 13, for DGX Spark) docker run -ti --name local-ai -p 8080:8080 --gpus all localai/localai:latest-nvidia-l4t-arm64-cuda-13 ``` #### AMD GPU (ROCm): ```bash docker run -ti --name local-ai -p 8080:8080 --device=/dev/kfd --device=/dev/dri --group-add=video localai/localai:latest-gpu-hipblas ``` #### Intel GPU (oneAPI): ```bash docker run -ti --name local-ai -p 8080:8080 --device=/dev/dri/card1 --device=/dev/dri/renderD128 localai/localai:latest-gpu-intel ``` #### Vulkan GPU: ```bash docker run -ti --name local-ai -p 8080:8080 localai/localai:latest-gpu-vulkan ``` ### Loading models ```bash # From the model gallery (see available models with `local-ai models list` or at https://models.localai.io) local-ai run llama-3.2-1b-instruct:q4_k_m # From Huggingface local-ai run huggingface://TheBloke/phi-2-GGUF/phi-2.Q8_0.gguf # From the Ollama OCI registry local-ai run ollama://gemma:2b # From a YAML config local-ai run https://gist.githubusercontent.com/.../phi-2.yaml # From a standard OCI registry (e.g., Docker Hub) local-ai run oci://localai/phi-2:latest ``` To test a running LocalAI server from the terminal, open an interactive chat session from another shell. Inside the prompt, `/models` lists installed models and `/model <name>` switches between them. ```bash # Terminal 1 local-ai run llama-3.2-1b-instruct:q4_k_m # Terminal 2 local-ai chat --model llama-3.2-1b-instruct:q4_k_m ``` > **Automatic Backend Detection**: LocalAI automatically detects your GPU capabilities and downloads the appropriate backend. For advanced options, see [GPU Acceleration](https://localai.io/features/gpu-acceleration/). For more details, see the [Getting Started guide](https://localai.io/basics/getting_started/). ## Latest News - **May 2026**: **LocalAI 4.3.0** - `llama.cpp` [prompt cache on by default](https://github.com/mudler/LocalAI/pull/9925) (repeated system prompts collapse from minutes to seconds), [keyless cosign signing of backend OCI images](https://github.com/mudler/LocalAI/pull/9823), [per-API-key + per-user usage attribution](https://github.com/mudler/LocalAI/pull/9920), Distributed v3 with [per-request replica routing](https://github.com/mudler/LocalAI/pull/9968). [Release notes](https://github.com/mudler/LocalAI/releases/tag/v4.3.0) - **May 2026**: **LocalAI 4.2.0** - LocalAI sees and hears: [voice recognition](https://github.com/mudler/LocalAI/pull/9500), [face recognition + antispoofing liveness](https://github.com/mudler/LocalAI/pull/9480), speaker diarization. Plus [drop-in Ollama API](https://github.com/mudler/LocalAI/pull/9284), [video generation](https://github.com/mudler/LocalAI/pull/9420), redesigned UI with i18n + admin-configurable branding, vLLM at feature parity with llama.cpp, and 11 new backends. [Release notes](https://github.com/mudler/LocalAI/releases/tag/v4.2.0) - **April 2026**: **LocalAI 4.1.0** - LocalAI becomes a control tower: distributed cluster mode with VRAM-aware smart routing + autoscaling, multi-user platform with OIDC and API keys, per-user quotas with predictive analytics, in-UI fine-tuning with TRL (auto-export to GGUF), on-the-fly quantization backend, visual pipeline editor. [Release notes](https://github.com/mudler/LocalAI/releases/tag/v4.1.0) - **March 2026**: **LocalAI 4.0.0** - native agentic orchestration with the new [Agenthub](https://agenthub.localai.io) community hub, full React UI rewrite with Canvas mode, [MCP Apps + client-side](https://github.com/mudler/LocalAI/pull/8947) with tool streaming, [WebRTC realtime audio](https://github.com/mudler/LocalAI/pull/8790), [MLX-distributed](https://github.com/mudler/LocalAI/pull/8801). [Release notes](https://github.com/mudler/LocalAI/releases/tag/v4.0.0) - **February 2026**: [Realtime API for audio-to-audio with tool calling](https://github.com/mudler/LocalAI/pull/6245), [ACE-Step 1.5 support](https://github.com/mudler/LocalAI/pull/8396) - **January 2026**: **LocalAI 3.10.0** — Anthropic API support, Open Responses API, video & image generation (LTX-2), unified GPU backends, tool streaming, Moonshine, Pocket-TTS. [Release notes](https://github.com/mudler/LocalAI/releases/tag/v3.10.0) - **December 2025**: [Dynamic Memory Resource reclaimer](https://github.com/mudler/LocalAI/pull/7583), [Automatic multi-GPU model fitting (llama.cpp)](https://github.com/mudler/LocalAI/pull/7584), [Vibevoice backend](https://github.com/mudler/LocalAI/pull/7494) - **November 2025**: [Import models via URL](https://github.com/mudler/LocalAI/pull/7245), [Multiple chats and history](https://github.com/mudler/LocalAI/pull/7325) - **October 2025**: [Model Context Protocol (MCP)](https://localai.io/docs/features/mcp/) support for agentic capabilities - **September 2025**: New Launcher for macOS and Linux, extended backend support for Mac and Nvidia L4T, MLX-Audio, WAN 2.2 - **August 2025**: MLX, MLX-VLM, Diffusers, llama.cpp now supported on Apple Silicon - **July 2025**: All backends migrated outside the main binary — [lightweight, modular architecture](https://github.com/mudler/LocalAI/releases/tag/v3.2.0) For older news and full release notes, see [GitHub Releases](https://github.com/mudler/LocalAI/releases) and the [News
What people ask about LocalAI
What is mudler/LocalAI?
+
mudler/LocalAI is subagents for the Claude AI ecosystem. LocalAI is the open-source AI engine. Run any model - LLMs, vision, voice, image, video - on any hardware. No GPU required. It has 46.8k GitHub stars and was last updated today.
How do I install LocalAI?
+
You can install LocalAI by cloning the repository (https://github.com/mudler/LocalAI) or following the README instructions on GitHub. ClaudeWave also provides quick install blocks on this page.
Is mudler/LocalAI safe to use?
+
Our security agent has analyzed mudler/LocalAI and assigned a Trust Score of 93/100 (tier: Verified). See the full breakdown of passed checks and flags on this page.
Who maintains mudler/LocalAI?
+
mudler/LocalAI is maintained by mudler. The last recorded GitHub activity is from today, with 196 open issues.
Are there alternatives to LocalAI?
+
Yes. On ClaudeWave you can browse similar subagents at /categories/agents, sorted by popularity or recent activity.
Deploy LocalAI to your cloud
Ship this repo to production in minutes. Each platform spins up its own environment with editable env vars.
Maintain this repo? Add a badge to your README
Drop the badge into your GitHub README to show it's tracked on ClaudeWave. Each badge links back to this page and reflects the live Trust Score.
[](https://claudewave.com/repo/mudler-localai)<a href="https://claudewave.com/repo/mudler-localai"><img src="https://claudewave.com/api/badge/mudler-localai" alt="Featured on ClaudeWave: mudler/LocalAI" width="320" height="64" /></a>More Subagents
The agent harness performance optimization system. Skills, instincts, memory, security, and research-first development for Claude Code, Codex, Opencode, Cursor and beyond.
The agent that grows with you
Java 面试 & 后端通用面试指南,覆盖计算机基础、数据库、分布式、高并发、系统设计与 AI 应用开发
Production-ready platform for agentic workflow development.
The agent engineering platform.
🤯 LobeHub is your Chief Agent Operator, organizing your agents into 7×24 operations by hiring, scheduling, and reporting on your entire AI team.