ClaudeWave
waybarrios

vllm-mlx

View on GitHub: https://github.com/waybarrios/vllm-mlx

OpenAI- and Anthropic-compatible server for Apple Silicon. Run LLMs and vision-language models (Llama, Qwen-VL, LLaVA) with continuous batching, MCP tool calling, and multimodal support. Native MLX backend delivering 400+ tok/s. Works with Claude Code.
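Because the server exposes an OpenAI-compatible API, any standard OpenAI client can talk to it once it is running locally. The sketch below is illustrative only: the base URL, port, and model identifier are assumptions, not values confirmed by this listing; check the repo README for the actual defaults.

# Minimal sketch: querying a locally running vllm-mlx server through its
# OpenAI-compatible endpoint. Host, port, and model name are assumptions.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # assumed local endpoint
    api_key="not-needed",                 # local servers typically ignore the key
)

response = client.chat.completions.create(
    model="mlx-community/Llama-3.1-8B-Instruct-4bit",  # hypothetical model id
    messages=[{"role": "user", "content": "Summarize what MLX is in one sentence."}],
)
print(response.choices[0].message.content)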

Tools · 829 stars · 188 forks · Python · Apache-2.0 · Updated today
ClaudeWave Trust Score: 97/100 (Verified)
Checks passed:
  • Open-source license (Apache-2.0)
  • Actively maintained (updated within the last 30 days)
  • Healthy fork ratio
  • Clear description
  • Topics declared
Last scanned: 4/14/2026
Install in Claude Desktop
Method detected: Manual
{
  "mcpServers": {
    "vllm-mlx": {
      "command": "node",
      "args": ["/path/to/vllm-mlx/dist/index.js"]
    }
  }
}
1. Copy the snippet above.
2. Paste into ~/Library/Application Support/Claude/claude_desktop_config.json (Mac) or %APPDATA%\Claude\claude_desktop_config.json (Windows).
3. Replace any <placeholder> values with your API keys or paths.
4. Restart Claude Desktop. The MCP server appears automatically.
💡 Clone https://github.com/waybarrios/vllm-mlx and follow its README for install instructions.
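Once the server is up, the Anthropic-compatible side can be exercised with the anthropic Python SDK pointed at the local address. As above, the base URL, port, and model name are assumptions for illustration; the README documents the real launch command and defaults.

# Minimal sketch of the Anthropic-compatible endpoint; the endpoint and model id
# are assumptions, not values confirmed by this listing.
import anthropic

client = anthropic.Anthropic(
    base_url="http://localhost:8000",  # assumed local endpoint
    api_key="not-needed",              # local servers typically ignore the key
)

message = client.messages.create(
    model="mlx-community/Qwen2.5-VL-7B-Instruct-4bit",  # hypothetical model id
    max_tokens=256,
    messages=[{"role": "user", "content": "Reply with OK if you can hear me."}],
)
print(message.content[0].text)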
Use cases
🎬 Media · 🧠 AI / ML · 🎨 Creative

Tools overview

README preview not available. Visit the repo on GitHub for full documentation.
anthropic · apple-silicon · audio-processing · claude-code · computer-vision · image-understanding · inference · llm · machine-learning · macos · mllm · mlx · multimodal-ai · speech-to-text · stt · text-to-speech · tts · video-understanding · vision-language-model · vllm
