The Open-Source Multimodal AI Agent Stack: Connecting Cutting-Edge AI Models and Agent Infra
MCP Servers29.4k stars2.9k forks● TypeScriptApache-2.0Updated 18d ago
ClaudeWave Trust Score
87/100
ByteDance’s open-source multimodal agent stack including a desktop ‘computer-use’ agent that drives the OS and browser.
Passed
- ✓Open-source license (Apache-2.0)
- ✓Actively maintained (<30d)
- ✓Healthy fork ratio
- ✓Clear description
- ✓Topics declared
- ✓Mature repo (>1y old)
Use with caution
Last scanned: 4/14/2026
Install in Claude Desktop
Method detected: NPX · @agent-tars/cli
{
"mcpServers": {
"ui-tars-desktop": {
"command": "npx",
"args": ["-y", "@agent-tars/cli"]
}
}
}1. Copy the snippet above.
2. Paste into
~/Library/Application Support/Claude/claude_desktop_config.json (Mac) or %APPDATA%\Claude\claude_desktop_config.json (Windows).3. Replace any
<placeholder> values with your API keys or paths.4. Restart Claude Desktop. The MCP server appears automatically.
Use cases
🎨 Creative🛠️ Dev Tools⚡ Productivity
About
MCP Servers overview
<picture>
<img alt="Agent TARS Banner" src="./images/tars.png">
</picture>
<br/>
## Introduction
English | [简体中文](./README.zh-CN.md)
[](https://trendshift.io/repositories/13584)
<b>TARS<sup>\*</sup></b> is a Multimodal AI Agent stack, currently shipping two projects: [Agent TARS](#agent-tars) and [UI-TARS-desktop](#ui-tars-desktop):
<table>
<thead>
<tr>
<th width="50%" align="center"><a href="#agent-tars">Agent TARS</a></th>
<th width="50%" align="center"><a href="#ui-tars-desktop">UI-TARS-desktop</a></th>
</tr>
</thead>
<tbody>
<tr>
<td align="center">
<video src="https://github.com/user-attachments/assets/c9489936-afdc-4d12-adda-d4b90d2a869d" width="50%"></video>
</td>
<td align="center">
<video src="https://github.com/user-attachments/assets/e0914ce9-ad33-494b-bdec-0c25c1b01a27" width="50%"></video>
</td>
</tr>
<tr>
<td align="left">
<b>Agent TARS</b> is a general multimodal AI Agent stack, it brings the power of GUI Agent and Vision into your terminal, computer, browser and product.
<br>
<br>
It primarily ships with a <a href="https://agent-tars.com/guide/basic/cli.html" target="_blank">CLI</a> and <a href="https://agent-tars.com/guide/basic/web-ui.html" target="_blank">Web UI</a> for usage.
It aims to provide a workflow that is closer to human-like task completion through cutting-edge multimodal LLMs and seamless integration with various real-world <a href="https://agent-tars.com/guide/basic/mcp.html" target="_blank">MCP</a> tools.
</td>
<td align="left">
<b>UI-TARS Desktop</b> is a desktop application that provides a native GUI Agent based on the <a href="https://github.com/bytedance/UI-TARS" target="_blank">UI-TARS</a> model.
<br>
<br>
It primarily ships a
<a href="https://github.com/bytedance/UI-TARS-desktop/blob/main/docs/quick-start.md#get-model-and-run-local-operator" target="_blank">local</a> and
<a href="https://github.com/bytedance/UI-TARS-desktop/blob/main/docs/quick-start.md#run-remote-operator" target="_blank">remote</a> computer as well as browser operators.
</td>
</tr>
</tbody>
</table>
## Table of Contents
<!-- START doctoc generated TOC please keep comment here to allow auto update -->
<!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE -->
- [News](#news)
- [Agent TARS](#agent-tars)
- [Showcase](#showcase)
- [Core Features](#core-features)
- [Quick Start](#quick-start)
- [Documentation](#documentation)
- [UI-TARS Desktop](#ui-tars-desktop)
- [Showcase](#showcase-1)
- [Features](#features)
- [Quick Start](#quick-start-1)
- [Contributing](#contributing)
- [License](#license)
- [Citation](#citation)
<!-- END doctoc generated TOC please keep comment here to allow auto update -->
## News
- **\[2025-11-05\]** 🎉 We're excited to announce the release of [Agent TARS CLI v0.3.0](https://github.com/bytedance/UI-TARS-desktop/releases/tag/v0.3.0)! This version brings streaming support for multiple tools (shell commands, multi-file structured display), runtime settings with timing statistics for tool calls and deep thinking, Event Stream Viewer for data flow tracking and debugging. Additionally, it features exclusive support for [AIO agent Sandbox](https://github.com/agent-infra/sandbox) as isolated all-in-one tools execution environment.
- **\[2025-06-25\]** We released an Agent TARS Beta and Agent TARS CLI - [Introducing Agent TARS Beta](https://agent-tars.com/blog/2025-06-25-introducing-agent-tars-beta.html), a multimodal AI agent that aims to explore a work form that is closer to human-like task completion through rich multimodal capabilities (such as GUI Agent, Vision) and seamless integration with various real-world tools.
- **\[2025-06-12\]** - 🎁 We are thrilled to announce the release of UI-TARS Desktop v0.2.0! This update introduces two powerful new features: **Remote Computer Operator** and **Remote Browser Operator**—both completely free. No configuration required: simply click to remotely control any computer or browser, and experience a new level of convenience and intelligence.
- **\[2025-04-17\]** - 🎉 We're thrilled to announce the release of new UI-TARS Desktop application v0.1.0, featuring a redesigned Agent UI. The application enhances the computer using experience, introduces new browser operation features, and supports [the advanced UI-TARS-1.5 model](https://seed-tars.com/1.5) for improved performance and precise control.
- **\[2025-02-20\]** - 📦 Introduced [UI TARS SDK](./docs/sdk.md), is a powerful cross-platform toolkit for building GUI automation agents.
- **\[2025-01-23\]** - 🚀 We updated the **[Cloud Deployment](./docs/deployment.md#cloud-deployment)** section in the 中文版: [GUI模型部署教程](https://bytedance.sg.larkoffice.com/docx/TCcudYwyIox5vyxiSDLlgIsTgWf#U94rdCxzBoJMLex38NPlHL21gNb) with new information related to the ModelScope platform. You can now use the ModelScope platform for deployment.
<br>
## Agent TARS
<p>
<a href="https://npmjs.com/package/@agent-tars/cli?activeTab=readme"><img src="https://img.shields.io/npm/v/@agent-tars/cli?style=for-the-badge&colorA=1a1a2e&colorB=3B82F6&logo=npm&logoColor=white" alt="npm version" /></a>
<a href="https://npmcharts.com/compare/@agent-tars/cli?minimal=true"><img src="https://img.shields.io/npm/dm/@agent-tars/cli.svg?style=for-the-badge&colorA=1a1a2e&colorB=0EA5E9&logo=npm&logoColor=white" alt="downloads" /></a>
<a href="https://nodejs.org/en/about/previous-releases"><img src="https://img.shields.io/node/v/@agent-tars/cli.svg?style=for-the-badge&colorA=1a1a2e&colorB=06B6D4&logo=node.js&logoColor=white" alt="node version"></a>
<a href="https://discord.gg/HnKcSBgTVx"><img src="https://img.shields.io/badge/Discord-Join%20Community-5865F2?style=for-the-badge&logo=discord&logoColor=white" alt="Discord Community" /></a>
<a href="https://twitter.com/agent_tars"><img src="https://img.shields.io/badge/Twitter-Follow%20%40agent__tars-1DA1F2?style=for-the-badge&logo=twitter&logoColor=white" alt="Official Twitter" /></a>
<a href="https://applink.larkoffice.com/client/chat/chatter/add_by_link?link_token=deen76f4-ea3c-4964-93a3-78f126f39651"><img src="https://img.shields.io/badge/飞书群-加入交流群-00D4AA?style=for-the-badge&logo=lark&logoColor=white" alt="飞书交流群" /></a>
<a href="https://deepwiki.com/bytedance/UI-TARS-desktop"><img src="https://img.shields.io/badge/DeepWiki-Ask%20AI-8B5CF6?style=for-the-badge&logo=gitbook&logoColor=white" alt="Ask DeepWiki" /></a>
</p>
<b>Agent TARS</b> is a general multimodal AI Agent stack, it brings the power of GUI Agent and Vision into your terminal, computer, browser and product. <br> <br>
It primarily ships with a <a href="https://agent-tars.com/guide/basic/cli.html" target="_blank">CLI</a> and <a href="https://agent-tars.com/guide/basic/web-ui.html" target="_blank">Web UI</a> for usage.
It aims to provide a workflow that is closer to human-like task completion through cutting-edge multimodal LLMs and seamless integration with various real-world <a href="https://agent-tars.com/guide/basic/mcp.html" target="_blank">MCP</a> tools.
### Showcase
```
Please help me book the earliest flight from San Jose to New York on September 1st and the last return flight on September 6th on Priceline
```
https://github.com/user-attachments/assets/772b0eef-aef7-4ab9-8cb0-9611820539d8
<br>
<table>
<thead>
<tr>
<th width="50%" align="center">Booking Hotel</th>
<th width="50%" align="center">Generate Chart with extra MCP Servers</th>
</tr>
</thead>
<tbody>
<tr>
<td align="center">
<video src="https://github.com/user-attachments/assets/c9489936-afdc-4d12-adda-d4b90d2a869d" width="50%"></video>
</td>
<td align="center">
<video src="https://github.com/user-attachments/assets/a9fd72d0-01bb-4233-aa27-ca95194bbce9" width="50%"></video>
</td>
</tr>
<tr>
<td align="left">
<b>Instruction:</b> <i>I am in Los Angeles from September 1st to September 6th, with a budget of $5,000. Please help me book a Ritz-Carlton hotel closest to the airport on booking.com and compile a transportation guide for me</i>
</td>
<td align="left">
<b>Instruction:</b> <i>Draw me a chart of Hangzhou's weather for one month</i>
</td>
</tr>
</tbody>
</table>
For more use cases, please check out [#842](https://github.com/bytedance/UI-TARS-desktop/issues/842).
### Core Features
- 🖱️ **One-Click Out-of-the-box CLI** - Supports both **headful** [Web UI](https://agent-tars.com/guide/basic/web-ui.html) and **headless** [server](https://agent-tars.com/guide/advanced/server.html) [execution](https://agent-tars.com/guide/basic/cli.html).
- 🌐 **Hybrid Browser Agent** - Control browsers using [GUI Agent](https://agent-tars.com/guide/basic/browser.html#visual-grounding), [DOM](https://agent-tars.com/guide/basic/browser.html#dom), or a hybrid strategy.
- 🔄 **Event Stream** - Protocol-driven Event Stream drives [Context Engineering](https://agent-tars.com/beta#context-engineering) and [Agent UI](https://agent-tars.com/blog/2025-06-25-introducing-agent-tars-beta.html#easy-to-build-applications).
- 🧰 **MCP Integration** - The kernel is built on MCP and also supports mounting [MCP Servers](https://agent-tars.com/guide/basic/mcp.html) to connect to real-world tools.
### Quick Start
<img alt="Agent TARS CLI" src="https://agent-tars.com/agent-tars-cli.png">
```bash
# Launch with `npx`.
npx @agent-tars/cli@latest
# Install globally, required Node.js >= 22
npm install @agent-tars/cli@latest -g
# Run with your preferred model provider
agent-tars --provider volcengine --model doubao-1-5-thinking-vision-pro-250428 --apiKey your-api-key
agent-tars --provider anthropic --model claude-3-7-sonnet-latest --apiKey your-api-key
```
Visit the compreTopics
agentagent-tarsbrowser-usecomputer-usecoworkgui-agentgui-operatormcpmcp-servermultimodaltarsui-tarsvisionvlm
Related
More MCP Servers
n8n-io
n8n
✓95
Fair-code workflow automation platform with native AI capabilities. Combine visual building with custom code, self-host or cloud, 400+ integrations.
184k56.8kTypeScript· today
MCP Serversaiapis
open-webui
open-webui
✓89
User-friendly AI Interface (Supports Ollama, OpenAI API, ...)
131.8k18.7kPython· today
MCP Serversaillm
google-gemini
gemini-cli
✓98
An open-source AI agent that brings the power of Gemini directly into your terminal.
101.2k13.1kTypeScript· today
MCP Serversaiai-agents
punkpeye
awesome-mcp-servers
✓87
A collection of MCP servers.
84.8k9.1k· today
MCP Serversaimcp
netdata
netdata
✓97
The fastest path to AI-powered full stack observability, even for lean teams.
78.4k6.4kC· today
MCP Serversaialerting
Mintplex-Labs
anything-llm
✓93
The all-in-one AI productivity accelerator. On device and privacy first with no annoying setup or configuration.
58.3k6.3kJavaScript· today
MCP Serversai-agentscustom-ai-agents