subagent
Run a one-shot or flat parallel batch of provider LLM subagents (headless `cc-fleet subagent`) that return a result. Use when fanning out N independent tasks, doing bulk per-file work, or calling a specialized provider model (DeepSeek / GLM / Kimi / Qwen / MiniMax). NOT a long-lived collaborator you message back and forth (that is /cc-fleet:team); NOT a multi-phase pipeline with dependencies or resume (that is /cc-fleet:workflow); NOT trivial work the main session should just do.
git clone --depth 1 https://github.com/ethanhq/cc-fleet /tmp/subagent && cp -r /tmp/subagent/skills/subagent ~/.claude/skills/subagentSKILL.md
# subagent — one-shot / batch / background provider subagent **Wrong lane?** A long-lived collaborator you message back and forth → /cc-fleet:team; a multi-phase pipeline with dependencies or resume → /cc-fleet:workflow; arbitration in cc-fleet-shared/routing.md. When this skill cites `cc-fleet-shared/<file>.md`, OPEN it with the Read tool at `../cc-fleet-shared/<file>.md` relative to this SKILL.md — the cited content is load-bearing, not optional background. `cc-fleet subagent` runs a provider model headless and returns the result directly on Bash stdout — **no** tmux pane, **no** `TeamCreate`/`SendMessage`/`TeamDelete`. The analog of the native `Agent`/`Task` tool, but the model can be a provider id. It reuses the same provider selection and the same fingerprint self-heal flow as spawn (cc-fleet-shared/troubleshooting.md). It's the lightweight synchronous branch. ## When to use it - **One-shot research / analysis / judgement** — you want an answer, not a long-lived colleague. - **Batch fan-out** — N independent tasks in parallel. subagent is **lock-free**, so N calls don't serialize behind a server lock the way N spawns do — true parallelism. - **Cost-bounded probes** — `--timeout` caps wall-clock; the return value carries `usage` / `total_cost_usd`; it stops when done (nothing to forget to tear down). ## The provider ask ladder (ask at most once per task) 1. The user named a provider or model → use it. 2. Else run `cc-fleet default --json`: if it returns a provider (source "configured" or "auto"), use it and STATE it in your kickoff line (e.g. "using glm (default)"). 3. Else (several providers, none default) ask the user ONCE which to use — list the enabled providers from `cc-fleet list --json` (name + default_model + the one-line note in cc-fleet-shared/providers.md). After they pick, run `cc-fleet default <chosen>` so you never ask again. (`cc-fleet default <p>` is user-layer; only run it to FILL a blank default, never with --force.) 4. A mid-task provider failure (insufficient balance / rate limit / auth) → STOP, tell the user what happened, propose the next provider, and WAIT for their confirmation. Never switch providers silently. Model tier within a provider: fan-out / leaf work → omit `--model` (or `--model fast`); judge / synthesis / sustained work → `--model strong`. The provider's roster decides the actual model — see cc-fleet-shared/providers.md. **The reserved native leaf — `claude`.** `cc-fleet subagent claude …` runs the official `claude` CLI on the user's OWN Claude Code login (their subscription OAuth or whatever they're logged in as) — no providers.toml row, no profile, no key material. A NAMED deliberate choice, never the auto-default (`cc-fleet default claude` errors; it never auto-resolves). `--model` takes a literal id (`opus` / `sonnet` / a full id) — the slot keywords `default`/`strong`/`fast` are rejected (no roster); omitted = claude's login default, which is typically the most expensive tier, so naming a model is usually wise. Profiles apply unchanged. The leaf spends the **lead session's own subscription window** — use it for one or two synthesis / judgement nodes, never a wide fan-out. ```bash cc-fleet subagent claude --model opus --prompt "<synthesis over the gathered notes>" --json ``` ## Calling it (run via Bash, always with `--json`) ```bash cc-fleet subagent --prompt "<task>" --json # no provider arg → default provider cc-fleet subagent deepseek --model strong \ --prompt "Analyze the worst-case complexity of quicksort in src/sort.go; give a triggering input" --json ``` The provider arg is **optional** — with no provider, cc-fleet uses the default. `NO_DEFAULT_PROVIDER` / `DEFAULT_PROVIDER_DISABLED` / `DEFAULT_PROVIDER_UNKNOWN` in the failure envelope mean there is no usable default — apply the provider ask ladder above. **Session grouping — no flag needed by default.** `cc-fleet subagent` auto-detects the parent Claude session (fail-closed: an unvalidatable registry shows `(no session)`, never a guess). The one exception: inside a known team, pass `--lead-session-id` (the team's `leadSessionId` from `~/.claude/teams/<team>/config.json`) to force the job under that team's session — an explicit flag always wins. ```bash # Optional explicit override, with a known team: lead_session_id=$(jq -r '.leadSessionId // empty' "$HOME/.claude/teams/<team>/config.json") cc-fleet subagent --prompt "..." --lead-session-id "$lead_session_id" --json ``` Useful flags (full list in cc-fleet-shared/cli-reference.md): - **Name it** → `--label "<short-alias>"` (e.g. `--label sort-complexity`). The Agents Board shows the label instead of the opaque job id — pass one on every launch, like a teammate name. Display-only metadata; capped at 256 bytes. - **Large / sensitive prompt** → `--prompt-file <path>` (read from file, piped via stdin, kept out of argv / `ps`). Use it once a single prompt approaches **~128 KiB** (`MAX_ARG_STRLEN`, the per-argument cap — not the ~2 MB total `ARG_MAX`). `--prompt-file -` reads stdin. - **Long task** → `--timeout 600s` (default 300s). For tasks that may exceed the timeout, run the sync call in a backgrounded Bash, or use `--background` (both below). Note: a provider that's down on **auth (401) or quota (429)** makes claude retry **~180s** before surfacing `KEY_INVALID` / `INSUFFICIENT_BALANCE`, so keep `--timeout ≥ ~200s` (the 300s default is fine) — a shorter timeout reports those as `SUBAGENT_TIMEOUT` instead. `--probe` does **not** catch a bad key (the models endpoint may not 401 it). - **Cost / runaway gates** → `--max-budget-usd 0.5` (cap spend) and `--max-turns 8` (cap the agentic tool loop). On fan-out, strongly consider passing these on every call. - **Prompt profile** — `slim` is the DEFAULT; read-only research → `--profile slim-ro`; `--profile full` ONLY to compare against a full session or diagnose a suspected slim regression. `--tools` REPLACES the whole tool set, never appends. Write prescriptive prom
Use to watch the whole cc-fleet fleet live from inside this session — it surfaces a running fleet in the agent panel and streams the status board (provider teammates, one-shot subagent jobs, and workflow runs) until interrupted.
Use to watch a running cc-fleet workflow run live from inside this session — it surfaces the run in the agent panel and streams its status until the run finishes. Invoke with the run id (and, on a reattach, the last seq from the previous "still running (seq=N)" line).
Run cc-fleet's setup/health diagnostics and explain any failures (read-only)
List cc-fleet provider teammates, async jobs, and pane health (read-only)
Spawn long-lived provider LLM teammates in tmux panes that you message via the native agent-team tools — multi-turn, collaborative, watchable. Use for sustained parallel build/work ("spawn workers", N teammates on N files), or when you need a collaborator you message across turns. NOT a fire-and-forget one-shot or a flat batch of independent prompts — that is /cc-fleet:subagent. NOT a scripted multi-phase run — that is /cc-fleet:workflow.
Orchestrate a MULTI-PHASE, dependent, or resumable run over many provider subagents from a JS script, off the main context (`cc-fleet workflow`). Use for fan-out→barrier→synthesis, per-item pipelines, loop-until-dry, or a run that must survive a kill and `--resume` from its journal. NOT a flat fan-out of independent tasks (that is /cc-fleet:subagent — cheaper, no script); NOT interactive collaboration (that is /cc-fleet:team); NOT trivial single-shot work for the main session.