Catch your AI agents when they lie about what they shipped — verifies claims against git instead of believing the agent.
- ✓ Open-source license (MIT)
- ✓ Actively maintained (<30d)
- ✓ Clear description
- ✓ Topics declared
git clone https://github.com/anthony-chaudhary/dos-kernel && cp dos-kernel/*.md ~/.claude/agents/

23 items in this repository
Adjudicate a GitHub issue's "this is resolved" claim from witnesses the claimant didn't author — then close it carrying the evidence, or refuse with the typed gap. Use when an issue looks already-solved, after landing a fix that should have closed one, or to sweep open issues for silently-resolved ones.
Pick the next most important open GitHub issue this agent can actually complete, make its done-condition true, land it with witnesses (suite + parity + commit-audit), and priority-tag every issue touched along the way. Use when asked to "work the backlog", "complete the next most important issue", or to fix a specific issue number end-to-end.
Cut a versioned release of the DOS kernel — bump the version, draft release notes, commit, tag, push to master, and create a GitHub release. The tag push triggers the gated PyPI publish pipeline (publish.yml); the skill surfaces the run and its approval gate.
Promote an already-shipped rolling release (vX.Y.Z) of the DOS kernel to a named stable channel — gated on a green kernel suite + a green third-party CI run on the candidate + a clean truth syscall + a soak window. Writes an evidence file and adds a stable/<codename> git tag on the same commit. Does NOT bump versions or build new artifacts.
One automatic plan-class lifecycle tick. Reads the DECLARED class set + transition list from the workspace `[lifecycle]` table (not a hardcoded taxonomy), evaluates each trigger, spawns a read-only JUDGE-rung adjudicator (the `dos.judges` seam — advisory, fail-to-abstain) to approve/defer each candidate transition, applies the gated transitions as plan-meta edits + one commit per cycle, and logs to the run archive. Failsafes (per-cycle cap, per-plan cooldown, a veto class) are `[lifecycle]` data; the judge content is a host `dos.judges` driver. Every path/class comes from `dos doctor --json`. Use to garden a plan portfolio's lifecycle automatically, judge-gated. The DOS lifecycle gardener (SKP Axis 5, docs/207 Phase 5c).
Run /dos-dispatch on a recurring cadence, alternating with /dos-replan when the backlog drains — the dispatch→replan→dispatch cycle. The continue/stop/next-mode decision is the kernel's typed loop decision, not inline prose: each iteration is classified (`dos gate`) into a verdict and the loop's counters (drained-twice, the unclear/dirty-zero breakers, the iteration cap) drive the next step. Several loops on disjoint lanes run concurrently, each taking its own lane lease via `dos arbitrate`. Driven entirely by `dos` verbs + the workspace's `dos.toml`. The DOS reference loop workflow (SKP Axis 5).
End-to-end plan-and-ship for one lane — snapshot the portfolio with /dos-next-up, take a lane lease via `dos arbitrate` so parallel dispatches don't collide, gate the empty case via `dos gate`, ship the packet, and archive the run under the configured run dir. Driven entirely by `dos` verbs + the workspace's `dos.toml`; names no host path, lane, or commit convention. Use when you want to plan and ship the next batch on one lane in a single command, with concurrency safety. The DOS reference dispatch workflow (SKP Axis 5).
Ground a "keep working until the goal is met" stop condition in a witness the agent did not author, instead of letting the agent self-certify "done". A harness goal/Stop-hook condition is normally checked by the model re-reading its OWN work — consistency, not grounding. This skill turns the operator's goal into checkable EFFECT claims and wires `dos hook stop` so the Stop is refused until git ancestry (a shipped phase) or an effect read-back corroborates the claimed effect. Driven by `dos` verbs and the workspace's own `dos.toml` — no host-specific paths, lanes, or commit conventions. Use when you want a self-stopping agent (or a `/loop` worker) to be unable to declare a goal complete on its own say-so. The single-agent self-stop analogue of `dos-witness-claim`.
Snapshot a repo's phased-plan portfolio and produce a parallel-agent dispatch packet, driven entirely by `dos` verbs and the workspace's own `dos.toml` — no host-specific paths, lanes, or commit conventions. Walks the configured plans glob, audits each candidate pick against `dos verify` for its true shipped/unshipped status, renders a self-contained packet to the configured output dir, and reports a typed gate verdict via `dos gate`. Use when you want a "where are we / what's next / who-does-what" snapshot of any repo that has a few plan docs and real commits. This is the DOS reference workflow (SKP Axis 5); a host may use it, fork it, or ignore it.
The visibility-inverse of lifecycle-demote. Run `dos pickable` over every declared unit; for each HELD unit, surface it with its typed HoldReason and the derived unblock action (DRAFT_CLASS→promote-to-active, UNPARSEABLE→inspect-the-deriver, OPERATOR_GATED→raise-a-decision, SOAK_OPEN→wait, DEPENDENCY_UNMET→ship-the-prerequisite). The only auto-applied action is a safe mechanical reclassify (gated, one commit); everything else is surfaced for a human via `dos decisions`. Every path/lane/class comes from `dos doctor --json`. Use when units are stuck un-pickable and you want each one's typed reason + the right unblock move. The operator-facing half of the shipped `pickable` gate (SKP Axis 5, docs/207 Phase 5b).
Run /dos-replan on a fixed cadence for a bounded number of iterations, then stop — an unattended planning-refresh sweep. A thin recurring wrapper over /dos-replan plus an optional guarded release; the release guard reads the workspace's trunk from config rather than assuming a branch name. Driven by `dos` verbs + the workspace's `dos.toml`. The DOS reference planning-loop workflow (SKP Axis 5).
Garden a repo's plan portfolio from accumulated evidence — detect closures (a queue item whose phases now `dos verify` as shipped is done), track cooldown state, and surface the 0-2 items the operator must actually decide via the `dos decisions` queue. Read-only on code/data; writes only its queue + cooldown state. Driven by `dos` verbs + the workspace's `dos.toml`; names no host path or convention. Use after a burst of dispatches, when the backlog looks drained, or when recurring findings start hurting throughput. The DOS reference planning-sweep workflow (SKP Axis 5).
Run a self-improving work loop where the kernel — not the agent's say-so — decides whether each candidate change actually improved the codebase. The propose→verify→measure→keep-or-revert cycle with one rule no prior auto-improver enforced: a candidate is KEPT only if a witness the candidate's author did not write CONFIRMS it improved (the test suite green on a clean worktree, the truth syscall clean, and a strictly-measured metric gain). Otherwise it is REVERTED. The keep/revert/escalate decision is the kernel's typed `improve` verdict (`dos improve`), not inline prose; a run of non-keeps trips a breaker that ESCALATEs to a human. Each candidate is applied in an ISOLATED git worktree so the kernel adjudicating it is never the kernel being rewritten. Driven entirely by `dos` verbs + the workspace's own `dos.toml` — no host-specific paths, lanes, or commit conventions. The DOS reference recursive-self-improvement loop (SKP Axis 5, docs/280).
One-time check that the DOS kernel plugin is ready to use — confirm the `dos-kernel` Python package is importable (the hooks and MCP server need it), report what the plugin bundled (hooks, the `dos` MCP tools, the generic skill pack), and point at the next skill to run. Use right after installing the dos-kernel plugin, or when `/mcp` shows the `dos` server failing to start or a `dos hook` command erroring. Read-only: it runs `dos doctor` to confirm wiring; it installs nothing and changes no config.
Show what the bundled DOS hook binary has been doing: fold its per-call observation log into an at-a-glance report (how many tool calls it adjudicated, how many it DENIED / WARNED / passed through, which reason classes fired, how often verify-on-stop blocked a false "done", the wait-marker budget, and per-verb latency). Use when you want to see the trust substrate's OWN activity on this project, confirm the native fast-path is actually serving calls (not silently delegating to Python), or check how fast the hooks run. Read-only: it folds a log the hooks already wrote; it takes no lease, launches nothing, and changes nothing.
Keep a target population of worker dispatch-loops alive across a workspace's lane roster — the supervisor cadence (the init/PID-1 analogy for a fleet). Each tick reads the active lane taxonomy from `dos doctor --json`, asks the kernel for a spawn/reap/flag plan via `dos loop --target N --json`, launches one `/dos-dispatch-loop` per SPAWN, scavenges only STALLED leases, and SURFACES (never kills) a SPINNING worker. The spawn/reap/flag decision is the kernel's typed `supervise()` verdict, not inline prose — the supervisor only carries out the plan. Driven entirely by `dos` verbs + the workspace's `dos.toml`. The DOS reference supervisor workflow (SKP Axis 5).
Sweep the run-archive trail of BLOCKED/DRAIN verdicts, normalize each to a canonical cause via the recurring-wedge fold, cluster by recurrence × stall-cost, and propose ONE structural fix per recurring cause — a contract/oracle/preflight change, never a one-off unblock. Read-only on code; surfaces via `dos decisions`. The cause taxonomy is `[reasons]` data; every path/lane comes from `dos doctor --json`. Use when a fleet keeps stalling on the same thing across runs and you want the structural fix, not another manual unblock. The DOS operator remediation sweep (SKP Axis 5, docs/207 Phase 5a).
Route a subagent's actionable claims through the witness rung instead of folding its return string. For any worker whose deliverable is a CHECKABLE EFFECT — a shipped git phase, a created file, a DB row, a sent message — do NOT believe what the worker said it did; extract the claim at the boundary, gather an independently-authored read-back, and fold ONLY the confirmed effects. Driven by `dos` verbs and the workspace's own `dos.toml` — no host-specific paths, lanes, or commit conventions. Use at a `parallel()`/`pipeline()` barrier, a synthesis step, or any fold site where one agent's output becomes another's input. This is the DOS reference pattern for the docs/197 §7(2) witness-routing stage; the seam below is honest about which steps have a CLI verb and which are Python-API-only today.
Before accepting an agent's 'done / shipped / fixed' claim, verify it against ground truth (git ancestry + the commit's own diff) using the DOS kernel's `dos verify` and `dos commit-audit` — never the agent's own narration.
Run an always-on, self-tuning loop that improves DOS's OWN enforcement policy from observed outcomes, closing the 'sound PDP with no PEP feedback' gap. The loop mines the enforcement journal for FALSE-DENIES (a deny the operator later overrode) vs HELD catches, proposes ONE edit to the enforcement policy KNOBS (the confidence-gating thresholds + intervention-ladder ranks, never the enforcement LOGIC), re-scores in an isolated worktree, and KEEPs the edit only if a witness the loop did not author confirms it: the suite green, the truth syscall clean, and a strictly-higher kernel-measured net_task_delta. Otherwise it is REVERTED. The keep/revert/escalate decision is the kernel's typed `improve` verdict via `dos enforce-tune`, not inline prose; a run of non-keeps trips a breaker that ESCALATEs to a human. When run autonomously on a cadence, a KEEP verdict is auto-merged; this is made safe by the non-forgeable keep-bit + a runtime-logic rail that reverts any candidate touching the kernel's adjudication code regardless of its metric. Driven entirely by `dos` verbs + the workspace's own `dos.toml`, with no host-specific paths. The DOS reference enforcement-policy self-tuning loop (docs/365).
Launch N independent headless worker instances in account-balanced waves, each armed with ONE goal whose "done" is gated on a witness the worker did not author — not on its own say-so. Each objective becomes one self-stopping child wired to `dos hook stop` (the `dos-goal-gate` discipline), co-launch safety comes from `dos arbitrate` over each child's file-tree, and every claimed ship is confirmed by `dos verify` / `dos commit-audit`, never by a transcript line. Driven by `dos` verbs and the workspace's own `dos.toml` — names no host path, runtime, model, or account mechanism. Use when an operator hands a context with several independent objectives and says "launch a worker per goal", "fire a wave of goal agents", or "run these N goals in parallel". The fan-OUT analogue of `dos-goal-gate` (one self-stopping leaf) and `dos-witness-claim` (the fold).
Price a PROPOSED multi-agent fan-out BEFORE any worker launches — from the kernel's own agent-blind file-tree geometry, not a self-report. Given the partition a fan-out is about to hand its workers (N agents x declared trees), compute the collision graph, the true collision-free maximum concurrency, the safe set to run now, and the cheapest disjoint re-partition — so a colliding plan is refused with zero agents launched instead of discovered after the Kth acquire already raced. Driven by `dos` verbs and the workspace's own `dos.toml` — no host-specific paths, lanes, or commit conventions. Use before a `dos-goal-fleet` wave, before dispatching a `dos-next-up` packet, or any time an operator says "fan these N agents out over these trees" and you want the price before the launch. The PREDICTIVE complement to `dos arbitrate` (EXAMPLES.md R2), which is the reactive floor that refuses one colliding acquire at a time.
Convert any agent skill into a trust-grounded variant — read the skill, find every place it believes a worker's word (a self-certified "done", an ungrounded "it shipped", a blind file-edit, a filter(Boolean) fan-out fold), and emit an ADDITIVE new-copy variant whose trust seams shell a `dos` verb and read the verdict, plus a re-derivable conversion report. Driven by `dos` verbs and the workspace's own `dos.toml` — no host-specific paths, lanes, or commit conventions. Use when an operator hands you a skill and says "make this DOS-aware", "ground this skill's self-checks", or "what would DOS add to this skill". The converter is itself a judged agent: it abstains on an unwitnessable claim and its output is admitted by witness, never by its own say-so.
Subagents overview
<!-- GENERATED FILE — do not edit README.md directly.
The source of truth is docs/readme/ (one file per section, assembled
in filename order). Edit the part, then run:
python scripts/build_readme.py
tests/test_readme_assembly.py pins this file to the parts. -->
# DOS — the Dispatch Operating System
> ### Catch your AI agents when they lie about what they shipped.
[PyPI](https://pypi.org/project/dos-kernel/)
[CI](https://github.com/anthony-chaudhary/dos-kernel/actions/workflows/ci.yml)
[dos-gate](https://github.com/anthony-chaudhary/dos-kernel/actions/workflows/dos-gate.yml)
[Scoreboard methodology](https://github.com/anthony-chaudhary/dos-kernel/blob/master/docs/scoreboard/methodology.md)
[License](https://github.com/anthony-chaudhary/dos-kernel/blob/master/LICENSE)
> 📊 **See it run on real repos:** the **[scoreboard](https://anthony-chaudhary.github.io/dos-kernel/scoreboard/)**
> scores 15 popular AI-built repos (roborev, open-interpreter, crewAI, autogen, …)
> — how much agents wrote, which ones, and whether each commit's claim is backed
> by its own diff. Score yours: `dos commit-audit --sweep --workspace . BASE..HEAD`.
<p align="center">
<img src="https://raw.githubusercontent.com/anthony-chaudhary/dos-kernel/master/docs/assets/caught-lie-cast.svg" alt="A terminal recording of the caught lie. The agent reports: Done! Shipped the login endpoint (AUTH1) and the password reset (AUTH2). git log shows one commit — AUTH1: ship the login endpoint. dos verify AUTH AUTH1 answers SHIPPED (exit 0); dos verify AUTH AUTH2 answers NOT_SHIPPED via none (exit 1) — caught. The exit code is the verdict: gate the agent's done on it and a false claim cannot land." width="100%">
<br>
<em>The whole pitch in one recording: the agent claims two features shipped; git backs one.
<code>dos verify</code> answers from the commits, the lie exits <code>1</code>, and a gate on that
exit code refuses the false "done". Every line is the real CLI's verbatim output —
<a href="https://github.com/anthony-chaudhary/dos-kernel/blob/master/scripts/build_caught_lie_cast.py"><code>scripts/build_caught_lie_cast.py</code></a> re-records it whenever the output changes.</em>
</p>
<p align="center">
<img src="https://raw.githubusercontent.com/anthony-chaudhary/dos-kernel/master/docs/assets/loop-hero.svg" alt="Two agent fleets side by side. Left, no referee: agents all report 'done!', every report is believed, and silent corruption (lies, collisions, spin) piles up into a codebase that 'sorta works' and can't be changed. Right, DOS adjudicates: dos verify reads git and the run branches to SHIPPED (exit 0, land it) or NOT_SHIPPED (exit 1, re-dispatch — caught), and that verdict steers the next step." width="100%">
<br>
<em>Run a fleet of agents on one repo. The left loop just feels like progress; the right one you can steer.
The only difference is a verdict DOS reads from the real world — here, git — never the agent's word.</em>
</p>
An AI agent will tell you it finished. DOS checks the real world instead of
taking its word — and the nearest piece of the real world is your git history.
An agent says it shipped the login endpoint; did it? Run one command,
`dos verify`, and it answers from the artifacts the work left behind, not from
what the agent typed: a commit backs the claim → `SHIPPED`, exit `0`; nothing
landed → `NOT_SHIPPED`, exit `1`. The agent's story never enters into it. (Git
is just the first witness DOS reads; the file tree, the clock, a CI status, a
test environment's own state are others — anything the agent didn't author.)
```bash
dos verify AUTH AUTH1 # → SHIPPED AUTH AUTH1 e62f74d (exit 0)
dos verify AUTH AUTH2 # → NOT_SHIPPED AUTH AUTH2 (exit 1)
```
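
Because the verdict is carried by the exit code, a gate can be an ordinary shell conditional. The following is a minimal sketch that assumes only the exit-code contract described above (`0` for `SHIPPED`, non-zero otherwise). The `gate` function and its messages are hypothetical names for illustration, not part of the `dos` CLI:

```bash
# Hypothetical wrapper (not a dos command): run any verdict command
# and branch on its exit code instead of on what the agent reported.
gate() {
  if "$@"; then
    echo "land it"        # exit 0: the claim is backed
  else
    echo "re-dispatch"    # non-zero: the claim is not backed
    return 1
  fi
}

# Usage (assumes `dos` is installed and on PATH):
#   gate dos verify AUTH AUTH1
#   gate dos verify AUTH AUTH2
```

The wrapper returns non-zero on a refused claim, so it can itself be chained in a CI step or a hook.
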
That's the smallest version. It scales up, too: point a dozen agents at one
repo — in CI, in a fleet, racing on the same files — and DOS also tells you
which ones are stepping on each other, which one is spinning in circles, and
which claim of "done" is real. Every answer comes from the artifacts (git, the
file tree, the clock), never the narration. It works on a plain `git` repo with
zero config and gets smarter the more you tell it, and the only thing you ever
install is one small Python package.
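
Two agents step on each other when the slices of the file tree they touch overlap. A minimal sketch of that check follows, assuming a lane can be modeled as a single directory prefix. That model is an assumption made for illustration: the real decision belongs to `dos arbitrate`, and `lanes_overlap` is a hypothetical helper, not a `dos` command.

```bash
# Hypothetical helper (not a dos command): two lanes, modeled here as
# directory prefixes, collide when they are equal or one contains the other.
lanes_overlap() {
  # Normalize to exactly one trailing slash so "src/auth" does not
  # falsely match "src/authz".
  local a="${1%/}/" b="${2%/}/"
  case "$a" in "$b"*) return 0 ;; esac   # a is inside (or equal to) b
  case "$b" in "$a"*) return 0 ;; esac   # b is inside a
  return 1
}

# Usage:
#   lanes_overlap src/auth src/auth/reset && echo "collision"
#   lanes_overlap src/auth docs || echo "disjoint"
```
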
> ⚡ **Just add it — two commands, zero decisions.** From the repo where your
> agent works:
>
> ```bash
> pip install dos-kernel
> dos init --hooks auto # finds the agent runtime(s) you already use, wires in the checks
> ```
>
> From then on: your agent can't tell you **"done"** unless the work actually
> landed, two agents can't silently overwrite each other's files, and a run
> that stalls gets flagged instead of quietly spinning. Nothing about your
> workflow changes, and you don't need to learn any of the vocabulary below to
> be covered. It prints the one config file it wrote; deleting the `dos hook`
> entries there undoes it. (No runtime detected? It says so and lists the
> names to pick from — it never guesses.)
<sub>**v0.28.0** · 5,600+ tests · CI: Python 3.11–3.13 on Linux + a Windows 3.13
smoke run · the only runtime dependency is **PyYAML** · **MIT**.</sub>
> 🧭 **Where to go next:** the [why & evidence](https://github.com/anthony-chaudhary/dos-kernel/blob/master/docs/guide/why-a-referee.md) (plain-words story, the 20-lines-of-bash answer, what's proven),
> [wire it into your stack](https://github.com/anthony-chaudhary/dos-kernel/blob/master/docs/guide/wire-it-in.md) (MCP · hooks · install), the
> [syscall + CLI reference](https://github.com/anthony-chaudhary/dos-kernel/blob/master/docs/guide/cli-reference.md), or, **reading this as an AI agent?**, [AGENTS.md](https://github.com/anthony-chaudhary/dos-kernel/blob/master/AGENTS.md) — build/test/check in three lines. The full map is the router just below.
> 🔤 **Five words the rest of this page leans on.** A **plan** is a named goal
> (`AUTH`); a **phase** is one shippable step of it (`AUTH1`); a **lane** is the
> slice of the file tree one agent may touch; the **oracle** is the part of DOS
> that reads the evidence and rules; a **stamp** is the mark a shipped phase
> leaves in a commit subject (`AUTH1: …`) — the thing the oracle greps for.
> That's the whole vocabulary.
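
To make the stamp idea concrete, here is a rough sketch of grepping commit subjects for a phase stamp. It is an illustration only and not how `dos verify` is implemented; `has_stamp` is a hypothetical helper that reads commit subjects on stdin and assumes the phase name contains no regex metacharacters.

```bash
# Hypothetical helper (not a dos command): succeeds if any commit subject
# read from stdin starts with "<PHASE>:", for example "AUTH1: ...".
has_stamp() {
  grep -q -- "^$1:"
}

# Usage inside a git repo (%s prints each commit's subject line):
#   git log --format=%s | has_stamp AUTH1 && echo "stamp found"
```

Anchoring the match at the start of the subject keeps a passing mention such as `wip AUTH1: later` from counting as a stamp.
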
<a id="who-this-is-for"></a>
<a id="the-plain-words-version"></a>
## In plain words
A coding agent does work, then tells you how it went. Usually the story is true;
sometimes it's the cheerful *"all work completed!"* from a worker that shipped
nothing. With one agent you catch that yourself by re-reading its output — a real
tax you already pay. Run twenty at once and that tax stops being payable: nobody
reads everything, each worker grades its own homework, and the unchecked problems
pile up quietly until the codebase *sorta* works and nobody can safely change it.
DOS is the referee that never reads the story — it reads what happened (the
commit, the file, the clock) and hands you a verdict no narration can move. It
costs about an afternoon, has one runtime dependency, and stays in its lane: it
tells you *what happened*, never whether the code is *good* — quality stays with
your tests and reviews. ([The full plain-words version](https://github.com/anthony-chaudhary/dos-kernel/blob/master/docs/guide/why-a-referee.md#the-plain-words-version).)
## Measured, not asserted
Every number here is scored against a fact the agent can't fake (a test
environment's DB state, git history). A DOS gate caught **15 "I shipped it" lies
in 258 tasks across two models with zero false alarms**; the same referee stopped
**6 of 8** silent collisions on one shared record; quitting doomed runs at the
right moment saved **~11% of fleet compute with 0 of 1,634 winners wrongly
killed**; and the reward-set admission label lifted acceptance precision **60% →
100%** by purging poison a self-graded collector keeps. The methodology, the two
money-moment figures, and the projected-vs-bet honesty gradient are in
**[what's proven and what's still a bet](https://github.com/anthony-chaudhary/dos-kernel/blob/master/docs/guide/why-a-referee.md#whats-proven-and-whats-still-a-bet)**.
## Where the rest of the docs are
This page keeps the hook, the demo, and the failure it fixes. Everything deeper
lives on a focused page — find the question you arrived with and jump:
| You're asking… | Go to |
|---|---|
| *"What is this in plain words, and why should my team care? Is it real?"* | [Why a referee](https://github.com/anthony-chaudhary/dos-kernel/blob/master/docs/guide/why-a-referee.md) — the plain-words story, the 20-lines-of-bash / Temporal answers, and the full proven/bet evidence |
| *"Show me it working, fast."* | [Try it in 60 seconds](#try-it-in-60-seconds), just below — one command |
| *"I already run agents — how do I wire the verdict into **my** stack?"* | [Wire it in](https://github.com/anthony-chaudhary/dos-kernel/blob/master/docs/guide/wire-it-in.md) — MCP, runtime hooks, the exit-code tier, fleet frameworks, and the install matrix |
| *"What's the full command / syscall surface?"* | [The syscall ABI & CLI reference](https://github.com/anthony-chaudhary/dos-kernel/blob/master/docs/guide/cli-reference.md) — every verb, the three live screens, the verdict journal |
| *"I run a fleet every day: how do I watch it, triage it, debug it?"* | [Operating a fleet](https://github.com/anthony-chaudhary/dos-kernel/blob/master/docs/guide/operating-a-fleet.md) + [D

What people ask about dos-kernel

What is anthony-chaudhary/dos-kernel?
anthony-chaudhary/dos-kernel is a set of subagents for the Claude AI ecosystem. Catch your AI agents when they lie about what they shipped: it verifies claims against git instead of believing the agent. It has 8 stars on GitHub and was last updated today.

How do I install dos-kernel?
You can install dos-kernel by cloning the repository (https://github.com/anthony-chaudhary/dos-kernel) or by following the instructions in the README on GitHub. ClaudeWave also offers quick-install blocks on this same page.

Is anthony-chaudhary/dos-kernel safe to use?
Our security agent has analyzed anthony-chaudhary/dos-kernel and assigned it a Trust Score of 87/100 (tier: Trusted). See the full breakdown of passed checks and flags on this page.

Who maintains anthony-chaudhary/dos-kernel?
anthony-chaudhary/dos-kernel is maintained by anthony-chaudhary. The latest activity recorded on GitHub is from today, with 102 open issues.

Are there alternatives to dos-kernel?
Yes. On ClaudeWave you can explore similar subagents at /categories/agents, sorted by popularity or recent activity.