Loom: An open-source harness for orchestrating code agents

One of the most concrete challenges when working with code agents isn't the model itself, but the infrastructure surrounding it: how a task is launched, how results are collected, how failures are managed. This week Loom, an open-source project from valkor-ai, appeared on Hacker News. It is designed to address exactly that layer.

The proposition is straightforward: a delivery harness, a wrapper around the coding agent that handles everything before and after the model acts. According to the repository, Loom is not coupled to any specific provider, which in practice means it can be used with Claude Code as well as other agent tools.

What it actually does

The concept of a harness is not new in software engineering (it has been used for decades in testing), and it carries over naturally to code agents. Loom defines a layer that manages:

Task delivery: how to specify what the agent should do, in which repository, and under what constraints.
Execution lifecycle: startup, monitoring, and artifact collection (diffs, logs, tests).
Isolation: each task runs in its own environment, preventing one agent's side effects from contaminating the next.
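
To make these three responsibilities concrete, here is a minimal Python sketch of what such a layer could look like. It is not Loom's API: the names `TaskSpec`, `TaskResult`, `run_task`, and `seed_files` are hypothetical, the agent is stood in for by an arbitrary command, and isolation is approximated with a throwaway temporary directory rather than a real sandbox.

```python
from __future__ import annotations

import subprocess
import sys
import tempfile
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class TaskSpec:
    """Hypothetical task delivery format: what to run and under which limits."""
    name: str
    command: list[str]                    # stand-in for the agent invocation
    timeout_s: float = 30.0               # constraint: wall-clock budget
    seed_files: dict[str, str] = field(default_factory=dict)  # initial workspace


@dataclass
class TaskResult:
    name: str
    exit_code: int | None                 # None means the task timed out
    stdout: str
    stderr: str
    artifacts: dict[str, str]             # relative path -> content after the run


def run_task(spec: TaskSpec) -> TaskResult:
    # Isolation: each task gets a fresh directory that is deleted afterwards.
    with tempfile.TemporaryDirectory(prefix=f"task-{spec.name}-") as tmp:
        workspace = Path(tmp)
        for rel_path, content in spec.seed_files.items():
            target = workspace / rel_path
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_text(content)

        # Execution lifecycle: start the command, wait for it, capture its logs.
        try:
            proc = subprocess.run(
                spec.command,
                cwd=workspace,
                capture_output=True,
                text=True,
                timeout=spec.timeout_s,
            )
            exit_code, stdout, stderr = proc.returncode, proc.stdout, proc.stderr
        except subprocess.TimeoutExpired:
            exit_code, stdout, stderr = None, "", f"timed out after {spec.timeout_s}s"

        # Artifact collection: snapshot every file before the workspace vanishes.
        artifacts = {
            path.relative_to(workspace).as_posix(): path.read_text(errors="replace")
            for path in sorted(workspace.rglob("*"))
            if path.is_file()
        }
    return TaskResult(spec.name, exit_code, stdout, stderr, artifacts)


# Usage: the "agent" here is just a Python one-liner that writes a file.
demo = TaskSpec(
    name="demo",
    command=[sys.executable, "-c", "open('out.txt', 'w').write('done')"],
    seed_files={"README.md": "hello"},
)
result = run_task(demo)
print(result.exit_code, sorted(result.artifacts))
```

A real harness would likely swap the temporary directory for a container or a fresh repository checkout, but the split between specification, execution, and collection stays the same.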

In terms of the Claude ecosystem, this fits naturally with Claude Code's subagent architecture and hooks. The hooks (`PreToolUse`, `PostToolUse`, `Stop`) already allow you to intercept a session's lifecycle, but Loom proposes a layer above that: not just intercepting events, but orchestrating multiple executions in a reproducible way.
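
For contrast, this is roughly what the hook level looks like. The sketch below is a `PreToolUse`-style command hook written under assumptions that should be checked against the current Claude Code documentation: that the hook receives a JSON payload on stdin with `tool_name` and `tool_input` fields, and that exit code 2 blocks the tool call. The policy itself (`BLOCKED_SUBSTRINGS`) and the helper names are hypothetical.

```python
import json
import sys

# Hypothetical policy: shell fragments this hook refuses to let through.
# Naive substring matching, good enough for an illustration only.
BLOCKED_SUBSTRINGS = ("git push --force", "sudo ")


def decide(payload: dict) -> tuple[int, str]:
    """Map one hook payload to (exit_code, message)."""
    # Assumed payload shape: {"tool_name": ..., "tool_input": {"command": ...}}
    if payload.get("tool_name") != "Bash":
        return 0, ""
    command = payload.get("tool_input", {}).get("command", "")
    for needle in BLOCKED_SUBSTRINGS:
        if needle in command:
            # Assumed convention: exit code 2 blocks the call.
            return 2, f"blocked by policy: {needle!r}"
    return 0, ""


def main(stream=sys.stdin) -> int:
    """Read one payload from the stream and report the decision on stderr."""
    code, message = decide(json.load(stream))
    if message:
        print(message, file=sys.stderr)
    return code
```

Registered as a hook command, the script would end with `sys.exit(main())`. Note what it does not do: it sees one event in one session. Deciding which tasks to run, in which environments, and how to compare their results is the layer a harness adds on top.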

Why this approach matters

The dominant trend over the past year has been adding capabilities to the model — wider context windows, better reasoning, more sophisticated MCP server integration. Loom points in another direction: improving delivery infrastructure without touching the model.

This has practical implications for teams already running agents semi-automatically. When a code agent fails midway through a task, the usual question is: did the model fail, did the environment fail, or did the way the task was delivered fail? A standardized harness makes that question easier to answer because it separates each responsibility.
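
One way to picture that separation is a runner that executes delivery, environment setup, and the agent as distinct stages and records which one failed. This is a sketch of the idea, not Loom's implementation; `Stage`, `Outcome`, and `run_staged` are hypothetical names.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Stage(Enum):
    DELIVERY = "delivery"        # was the task specification valid and accepted?
    ENVIRONMENT = "environment"  # could the workspace be prepared?
    AGENT = "agent"              # did the agent run itself succeed?


@dataclass
class Outcome:
    ok: bool
    failed_stage: Stage | None = None
    detail: str = ""


def run_staged(
    deliver: Callable[[], None],
    prepare_env: Callable[[], None],
    run_agent: Callable[[], None],
) -> Outcome:
    """Run the stages in order and attribute the first failure to its stage."""
    stages = [
        (Stage.DELIVERY, deliver),
        (Stage.ENVIRONMENT, prepare_env),
        (Stage.AGENT, run_agent),
    ]
    for stage, step in stages:
        try:
            step()
        except Exception as exc:
            return Outcome(False, stage, f"{type(exc).__name__}: {exc}")
    return Outcome(True)


# Usage: a missing repository is reported as an environment failure,
# not as a failure of the model.
def ok() -> None:
    pass


def missing_repo() -> None:
    raise FileNotFoundError("repo missing")


print(run_staged(ok, missing_repo, ok))
```

With failures tagged this way, an aggregate over many runs can show whether a drop in success rate tracks the agent or the plumbing around it.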

It's also relevant for those building CI/CD pipelines augmented with agents. Instead of embedding orchestration logic inside the prompt itself or using ad hoc scripts, Loom offers a reusable structure.

Who it's useful for right now

The project has only a few points on HN and no comments at the time of writing, which suggests it is at a very early stage. The code in the repository is functional, but documentation is sparse. That said, the intended audience is clear:

Engineering teams already using Claude Code or equivalent agents in real workflows and needing reproducibility.
Evaluation researchers who want to run the same agent across multiple tasks in a controlled manner.
MCP integrators building servers or plugins who need to test them against real agents without spinning up infrastructure from scratch each time.

It's not aimed at the individual user experimenting with Claude locally; it's built for those with agents in production or under systematic evaluation.

A piece of the puzzle, not the whole puzzle

It's tempting to read projects like this as a complete solution to agent complexity. It's not, nor does it claim to be. Loom doesn't solve agent alignment, it doesn't improve the quality of the diffs the model produces, and it doesn't substitute for a good prompting strategy. What it does do, if it delivers on its promise, is reduce infrastructure noise so the remaining problems are the ones that actually matter.

The discussion on Hacker News is yet to unfold, making it difficult to gauge community reception. We'll keep watching the project.

---

We welcome the emergence of model-agnostic infrastructure layers; the current fragmentation of agent tooling means every team reinvents the same harness. If Loom gains traction and documents its interfaces well, it could become a reference component, though the road from "Show HN" to established tool is long.
