Why HTML Could Be Better Than Markdown as Claude Output

For years, asking for responses in Markdown has been the reflexive habit of anyone working with LLMs. It made historical sense: when GPT-4 operated with 8,192-token windows, Markdown's efficiency over HTML made a real difference in cost and what fit in each call. That context no longer exists, and an article published this week makes the case with enough weight to deserve attention.

Thariq Shihipar, an engineer on Claude Code at Anthropic, published on May 8 "The Unreasonable Effectiveness of HTML", a piece advocating for HTML as the preferred output format when working with Claude. Simon Willison picked it up on his blog, admitting it made him reconsider habits from the GPT-4 era.

The central argument

Shihipar's thesis isn't that Markdown is bad. It's that with context windows of 1M tokens, like those offered by Claude Opus 4.7, the efficiency argument that justified Markdown has evaporated. HTML, by contrast, enables presentation structures that Markdown simply cannot express: inline annotations in diffs, severity-coded highlighting, tables with real semantics, visual design embedded in the artifact itself.

The example prompt circulating in the article illustrates the kind of output it enables:

> Help me review this PR by creating an HTML artifact that describes it. I'm not very familiar with the streaming/backpressure logic so focus on that. Render the actual diff with inline margin annotations, color-code findings by severity and whatever else might be needed to convey the concept well.

That's something a Markdown block with lists and asterisks simply cannot reproduce with fidelity. The resulting HTML is, literally, a navigable review document, not plain text with some formatting.

Why it matters in the Claude Code context

The detail that Shihipar works on the Claude Code team is significant. Claude Code already supports HTML artifacts as native output, and the CLI can render them or pass them as input to other steps in a pipeline. Combining this with sub-agents that generate intermediate HTML reports, or hooks that process those artifacts on PostToolUse events, opens workflows that would be awkward or outright impossible with Markdown.

The examples collected on the article's examples site show concrete cases: from embedded data visualizations to code review interfaces with interactive filters. These aren't lab demos; they're the kind of artifact an engineering team could use directly in their review workflow.

For whom this changes things

The shift in thinking is more relevant for certain profiles than others:

Engineers using Claude Code to generate reports, PR reviews, or log analysis: HTML gives them control over presentation without post-processing plain text.
Teams building pipelines with sub-agents: an HTML artifact is a richer object that can be routed, stored, and displayed without additional transformation.
Developers of skills and plugins: if a skill's output is HTML, the skill's consumer gets something immediately usable.

For those using Claude conversationally in the web chat, the impact is smaller, though Claude.ai already renders HTML in artifacts natively.

The habit that's hard to break

Willison is honest about it: spending years asking for Markdown by default creates inertia. Not because Markdown is the best option, but because it's the familiar option. Shihipar's article serves as a reminder that the assumptions that formed those habits, short contexts, expensive tokens, models generating verbose and fragile HTML, no longer hold true.

Current models generate structurally sound HTML with enough consistency to rely on it as an output format. And current context windows make the token overhead, in most cases, irrelevant.

---

From ClaudeWave, this seems like a useful and well-argued reframing. It's not an absolute recommendation, there are contexts where Markdown remains the right choice, but the question "why not HTML?" deserves to be asked more often than it is.

Why HTML Could Be Better Than Markdown as Claude Output

The central argument

Why it matters in the Claude Code context

For whom this changes things

The habit that's hard to break

Sources

Read next

Reinventing the Wheel Makes More Sense Than It Seems

Claude usage limits push users toward cheaper Chinese alternatives

I Will Never Use AI to Code: Why This Argument Still Matters