For the past two years, the technology industry has raced to make AI agents more capable — teaching them to write code, navigate software interfaces, manage files, and orchestrate multi-step workflows with increasing autonomy. What the industry has not done, at least not with any consistency, is answer the question that keeps chief information security officers awake at night: what happens when an agent goes wrong?
On Tuesday at its annual Build developer conference, Microsoft offered what may become the definitive answer. The company introduced Microsoft Execution Containers, or MXC — a policy-driven execution layer, built into the Windows operating system itself, that lets developers and IT administrators declare exactly what an AI agent can and cannot access, with those boundaries enforced at runtime by the OS kernel.
The announcement, buried within a sweeping set of developer-focused updates, is arguably the most consequential platform move Microsoft made at Build this year, and it has the potential to reshape how every enterprise on Earth thinks about deploying autonomous AI software.
MXC is not a product you buy. It is an SDK and a policy model — a foundational primitive embedded in Windows and the Windows Subsystem for Linux — that provides what Microsoft calls a “composable sandbox spectrum.” That spectrum ranges from lightweight process isolation, already adopted by GitHub Copilot’s command-line interface, all the way up to micro-virtual machines, Linux containers, and full cloud instances running on Windows 365.
The system separates an agent’s execution from the user’s desktop, clipboard, user interface, and input devices. Critically, it binds every agent to a strong identity — either a local ID or a cloud-provisioned identity backed by Microsoft Entra — so that every action the agent takes can be attributed, audited, and governed.
The implications are enormous. Until now, the enterprise deployment of AI agents has been stuck in a paradox: the more autonomous and useful an agent becomes, the more dangerous it is to let it operate on a corporate network without guardrails. MXC is Microsoft’s attempt to break that paradox — not by making agents less capable, but by making the environment they operate in fundamentally more controlled.
Why every autonomous AI agent is a security incident waiting to happen
To understand why MXC matters, consider what an AI agent actually does when it runs on your computer. Unlike a traditional application, which operates within well-understood boundaries — a word processor reads and writes documents, a browser fetches web pages — an AI agent is, by design, unpredictable. It receives a goal in natural language, reasons about how to achieve it, and then takes actions: opening files, executing code, calling APIs, browsing the web, interacting with other software. Each of those interactions creates what security professionals call “attack surface.”
Microsoft’s own blog post framed the challenge in stark terms. The company wrote that “as agents become more capable and autonomous, they’re delivering material productivity gains. But they’re also introducing new risk, and the issue isn’t just the agent. It’s the entire system the agent operates across.” Every interaction between agents and humans, tools, applications, models, and other agents “exposes new attack surface and introduces different failure modes.” Microsoft characterized this as “a multi-layer systems problem.”
This is not a theoretical concern. In the months leading up to Build, security researchers demonstrated numerous ways that AI agents could be manipulated — through prompt injection, through malicious tool calls, through data exfiltration disguised as normal workflow. For enterprises that handle sensitive data, proprietary models, and regulated information, the absence of a trusted execution environment has been the single biggest barrier to moving agents from demo to deployment.
Microsoft’s answer is a sandbox that scales from a single process to a full virtual machine
MXC operates on a deceptively simple principle: declare what the agent can do before it runs, and let the operating system enforce those declarations at runtime. A developer or an IT administrator writes a policy that specifies which files, directories, and network resources an agent is allowed to access. MXC then creates a contained execution environment — a sandbox — that enforces those boundaries regardless of what the agent attempts to do.
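To make the declare-then-enforce idea concrete, here is a minimal Python sketch of a default-deny policy check. It is illustrative only: the article does not describe the MXC policy schema, so the `AgentPolicy` class, its fields, and the example paths and host are all hypothetical, and a real sandbox would enforce such rules in the operating system rather than in application code.

```python
import posixpath
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentPolicy:
    """Hypothetical policy shape for illustration; not the real MXC schema."""
    allowed_dirs: tuple[str, ...] = ()
    allowed_hosts: tuple[str, ...] = ()

    def allows_path(self, path: str) -> bool:
        # Normalize first so "../" segments cannot escape an allowed directory.
        target = posixpath.normpath(path)
        for root in self.allowed_dirs:
            root = posixpath.normpath(root)
            if target == root or target.startswith(root.rstrip("/") + "/"):
                return True
        return False  # Default deny: anything not declared is refused.

    def allows_host(self, host: str) -> bool:
        # Host names are case-insensitive, so compare in lowercase.
        return host.lower() in self.allowed_hosts


# Declared before the agent runs (example values are made up).
policy = AgentPolicy(
    allowed_dirs=("/home/dev/project",),
    allowed_hosts=("api.example.com",),
)
```

With this declaration, `policy.allows_path("/home/dev/project/src/main.py")` is permitted, while a traversal attempt such as `/home/dev/project/../.ssh/id_rsa` is refused because it normalizes to a path outside the declared directory.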
What makes MXC unusual, and potentially very powerful, is the breadth of its isolation options. Microsoft designed the system so that a single SDK and policy model can map to the appropriate isolation construct for any given workload. For a lightweight coding assistant that just needs to read the current project directory, fast process isolation may be sufficient. For an autonomous agent that executes arbitrary code downloaded from the internet, a full micro-VM may be required. The system is designed to be “dynamically composable based on intent and risk,” meaning that the level of isolation can be adjusted based on what the agent is actually doing, not just what category it falls into.
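One way to picture isolation that follows risk rather than category is a small selection function. The Python sketch below is an assumption-laden illustration, not Microsoft's logic: the `Isolation` tiers borrow two names from the article, but their ordering, the `Workload` field, and the rule that a session never drops to a weaker tier are all invented for the example.

```python
from dataclasses import dataclass
from enum import IntEnum


class Isolation(IntEnum):
    """Two tiers named in the article; the numeric ordering is an assumption."""
    PROCESS = 1
    MICRO_VM = 2


@dataclass(frozen=True)
class Workload:
    # Hypothetical risk signal; not a real MXC field.
    runs_downloaded_code: bool


def choose_isolation(workload: Workload) -> Isolation:
    """Pick the weakest tier that still covers the workload's risk."""
    if workload.runs_downloaded_code:
        return Isolation.MICRO_VM
    return Isolation.PROCESS


def escalate(current: Isolation, workload: Workload) -> Isolation:
    """Raise isolation when behavior gets riskier; never lower it mid-session."""
    return max(current, choose_isolation(workload))
```

A coding assistant that only reads the project directory would start at `Isolation.PROCESS`; if it later began executing downloaded code, `escalate` would move it to `Isolation.MICRO_VM` and keep it there.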
Session isolation is a particularly important feature. MXC separates the agent’s execution from the user’s desktop, clipboard, UI, and input devices. This directly mitigates several classes of attacks that security researchers have identified as particularly dangerous for AI agents: UI spoofing, where an agent manipulates what the user sees to trick them into approving a malicious action; input injection, where an agent sends keystrokes or mouse clicks to other applications; and cross-session data leakage, where information from one user’s session bleeds into another.
A live demo showed an AI agent trying to delete files — and failing, because the OS wouldn’t let it
During a pre-briefing with VentureBeat the night before the announcement, a Microsoft developer offered a vivid demonstration of the technology in action. He had set up the open-source agent framework OpenClaw running inside MXC’s sandbox on his personal development machine. He then instructed the agent to delete all the files on his desktop. The agent attempted to comply — but the sandbox prevented it. “If you look at my desktop here, you see how clean my desktop is,” the developer said during the demo. “That’s a lie.” The files, he explained, were completely safe because “the container won’t allow it.”
The demonstration went further, showcasing the granularity of MXC’s controls. Users can mark specific files as read-only for the agent, restrict access to the browser and screen capture, control whether the agent can see location data, and have all of those permissions managed centrally by an enterprise IT department through Intune policies. The agent operates inside what is effectively a one-way mirror: it can do the work it has been asked to do, but it cannot see or touch anything outside the boundaries that its policy defines.
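The desktop demo can be approximated with per-location access flags. The following Python sketch is hypothetical throughout: the `Access` flags, the `GRANTS` table, and the paths are invented for illustration and do not reflect how MXC or Intune actually represent permissions.

```python
import posixpath
from enum import Flag, auto


class Access(Flag):
    NONE = 0
    READ = auto()
    WRITE = auto()
    DELETE = auto()


# Hypothetical grants: the desktop is read-only, the project is fully writable.
GRANTS = {
    "/home/dev/Desktop": Access.READ,
    "/home/dev/project": Access.READ | Access.WRITE | Access.DELETE,
}


def granted(path: str) -> Access:
    """Return the access flags for a path; the longest matching root wins."""
    target = posixpath.normpath(path)
    for root in sorted(GRANTS, key=len, reverse=True):
        if target == root or target.startswith(root + "/"):
            return GRANTS[root]
    return Access.NONE  # Nothing declared means nothing allowed.


def is_permitted(path: str, wanted: Access) -> bool:
    """True only if every requested flag is present in the grant."""
    return wanted in granted(path)
```

Under these grants, an agent asked to clear the desktop could read `/home/dev/Desktop/notes.txt` but a delete request on the same file would be refused, which mirrors the outcome described in the demo.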
Pavan Davuluri, Microsoft’s Executive Vice President for Windows and Devices, underscored during the pre-briefing that the primitives MXC introduces — security, containment, isolation, and user control — are essential to making AI agents commercially viable.
He emphasized that these capabilities are “not unique to OpenClaw” and that “this pattern repeats itself over and over” for any agent running on a Windows device. The primitives that exist in the operating system now “for the file around security, containment, isolating them, having users in control,” he said, are what will make agents safe enough for ordinary consumers and corporate deployments alike.
Defender, Entra, Intune, and Purview integration arriving in July turns MXC into an enterprise control plane
For corporate IT departments, the most significant element of the MXC announcement is not the SDK itself but its integration with Microsoft’s existing enterprise security stack through what the company calls Agent 365. Arriving in preview in July, Agent 365 layers Microsoft’s Entra identity service and Intune device management platform on top of MXC, so that IT administrators can govern agent containment centrally while developers choose the level of isolation their workload demands.
Within that stack, each service has a defined role: Microsoft Defender will provide runtime threat protection, Entra will handle identity and access management, Intune will enforce device-level policies, and Microsoft Purview will extend its data governance and compliance capabilities to agent activity. This means that an enterprise could, in theory, allow employees to run AI agents on their corporate machines, including powerful, autonomous agents that execute code and manage files, while maintaining the same kind of centralized visibility and control that IT departments currently have over traditional applications.
Microsoft described the identity layer in its official blog: “Windows assigns agents a local ID or a cloud provisioned identity backed by Entra and attributes all activity from the container to that identity, so you can clearly differentiate human from agent.” For regulated industries such as financial services, healthcare, and government, the ability to produce an audit trail that distinguishes between human actions and agent actions on the same machine could prove to be a regulatory requirement, not merely a nice-to-have feature. With every agent action attributable to a specific identity, and every containment boundary enforceable through the same policy infrastructure that already governs hundreds of millions of Windows devices, this is the architecture that could finally move AI agents from pilot programs to production.
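The value of identity attribution is easiest to see in an audit log where each record names its actor. This Python sketch assumes a made-up record format; the `AuditEvent` fields and the identity strings are hypothetical and are not drawn from Entra, Purview, or any published MXC log schema.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class AuditEvent:
    actor_id: str    # Hypothetical identity string, e.g. "agent:build-helper"
    actor_kind: str  # "human" or "agent"
    action: str
    target: str


def actions_by_agent(events: list[AuditEvent]) -> Counter:
    """Count logged actions per agent identity, ignoring human activity."""
    return Counter(e.actor_id for e in events if e.actor_kind == "agent")


# Example log mixing human and agent activity on the same machine.
log = [
    AuditEvent("user:alice", "human", "open", "/home/dev/project/README.md"),
    AuditEvent("agent:build-helper", "agent", "write", "/home/dev/project/out.log"),
    AuditEvent("agent:build-helper", "agent", "read", "/home/dev/project/src/main.py"),
]
```

Because every record carries an actor, a compliance query such as `actions_by_agent(log)` can separate what the agent did from what the person did without guessing from timestamps or process names.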
OpenAI, Nvidia, Manus, and Nous Research are already building on MXC — and that changes the calculus
Platform announcements at developer conferences are often aspirational. What distinguishes the MXC launch is the breadth and specificity of the partners already building on it. Microsoft named five: OpenAI, Nvidia, Manus, Nous Research (maker of the Hermes agent), and the OpenClaw open-source project. Each is integrating MXC in a distinct way that illuminates a different use case for the technology.
OpenAI’s involvement is particularly striking. David Wiesen, a member of OpenAI’s technical staff, said that “working with Microsoft on the Microsoft Execution Containers (MXC) allows us to explore new patterns for AI agents to safely and efficiently generate and execute code.” He added that by combining Codex’s capabilities with MXC’s execution environment, the goal is “to help developers move from intent to reliable execution faster, while maintaining the security and control enterprises need.” The reference to Codex — OpenAI’s code-generation agent — suggests that MXC could become the default execution environment for one of the most widely anticipated agent products in the industry.
Nvidia is bringing its OpenShell framework to Windows built on MXC, providing what Microsoft described as “an easy-to-deploy package for autonomous, always-on agents safely.” Manus, the Chinese-born AI agent startup that gained viral attention earlier this year, is also integrating. Tao Zhang, Manus’s Chief Product Officer, said that MXC “gives developers a policy-driven way to define what an agent can access and enforce those boundaries at runtime, so more autonomous agents can operate safely in enterprise environments.” And Dillon Rolnick, the CEO of Nous Research, offered what may be the most concise articulation of why MXC matters: “Continuously-running local agents, like Hermes Agent, require intentional isolation. Developers need control over what an agent can access and trust that those controls will hold.”
How an open-source agent framework became Microsoft’s proving ground for AI safety on Windows
One of the more revealing stories behind the MXC announcement involves OpenClaw. During the press pre-briefing, a Microsoft developer described how the partnership came together organically — Peter Steinberger, OpenClaw’s creator, sent him a direct message in January expressing interest in collaborating. What began as a casual conversation evolved into a full-fledged platform partnership, with Microsoft developers contributing to the OpenClaw Windows companion app, built as a native WinUI application rather than a wrapped web app.
The OpenClaw integration serves as what Scott called “the ultimate test app for all the stuff that [the Windows platform team] is making.” If OpenClaw — which by its nature gives agents broad autonomy to execute tasks on a user’s machine — can run securely within MXC’s containment boundaries, then the containment system is robust enough for any agent. Scott explained the philosophy driving the work: “Think of OpenClaw Windows as the ultimate test app… If OpenClaw can succeed on Windows, that means that the Linux support is there, the container support is there, the containment is there.”
The companion app demonstrates the full spectrum of MXC’s enterprise controls (file permissions, network access, screen capture restrictions, location data), all manageable centrally through Intune policies. Microsoft donated the app to the OpenClaw project and plans to continue contributing to it as open source. As one member of the Windows leadership team put it during the briefing: “All agents, all comers, everyone is welcome on Windows… It’s going to run great on Windows, because the primitives are there. The base of the pyramid is solid.”
Building containment into the OS gives Microsoft a strategic edge over Apple’s walled garden and Google’s cloud-first model
MXC arrives at a moment when the technology industry is grappling with a fundamental tension. AI agents represent what may be the most significant new category of software since mobile applications, and every major technology company is racing to build them. But the security and governance infrastructure required to deploy these agents responsibly in enterprise environments barely exists. Microsoft’s approach is distinctive because it locates the trust layer at the operating system level rather than in the agent framework, the model provider, or a third-party security product.
This is a deliberate architectural choice. By building containment into Windows itself, Microsoft ensures that the security guarantees hold regardless of which agent, which model, or which framework a developer chooses.
It also means that the hundreds of millions of Windows devices already managed through Intune and secured through Defender can, in principle, become agent-ready through a software update rather than a rip-and-replace deployment.
Apple’s approach to AI agents leans heavily on its walled-garden ecosystem, offering security through restriction — limiting which agents can run and what they can do. Google’s approach, centered on its cloud infrastructure, offers security through centralization. Microsoft’s approach offers security through declaration and enforcement — allowing any agent to run, but containing its impact through OS-level policy.
For enterprises that operate in heterogeneous environments with diverse toolchains and multiple AI providers, the Microsoft model may prove the most practical. The competitive dynamics are already shifting: with OpenAI’s Codex, Nvidia’s OpenShell, and independent agent frameworks like Manus and Hermes all building on MXC, Microsoft is positioning Windows not just as the platform where agents run, but as the platform where agents can be trusted to run.
The hardest part isn’t building the sandbox — it’s writing the policies that go inside it
MXC is available now in early preview, meaning developers can begin building against the SDK and testing containment policies. The Agent 365 integration with Defender, Entra, Intune, and Purview is scheduled for preview in July — a timeline aggressive enough to suggest that much of the engineering work is already done, but far enough out to allow for refinement based on developer feedback.
The real test, however, will come when enterprises begin deploying agents at scale on production networks. Containment is only as good as the policies that govern it, and writing effective agent policies for complex enterprise environments will be an entirely new discipline — one that IT departments have not yet developed and that no vendor has yet figured out how to teach. The technology is promising, but an empty sandbox is just an empty box. Filling it with the right rules, for the right agents, in the right contexts, will require a level of organizational sophistication that most companies are only beginning to contemplate.
Still, the significance of what Microsoft announced on Tuesday is difficult to overstate. For the first time, a major operating system vendor has proposed a comprehensive, kernel-level answer to the question of how autonomous AI software should be contained, identified, and governed on the devices where most of the world’s work actually gets done. The industry spent two years teaching agents to act. Microsoft is now betting that the bigger business — and the harder engineering problem — is teaching the operating system to watch.