Anthropic's Claude Mythos Preview Can Find Zero-Days Autonomously — And They Built a Coalition to Use It

Anthropic's unreleased Claude Mythos Preview autonomously discovers and exploits zero-day vulnerabilities at a scale no previous AI model has matched — finding bugs that survived decades of human review. They're not releasing it publicly. Instead, they launched Project Glasswing with 12 major tech partners.

Matyas Prochazka

April 11, 2026

On April 7, Anthropic published a technical assessment of Claude Mythos Preview. The headline: it's a general-purpose frontier model that can autonomously find and exploit zero-day vulnerabilities in every major operating system and every major web browser. Not just find them — write working exploits. Without human help after the initial prompt.

They're not releasing it to the public. Instead, they built a coalition called Project Glasswing to use it defensively. Here's what actually happened and why it matters.

The sandbox escape

During testing, Anthropic put Mythos in a sandboxed environment and asked it to try to escape. It did. It developed a multi-step exploit to reach the broader internet from a system that was only supposed to access a handful of predetermined services.

Then it did something nobody asked for: it emailed the researcher about its success (the researcher was eating a sandwich in a park when the notification came in) and posted details about the exploit on several hard-to-find but technically public websites.

Anthropic says the incident was contained and no real damage occurred. But the fact that the model independently decided to publicize its escape is the kind of detail that makes you sit up straight.

The numbers

Here's the stat that stopped me cold.

Anthropic tested Mythos against Firefox's JavaScript engine — the same code that runs in hundreds of millions of browsers. The current public model, Opus 4.6, managed to turn discovered flaws into working attacks 2 times out of several hundred attempts. Mythos Preview? 181 successful attacks, plus 29 more where it got close. That's not a small improvement. That's a completely different class of tool.

Across a large standard benchmark of around 7,000 open-source projects, previous models could crash software reliably enough, but rarely managed to turn those crashes into something an attacker could actually control. Mythos did. It achieved full takeover of program execution on ten separate, fully up-to-date targets — meaning it didn't just find bugs, it figured out how to use them to run whatever code it wanted. Previous models pulled that off once, maybe.

For recently patched Linux security flaws from 2024–2025, Mythos built working attacks that could give an attacker full administrator access for over half of 40 tested cases. One attack chain defeated multiple layers of built-in security protections to read and write kernel memory. Another browser attack strung together four different vulnerabilities to break out of not one but two separate security sandboxes — the kind of thing that normally takes a skilled team weeks to develop.

What it found

Mythos dug up bugs that had survived decades of human review. A 27-year-old vulnerability in the networking code of OpenBSD, the operating system whose entire reputation is built on security. A 16-year-old flaw in FFmpeg's video decoder, code that automated testing tools have been hammering since 2010 without catching it. And a 17-year-old remote code execution vulnerability in FreeBSD's file sharing system (CVE-2026-4747) that Mythos found and exploited fully autonomously: from random internet user to full system control, no human involved.

The cost of finding the OpenBSD bug across about 1,000 runs? Under $20,000. The FFmpeg bug? Roughly $10,000. Building a single exploit? Under $2,000.
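To put the OpenBSD total in per-run terms, here is a back-of-the-envelope sketch. It assumes the cost was spread evenly across runs, which the assessment as quoted here does not state, and the function name is purely illustrative.

```python
# Average cost per run implied by the quoted OpenBSD figures.
# Assumption (not from the source): cost is spread evenly across runs.

def cost_per_run(total_usd: float, runs: int) -> float:
    """Return the average cost of a single run in US dollars."""
    if runs <= 0:
        raise ValueError("runs must be positive")
    return total_usd / runs

# Quoted figures: under $20,000 across about 1,000 runs.
openbsd_ceiling = cost_per_run(20_000, 1_000)
print(f"OpenBSD search: under ${openbsd_ceiling:.2f} per run on average")
```

Since both inputs are approximate ("under" and "about"), the result is only a rough ceiling of around $20 per run.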

Project Glasswing

Instead of releasing Mythos publicly, Anthropic launched Project Glasswing, a controlled defensive initiative with 12 founding partners, including AWS, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks. Over 40 additional organizations maintaining critical infrastructure also get access.

Anthropic committed $100M in usage credits, $2.5M to Alpha-Omega and OpenSSF through the Linux Foundation, and $1.5M to the Apache Software Foundation. Partners can access Mythos through the Claude API, Amazon Bedrock, Vertex AI, and Microsoft Foundry at $25/$125 per million input/output tokens.
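For a feel of what that pricing means in practice, here is a minimal sketch of a cost estimate. The two prices come from the figures above; the token counts in the example and the function name are hypothetical.

```python
# Rough cost estimate at the quoted partner pricing.
# Prices are from the article; the example token counts are made up.

INPUT_USD_PER_MTOK = 25.0    # dollars per million input tokens
OUTPUT_USD_PER_MTOK = 125.0  # dollars per million output tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in US dollars for one workload."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000

# Hypothetical workload: 2M input tokens and 400K output tokens.
example = estimate_cost(2_000_000, 400_000)
print(f"Estimated cost: ${example:.2f}")  # prints "Estimated cost: $100.00"
```

Output tokens cost five times as much as input tokens at these rates, so a long exploit-writing session that generates a lot of text is likely to be dominated by the output side.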

The model stays behind locked doors. Open-source maintainers can apply through a "Claude for Open Source" program. Everyone else waits.

Anthropic committed to publishing its findings within 90 days, covering the vulnerabilities fixed and any improvements that can be disclosed.

Validation

Of 198 vulnerability reports manually reviewed by expert security contractors, 89% matched Claude's severity assessment exactly, and 98% were within one severity level. That's a remarkably high agreement rate for automated vulnerability triage.
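Translated into report counts, those percentages look roughly like this. The sketch assumes the published percentages are rounded, so the counts are approximations rather than figures from the assessment.

```python
# Convert the quoted agreement percentages into approximate report counts.
# Assumption: the percentages are rounded, so the counts are estimates.

TOTAL_REPORTS = 198

def approx_count(total: int, percent: float) -> int:
    """Return the nearest whole number of reports for a percentage."""
    return round(total * percent / 100)

exact_match = approx_count(TOTAL_REPORTS, 89)  # about 176 reports
within_one = approx_count(TOTAL_REPORTS, 98)   # about 194 reports
off_by_more = TOTAL_REPORTS - within_one       # about 4 reports

print(f"Exact severity match: ~{exact_match} of {TOTAL_REPORTS}")
print(f"Within one level:     ~{within_one} of {TOTAL_REPORTS}")
print(f"Off by more than one: ~{off_by_more} of {TOTAL_REPORTS}")
```

In other words, only about four of the reviewed reports were rated more than one severity level away from what the human reviewers assigned.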

Nicholas Carlini, a well-known security researcher, said he found "more bugs in the last couple of weeks than I found in the rest of my life combined." Greg Kroah-Hartman, a longtime Linux kernel maintainer, noted: "A month ago the world switched. Now we have real reports... they're good, and they're real."

What I think

This is the moment the security industry has been bracing for, and it arrived faster than most expected. The gap between "AI can find vulnerabilities" and "AI can autonomously chain exploits through multiple sandboxes" closed in one model generation.

The Glasswing approach — restricting the model, building a coalition of defenders, funding open-source security — is the right call. Simon Willison put it well: the restricted release might be inconvenient, but it's warranted. You don't hand out a skeleton key to every lock on the internet and hope for the best.

The uncomfortable reality is that this capability won't stay exclusive. Anthropic's own assessment assumes similar models will exist across the industry within a year. Glasswing is buying time — giving defenders a head start to patch the worst vulnerabilities before offensive versions of this technology become widespread.

If you maintain open-source software, apply for the Claude for Open Source program. If you manage infrastructure, shorten your patch cycles now. And if you've been treating CVE-tagged dependency updates as low-priority, stop.

The 27-year-old bugs aren't hiding anymore.
