The Anthropic Mythos Leak: When “Too Dangerous to Release” AI Gets Leaked Anyway

There’s a certain irony to building the most powerful cybersecurity AI in the world and then watching it leak through one of your own third-party vendors. But that’s exactly what happened to Anthropic this month, and the implications go way beyond embarrassment.

Let me walk you through what actually happened, why it matters more than you might think, and what this tells us about the future of AI security, because this story is a lot more interesting than the headlines suggest.

The Model That Was Never Supposed to Escape

Back in early April 2026, Anthropic announced something extraordinary: Claude Mythos Preview, an AI model so powerful at finding and exploiting security vulnerabilities that the company decided it was too dangerous for public release.

And they weren’t exaggerating for effect. According to Anthropic’s own testing, Mythos can autonomously discover and exploit zero-day vulnerabilities in every major operating system and web browser. We’re talking about the kind of capabilities that would make nation-state hackers salivate.

The model found a 27-year-old vulnerability in OpenBSD, an operating system literally famous for its security. It wrote browser exploits that chained together four different vulnerabilities, complete with complex JIT heap sprays that escaped both renderer and OS sandboxes. Mozilla used it to identify and patch 271 vulnerabilities in Firefox in a matter of weeks.

This isn’t your standard AI chatbot that sometimes gets facts wrong. This is a tool that can, in Anthropic’s words, “enable dangerous cyberattacks” if it falls into the wrong hands.

So naturally, Anthropic was extremely careful about who could access it. They launched something called Project Glasswing, a highly restricted program that gave access to exactly 40 handpicked organizations. The list read like a who’s who of tech and finance: Amazon, Apple, Google, Microsoft, Nvidia, JP Morgan Chase, Goldman Sachs, Citigroup, Bank of America, Morgan Stanley.

The idea was simple: give these companies early access so they could use Mythos to find and fix vulnerabilities in their own systems before releasing it more broadly. Defensive security, not offensive attacks.

Great plan. Except for one small problem.

The Leak That Happened on Day One

On April 7, 2026, Anthropic publicly announced Mythos and Project Glasswing. That same day (literally the same day), a small group of unauthorized users gained access to the model.

And they didn’t do it through some sophisticated hack involving quantum computers and zero-day exploits. They essentially guessed the URL.

Here’s how it went down, based on Bloomberg’s reporting: A handful of users in a private Discord channel dedicated to finding unreleased AI models combined a few key pieces of information. One member of the group worked for a third-party contractor that had access to Anthropic’s systems. Using that connection, along with knowledge of how Anthropic typically formats URLs for its models (information that had previously leaked from AI training startup Mercor), the group made an educated guess about where Mythos might be hosted.

And they were right.
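Bloomberg’s reporting doesn’t spell out the exact mechanics, but the underlying technique (enumerate plausible endpoint names, probe them, see what answers) is well understood. Here’s a minimal sketch of that pattern in Python, framed defensively: the hostnames, path templates, and codenames are invented placeholders, and the legitimate use is auditing your own infrastructure for guessable, unauthenticated endpoints.

```python
import itertools
import requests  # pip install requests

# Invented placeholder patterns; the real Anthropic URL scheme is not
# public. The point is that short, predictable naming conventions make
# endpoints enumerable.
BASE_HOSTS = ["vendor-sandbox.example.com", "preview.example.com"]
PATH_TEMPLATES = ["/models/{name}", "/v1/preview/{name}", "/internal/{name}"]
CODENAMES = ["capybara", "mythos", "mythos-preview"]

def probe_candidates(timeout: float = 5.0) -> list[str]:
    """Return candidate URLs that answer without demanding authentication."""
    exposed = []
    for host, template, name in itertools.product(
        BASE_HOSTS, PATH_TEMPLATES, CODENAMES
    ):
        url = f"https://{host}{template.format(name=name)}"
        try:
            resp = requests.head(url, timeout=timeout, allow_redirects=False)
        except requests.RequestException:
            continue  # unreachable is fine; we only care about live hits
        # 401/403 means an auth layer answered; 404 means nothing is there.
        # Anything else (200, 302, ...) is reachable by anyone who guesses.
        if resp.status_code not in (401, 403, 404):
            exposed.append(url)
    return exposed

if __name__ == "__main__":
    for url in probe_candidates():
        print("Potentially exposed:", url)
```

Pointed at your own domains, a script like this is a cheap pre-launch check: if it finds anything, so can a Discord group.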

Since gaining access, the group has been using Mythos regularly. They provided Bloomberg with screenshots and even a live demonstration to prove it. As of the latest reports, they still have access.

To be crystal clear: there’s no evidence this group is using Mythos for malicious purposes. According to people familiar with the situation, they’re enthusiasts who just want to try unreleased models. But that’s sort of missing the point.

Why “We’re Not Using It Maliciously” Doesn’t Really Help

Let me share something that cybersecurity expert David Lindner told Fortune, and it’s stuck with me: “If some group, some random Discord online forum, got access to it, it’s already been breached by China.”

That might sound alarmist, but think about it logically. If a Discord group of AI enthusiasts could access it through a third-party vendor on launch day, what do you think sophisticated state-sponsored actors with actual resources could do?

The uncomfortable truth is that once you’ve demonstrated a path to access, even if you’re not exploiting it, you’ve essentially published a roadmap for everyone else. Security through obscurity only works if things actually stay obscure.

And here’s another angle that nobody seems to be talking about enough: intention can change. Today’s curious hobbyist who just wants to experiment with cool AI models could be tomorrow’s financially motivated individual who realizes they’re sitting on something valuable. Or their Discord account gets compromised. Or they mention it to the wrong person at a conference after a few drinks.

The point isn’t that these specific people are threats. It’s that the model being accessible outside of Anthropic’s controlled environment fundamentally changes the security equation.

The “It Was Bound to Happen” Problem

When David Lindner said “it was bound to happen,” he was pointing to something that deserves more attention: the paradox of exclusive access.

Anthropic limited Mythos to 40 companies to keep it secure. But here’s the thing: those 40 companies employ hundreds of thousands of people collectively. And many of those people would need some level of access to actually use Mythos for its intended purpose of finding vulnerabilities in their systems.

Even if we assume each company only gave access to 50 people (which is probably conservative for organizations the size of Microsoft or Google), that’s already 2,000 people who can potentially touch the model. And that doesn’t count the third-party vendors, contractors, and service providers that inevitably get involved when enterprises deploy new technology.

In security, we have this principle: your attack surface is determined by your largest exposure, not your average one. It doesn’t matter if 39 companies have perfect security if the 40th company’s vendor has weak access controls.
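To make that arithmetic concrete, here’s a toy model in Python. Every number is an illustrative assumption, not a reported figure; the takeaway is just how fast the exposed population grows and how thoroughly the weakest link dominates.

```python
# Back-of-envelope exposure math for a restricted-access program.
# All numbers are illustrative assumptions, not reported figures.
orgs = 40            # Project Glasswing participants
users_per_org = 50   # conservative per-company head count with access
vendors_per_org = 3  # third parties that typically touch a deployment

direct_users = orgs * users_per_org   # 2,000 people
vendor_orgs = orgs * vendors_per_org  # 120 more organizations in the chain

print(f"Direct users: {direct_users}")
print(f"Vendor organizations in the trust chain: {vendor_orgs}")

# Weakest-link principle: overall exposure tracks the worst access
# control in the chain, not the average one. Model each org as an
# independent chance of holding the line; one sloppy vendor removes
# more probability mass than any single careful org.
control_strength = [0.99] * 39 + [0.60]
p_contained = 1.0
for p in control_strength:
    p_contained *= p
print(f"P(access stays contained) ≈ {p_contained:.2f}")  # ≈ 0.41
```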

What Actually Leaked (Besides the Model)

Here’s where the story gets even more interesting. Before the unauthorized access, there was an earlier leak that revealed Mythos’s existence in the first place.

Back in late March, security researchers Roy Paz and Alexandre Pauwels discovered that Anthropic had accidentally left close to 3,000 unpublished documents publicly accessible through a misconfigured content management system. Among those documents was a draft blog post detailing Mythos (initially called “Capybara” internally) and its capabilities.

That leak was pure human error: someone forgot to set the correct permissions in their CMS, and suddenly all these draft documents were searchable and retrievable by anyone who knew where to look.

The draft described Capybara/Mythos as representing a completely new tier of model, sitting above even Anthropic’s flagship Opus line. “Larger and more intelligent than our Opus models which were, until now, our most powerful,” the document stated.

It also detailed the model’s unprecedented cybersecurity capabilities, essentially providing a preview of exactly why this model would be so dangerous if it leaked.

Which brings us to a somewhat dark irony: the document explaining why Mythos needed to be kept secret leaked before the model itself did.

The Technical Capabilities That Make This Scary

Let me explain why this isn’t just another “AI can write code” story, because the technical capabilities here are genuinely remarkable and concerning.

Mythos doesn’t just find simple bugs. According to Anthropic’s technical documentation (which is now public on their red teaming site), the model can:

Find ancient vulnerabilities: That 27-year-old OpenBSD bug wasn’t sitting there waiting to be discovered with a simple code review. It was subtle enough that it had survived decades of security-focused development.

Chain exploits: When Mythos found four separate vulnerabilities and chained them together to escape browser sandboxes, that’s not script-kiddie stuff. That’s sophisticated exploitation that typically requires deep systems knowledge and creativity.

Autonomously escalate privileges: The model can identify subtle race conditions and KASLR bypasses to achieve local privilege escalation on Linux. If you know what those terms mean, you understand why that’s impressive. If you don’t, just know that these are the kinds of techniques that separate amateur hackers from professionals.

Reverse-engineer exploits: Mythos can take closed-source software and figure out how to exploit it, even without source code access.

The key word in all of this is “autonomously.” This isn’t an AI assistant helping a human security researcher. This is an AI conducting end-to-end vulnerability research and exploit development on its own.

The Vendor Security Problem That Nobody Wants to Talk About

Anthropic’s official statement is carefully worded: “We’re investigating a report claiming unauthorized access to Claude Mythos Preview through one of our third-party vendor environments.”

Notice what they’re not saying. They’re not saying their systems were breached. They’re saying the access came through a vendor environment.

This distinction matters legally and technically, but from a security outcomes perspective, it’s mostly irrelevant. The model was accessed. The access was unauthorized. The access continues.

Gabrielle Hempel, Security Operations Strategist at Exabeam, nailed the issue: “While everyone seems focused on securing against sophisticated nation-state actors, we’ve increasingly seen third-party access paths becoming the weakest link.”

She continued: “Any time you build a high-capability system and expose it even to a semi-distributed environment (partners, contractors, ‘trusted’ ecosystems), you’re expanding your attack surface beyond what you can realistically control.”

This is the central tension in enterprise software deployment. You can’t be useful without being accessible, but you can’t be accessible without creating vulnerability. And when what you’re distributing is literally a tool designed to find and exploit vulnerabilities, that tension becomes almost impossible to manage.
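What would managing that tension look like in practice? At minimum, no endpoint can rely on URL secrecy as its access control: every request should carry a short-lived, org-scoped, signed credential checked against an explicit allowlist. Here’s a toy sketch of that principle in Python. It illustrates the idea only; it is not Anthropic’s actual design.

```python
import hmac
import time
from dataclasses import dataclass

# Toy illustration: a guessable URL should grant nothing by itself.
# Every request must present a signed, expiring, org-scoped token.

SECRET = b"rotate-me-regularly"          # lives in a secrets manager in practice
ALLOWED_ORGS = {"mozilla", "jpmorgan"}   # explicit allowlist, not "anyone with the link"

@dataclass
class AccessToken:
    org: str
    expires_at: float
    signature: str

def sign(org: str, expires_at: float) -> str:
    msg = f"{org}:{expires_at}".encode()
    return hmac.new(SECRET, msg, "sha256").hexdigest()

def issue_token(org: str, ttl_seconds: int = 3600) -> AccessToken:
    expires_at = time.time() + ttl_seconds
    return AccessToken(org, expires_at, sign(org, expires_at))

def authorize(token: AccessToken) -> bool:
    """Reject anything that isn't allowlisted, fresh, and correctly signed."""
    if token.org not in ALLOWED_ORGS:
        return False
    if time.time() > token.expires_at:
        return False
    return hmac.compare_digest(token.signature, sign(token.org, token.expires_at))

# A guessed URL without a valid token gets nothing:
assert not authorize(AccessToken("discord-group", time.time() + 60, "deadbeef"))
assert authorize(issue_token("mozilla"))
```

The detail worth noting is the expiry: even a leaked token has a bounded blast radius, which is exactly the property a guessable, permanently live URL lacks.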

The Political Dimension You’re Not Hearing About

Here’s a wrinkle that makes this even messier: the leak happened one day after President Trump said on CNBC that a Pentagon deal with Anthropic was “possible” and that the company was “shaping up.”

Meanwhile, Anthropic is actively suing the Department of Defense over being blacklisted as a supply chain risk. The core of their legal argument? That they apply rigorous safety and access controls to their most capable models.

You can see the problem.

If you’re Anthropic’s lawyers, you just watched your main argument get significantly weaker. If you’re Pentagon officials skeptical of Anthropic’s security posture, you just got ammunition for your position. If you’re a lawmaker worried about AI safety and control, you just got a real-world example of why concerns about “dual-use” AI capabilities aren’t theoretical.

The timing couldn’t have been worse for Anthropic’s government relations efforts.

What This Means for AI Security Going Forward

Let’s zoom out for a second and talk about what this incident reveals about the future of AI security, because it’s not encouraging.

The time compression is real. From public announcement to unauthorized access: zero days. There’s no grace period anymore for companies to get security right. The window between “we built something powerful” and “someone unauthorized has it” is shrinking toward zero.

Insider threats and vendor security are the new perimeter. Traditional security focused on keeping bad guys outside your network. But when your contractors can enable access through educated URL guessing, the concept of a security perimeter starts to feel quaint.

Restricting access to small groups doesn’t scale. Project Glasswing’s model of limiting access to 40 trusted organizations sounds reasonable until you realize those 40 organizations have their own vendors, contractors, and employees. Trust isn’t transitive in the way this model assumes.

Intent doesn’t matter for dual-use capabilities. The Discord group might not be malicious, but Mythos doesn’t care about their intentions. A tool this powerful is dangerous regardless of who holds it or why.

The OpenAI Controversy You Should Know About

Before we go further, I need to mention something that’s been bubbling in the background: OpenAI’s Sam Altman called Anthropic’s Mythos promotion “fear-based marketing.”

His argument, essentially, is that Anthropic is overhyping the danger to generate buzz and position themselves as the “responsible” AI company.

And look, there’s probably some truth to the marketing angle. Anthropic has been very public about their safety-first approach, and making a big deal about Mythos being too dangerous for public release certainly reinforces that brand position.

But here’s the thing: even if Altman’s right about the marketing motivation, the capabilities appear to be real. Mozilla didn’t find 271 vulnerabilities because of hype. OpenBSD’s 27-year-old bug wasn’t discovered through fear-based marketing.

Whether Anthropic is being appropriately cautious or opportunistically dramatic is kind of secondary to the question of whether the capabilities themselves are concerning. And the evidence suggests they are.

What Businesses Should Actually Do About This

If you’re running a business, particularly one in tech or finance, here’s what this story means for you practically:

Vendor security audits just became more critical. The Mythos leak happened through a third-party vendor. How well do you actually know the security practices of your vendors? Do you have audit rights? Do you exercise them?

Assume models like this are in the wild. Whether it’s Mythos specifically or other sophisticated AI security tools, you should operate under the assumption that attackers have access to AI-powered vulnerability discovery. That changes your defensive timeline. Patching can’t wait for the next maintenance window.

Consider using Mythos defensively if you can. If you’re one of the organizations with legitimate access through Project Glasswing or can get it, use it to find your vulnerabilities before someone else does. The race is on.

Third-party code review becomes more important. If AI can autonomously find vulnerabilities in your code, the static application security testing (SAST) tools you’ve been using might not be enough anymore. Consider more comprehensive audits, especially of critical systems.
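One concrete way to act on that: keep running SAST on every change as a floor, gate merges on high-severity findings, and layer deeper manual or AI-assisted audits on top. Here’s a minimal sketch using Bandit, an open-source rule-based SAST tool for Python; rule-based scanners catch known pattern classes, which is precisely the boundary that autonomous AI discovery goes past.

```python
import json
import subprocess
import sys

def run_bandit(target_dir: str) -> list[dict]:
    """Run Bandit over a source tree and return its findings as dicts."""
    result = subprocess.run(
        ["bandit", "-r", target_dir, "-f", "json", "-q"],
        capture_output=True, text=True,
    )
    # Bandit exits nonzero when it finds issues, so don't check returncode.
    report = json.loads(result.stdout)
    return report.get("results", [])

if __name__ == "__main__":
    findings = run_bandit(sys.argv[1] if len(sys.argv) > 1 else "src/")
    high = [f for f in findings if f["issue_severity"] == "HIGH"]
    for f in high:
        print(f"{f['filename']}:{f['line_number']}  {f['issue_text']}")
    sys.exit(1 if high else 0)  # fail the build on any high-severity hit
```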

The Bigger Picture: What Happens When AI Gets Really Good at Hacking?

Let’s talk about the elephant in the room that nobody wants to address directly: we’re approaching a world where AI models can autonomously discover and exploit vulnerabilities faster than humans can patch them.

Mythos represents a step change in those capabilities. But it won’t be the last step. There will be more powerful models. And eventually, those models will be accessible to more people, through more channels, with even less control.

Traditional security has always been a cat-and-mouse game between attackers and defenders. But it relied on certain fundamental constraints: humans get tired, humans make mistakes, humans have limited time and resources.

AI doesn’t get tired. It can test millions of variations without losing focus. It can work 24/7 across thousands of targets simultaneously. And as these models get better, the advantage tilts increasingly toward offense.

The defensive response needs to evolve. That means:

  • More automation in defensive security: If attackers have AI, defenders need it too. Manual code review and penetration testing won’t cut it (a minimal example is sketched after this list).
  • Faster patch cycles: The traditional quarterly patch schedule is already obsolete. When AI can find and exploit vulnerabilities in hours, patches need to deploy in hours.
  • Fundamental architectural changes: We might need to reconsider how we build software entirely. Defense in depth, zero trust, microsegmentation: these aren’t optional anymore.
  • Better vendor security standards: The third-party vendor problem isn’t going away. Industry-wide standards for vendor security need to get a lot more rigorous.
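
As one small, concrete slice of the first two items: check your pinned dependencies against a public vulnerability database on every CI run instead of every quarter. The sketch below uses OSV.dev’s documented /v1/query endpoint; a real pipeline would parse a lockfile and open patch PRs automatically.

```python
import requests  # pip install requests

OSV_URL = "https://api.osv.dev/v1/query"

def check_package(name: str, version: str, ecosystem: str = "PyPI") -> list[str]:
    """Return IDs of known vulnerabilities affecting name==version."""
    resp = requests.post(OSV_URL, json={
        "version": version,
        "package": {"name": name, "ecosystem": ecosystem},
    }, timeout=10)
    resp.raise_for_status()
    return [v["id"] for v in resp.json().get("vulns", [])]

if __name__ == "__main__":
    # Pinned deps would normally come from a lockfile, not a hardcoded list.
    pinned = [("jinja2", "2.4.1"), ("requests", "2.31.0")]
    found_vulns = False
    for name, version in pinned:
        vulns = check_package(name, version)
        if vulns:
            found_vulns = True
            print(f"{name}=={version}: {', '.join(vulns)}")
    raise SystemExit(1 if found_vulns else 0)
```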

The Questions Nobody Can Answer Yet

Here’s what keeps me up at night about this story: there are huge questions we simply don’t have answers to yet.

How many other unreleased models have been accessed this way? The Discord group apparently focuses on finding unreleased models generally, not just Mythos. What else have they found? What else have other groups found?

Is there really no evidence of malicious use? Anthropic says the access hasn’t extended beyond the vendor environment and they’ve found no evidence of impact to their systems. But absence of evidence isn’t evidence of absence. Could someone have accessed Mythos, used it to find vulnerabilities, and we just haven’t detected it yet?

What happens when this is easy? Right now, accessing Mythos requires some insider knowledge and technical sophistication. But what happens in six months when the techniques are better documented? When the Discord groups are bigger and more organized?

Can this actually be secured? Maybe the uncomfortable truth is that models with these capabilities simply cannot be kept restricted effectively. Maybe the only real option is to accept that they’ll leak and focus entirely on defensive applications.

The Leak That Proves the Point

There’s a beautiful paradox at the heart of this story. Anthropic built Mythos specifically because they believe AI-powered cyberattacks are a serious threat that needs to be addressed proactively. They created Project Glasswing to help organizations defend themselves before these capabilities became widespread.

And then Mythos itself leaked, proving exactly why those concerns were justified.

The leak demonstrates that even with good intentions, careful planning, and restriction to a small group of trusted partners, keeping powerful AI capabilities truly restricted is incredibly difficult.

Which raises an uncomfortable question: if Anthropic, a company that thinks more about AI safety than probably any other major AI lab, can’t keep their most sensitive model secure, what does that say about the rest of the industry?

What Anthropic Should Do Next (But Probably Won’t)

If I were advising Anthropic, here’s what I’d push for, though I recognize some of this is politically impossible:

Full transparency about the vendor. Name the vendor. Explain exactly what happened. This would burn a business relationship, but it would also help the industry understand and prevent similar incidents.

Technical details on the access method. Beyond “they guessed the URL,” what were the specific failures that allowed this? Publishing a detailed postmortem would be valuable for the security community, even if it’s embarrassing.

Expanded defensive access. If Mythos is in the wild anyway, maybe it’s time to expand Project Glasswing significantly. Get it into the hands of every organization that maintains critical infrastructure or widely used software. At least level the playing field.

Investment in vendor security standards. Put money and engineering resources toward helping vendors improve their security. This isn’t just Anthropic’s problem; it’s an industry-wide issue.

Will they do any of this? Probably not. The legal, competitive, and reputational risks are too high. But it would be the right thing to do.

The Real Lesson: Security Is Harder Than Ever

I’ve been covering technology and cybersecurity for years, and this story represents something I’ve been watching with increasing concern: the gap between our security practices and the threats we face is widening, not closing.

We’re still operating with security models designed for a pre-AI world. Access controls. Perimeter security. Vendor trust frameworks. All of these assume that threats are primarily human, that they scale linearly, and that we have time to respond.

None of those assumptions hold anymore.

The Mythos leak isn’t just about one Discord group getting access to one unreleased model. It’s a preview of what happens when AI capabilities advance faster than our ability to govern, control, and secure them.

And here’s the uncomfortable truth: this is probably going to keep happening. More leaks. More capable models. More sophisticated attacks. More vendor security failures. More zero-days discovered by AI and exploited before patches can deploy.

The question isn’t whether AI will fundamentally change the cybersecurity landscape. It already has. The question is whether we can adapt our defensive practices fast enough to keep up.

Based on the Mythos leak, I’m not sure we can.

The Bottom Line

A small Discord group accessed a model Anthropic called “too dangerous to release” on the same day it was announced, using a combination of insider access and educated guessing. They’ve been using it regularly ever since. Anthropic is investigating, but the model remains accessible to unauthorized users.

If that doesn’t concern you, it should.

This isn’t a story about a security failure at one company. It’s a story about the fundamental difficulty of controlling powerful AI capabilities in a world of complex vendor relationships, distributed access requirements, and sophisticated adversaries.

Anthropic did more than most companies would to keep Mythos secure. They limited access to 40 carefully chosen organizations. They built it specifically for defensive purposes. They were transparent about the risks.

And it leaked anyway, on day one.

That’s not an indictment of Anthropic specifically. It’s an indictment of our current approach to AI security generally.

The technology is advancing faster than our ability to secure it. The access requirements for useful deployment conflict with the restrictions needed for safety. The vendor ecosystems we rely on have security practices that range from excellent to terrifying.

And somewhere in all of that complexity, a Discord group is using one of the world’s most powerful hacking tools, while we all hope they’re really just curious enthusiasts and not something worse.

Welcome to the future of AI security. It’s messier than anyone wanted to admit.

