Claude Mythos Preview: The AI that finds software flaws, and why you can't have it yet

An AI scored 83.1% finding software bugs, far outperforming others. Here's what that means for your business cybersecurity strategy.

Here’s a number that should make you pause: 83.1%. That’s the share of cybersecurity vulnerabilities a new AI model, Claude Mythos Preview, managed to reproduce on a challenging benchmark called CyberGym. For context, the previous top-performing model, Claude Opus 4.6, managed 66.6%. That’s a 16.5-point jump.

This isn’t just a small improvement. This is the kind of leap that fundamentally changes how we think about digital security. Imagine a tireless intern who can sift through millions of lines of code, looking for weaknesses, and finding not just the obvious ones but the obscure, deeply buried ones.

What kind of flaws are we talking about? Flaws that have been hiding for years. Mythos Preview autonomously discovered a vulnerability in OpenBSD that had been present for 27 years. It also found a 16-year-old bug in FFmpeg. These aren’t minor glitches; these are weaknesses that sat undetected in widely used software, open to potential exploitation, for decades.

This is the core of the problem: software, by its very nature, is complex. Even the most skilled human developers, working with the best tools, can miss things. Codebases grow, teams change, and over time, vulnerabilities creep in. They’re like tiny cracks in a foundation, often invisible until the whole structure starts to shake.

Most people think of AI in cybersecurity as a shield, blocking incoming attacks. They imagine AI systems spotting malicious patterns in network traffic or flagging suspicious emails. And yes, AI is doing that, and doing it well. But what Mythos Preview represents is a shift to AI as a proactive sword, actively hunting for weaknesses before they can be exploited. That shift cuts both ways: code-generation models can also introduce new attack vectors, as seen with slopsquatting, where attackers register package names that AI models hallucinate.

The wrong way to think about this is that Anthropic just built a super-powered vulnerability scanner. That’s like saying a scalpel is just a sharp piece of metal. It misses the precision, the intent, and the potential.

The real story here is about autonomous discovery. Mythos Preview isn’t just checking against a known list of bad patterns. It’s analyzing code structure, understanding logic, and inferring potential exploits based on subtle deviations. It’s learning the language of software flaws.

This is why Anthropic has chosen not to release Mythos Preview publicly, at least not yet. Releasing a tool this powerful without robust safeguards could be incredibly dangerous. Imagine that skilled intern suddenly having access to every company’s digital blueprints. It’s a double-edged sword, and they’re being cautious about how it’s wielded.

Why this matters: The ability of AI to find zero-day vulnerabilities (flaws unknown to vendors and the public) at scale is a profound shift. It means the offensive capabilities of attackers could soon be matched, or even surpassed, by defensive capabilities.

Anthropic is also launching Project Glasswing, a $100 million initiative to bolster defensive cybersecurity efforts. They’re partnering with major players like AWS, Apple, Google, and Microsoft. This suggests a strategy where advanced AI capabilities are being channeled into building stronger digital defenses, rather than simply releasing powerful offensive tools.

So, what does this mean for your business strategy right now?

First, you need to accept that the threat landscape is changing, and changing fast. The idea of “patching regularly” is no longer sufficient. A 27-year-old flaw found by AI means that even well-maintained systems are likely sitting on decades of undiscovered risks.

The typical workflow for dealing with software vulnerabilities looks something like this: a security researcher finds a bug, reports it, the vendor verifies and patches it, and then users apply the patch. With AI like Mythos Preview, that first step – finding the bug – can be done at an unprecedented speed and scale.

The new workflow will likely involve AI identifying potential flaws, human security experts then verifying and prioritizing these findings (because not every AI-found bug is a critical threat), and then the patching process. This requires a significant upgrade in your internal security team’s capabilities and tools.
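The verify-then-prioritize step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Finding fields, the 0-10 severity scale, and the sample data are all hypothetical, and nothing here reflects how Mythos Preview or any real tool reports results.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Finding:
    # All fields are hypothetical, chosen only for this sketch.
    finding_id: str
    component: str
    severity: float                   # assumed 0-10 score set by a human reviewer
    verified: Optional[bool] = None   # None = not yet reviewed by a human


def triage(findings: List[Finding]) -> Tuple[List[Finding], List[Finding]]:
    """Split AI-reported findings into a patch queue and a review queue."""
    # Human-confirmed findings go to patching, most severe first.
    patch_queue = sorted(
        (f for f in findings if f.verified is True),
        key=lambda f: f.severity,
        reverse=True,
    )
    # Findings nobody has looked at yet still need a human reviewer.
    review_queue = [f for f in findings if f.verified is None]
    # Findings a human rejected (verified is False) are dropped.
    return patch_queue, review_queue


findings = [
    Finding("F-1", "image-parser", 9.1, verified=True),
    Finding("F-2", "logging", 3.2, verified=True),
    Finding("F-3", "auth", 7.5),
    Finding("F-4", "cli", 5.0, verified=False),
]

patch_queue, review_queue = triage(findings)
print([f.finding_id for f in patch_queue])   # ['F-1', 'F-2']
print([f.finding_id for f in review_queue])  # ['F-3']
```

The point of the split is that nothing reaches the patch queue without a human decision, which is where the team capability upgrade comes in.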

Consider your current software supply chain. How much third-party code are you running? Each component, from your operating system to your cloud services to your off-the-shelf applications, is a potential entry point. Mythos Preview’s ability to find obscure, long-standing bugs means you can no longer rely solely on vendor assurances or standard security scans.

This AI’s performance on CyberGym is a stark indicator.

Claude Mythos Preview achieved an 83.1% vulnerability reproduction rate on CyberGym, significantly outperforming Claude Opus 4.6 at 66.6% and an older benchmark AI at 30.1%.
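To put those figures side by side, here is a small arithmetic sketch. The scores come from the numbers quoted in this article; the variable names are made up for illustration, and the third model is left unnamed because the source does not name it.

```python
# CyberGym scores as quoted in this article (percent).
mythos_preview = 83.1
opus_4_6 = 66.6
older_model = 30.1  # unnamed in the source

# Absolute gaps, in percentage points.
gap_vs_opus = round(mythos_preview - opus_4_6, 1)
gap_vs_older = round(mythos_preview - older_model, 1)

# Relative improvement over Claude Opus 4.6, in percent.
relative_gain = round((mythos_preview - opus_4_6) / opus_4_6 * 100, 1)

print(gap_vs_opus)    # 16.5
print(gap_vs_older)   # 53.0
print(relative_gain)  # 24.8
```

In other words, the 16.5-point gap is roughly a 25% relative improvement over the previous top score.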

This is not a niche problem. This affects every business that relies on software, which is, frankly, every business. If you’re using cloud services, custom applications, or even just standard office software, you are exposed.

The catch is, you can’t just download Mythos Preview and start scanning your own systems. Anthropic is taking a deliberate approach, likely because the potential for misuse is enormous. This is where the strategic partnerships come in. They are working with major tech players to integrate this capability into defensive security platforms.

Your strategy needs to evolve. It needs to include:

Proactive Discovery: Don’t wait for vendors to patch. Explore how advanced AI can be integrated into your security operations to find vulnerabilities within your own infrastructure and software supply chain. This might mean investing in specialized AI security tools or services as they become available.
Human-AI Collaboration: The future isn’t AI replacing security analysts; it’s AI augmenting them. Your team needs to be trained to work with these powerful AI tools, to critically evaluate their findings, and to integrate them into existing incident response processes.
Supply Chain Scrutiny: Understand where your software comes from. If a vendor isn’t transparent about their security practices or doesn’t have a clear plan for integrating advanced AI-driven vulnerability detection, you need to question that relationship.
Investing in Defense: Project Glasswing’s $100 million commitment signals a broader trend. Companies that want to stay ahead will need to invest heavily in defensive cybersecurity technologies, and AI will be at the forefront of that investment.

The landscape of cybersecurity is shifting from a reactive posture to a proactive one, powered by AI. The ability of models like Claude Mythos Preview to find deeply embedded flaws means that the attackers’ toolkit is getting a significant upgrade, and so must yours. This isn’t about fear-mongering; it’s about preparing for a new reality.

Frequently Asked Questions

What is CyberGym? CyberGym is a benchmark designed to test the ability of AI models to identify and reproduce cybersecurity vulnerabilities in software. It’s a challenging environment that mimics real-world coding complexities.

When will Mythos Preview be publicly available? Anthropic has not provided a specific timeline for public release. They are currently focused on integrating its capabilities into defensive cybersecurity platforms through partnerships.

How can my company prepare for AI-driven vulnerability discovery? Start by educating your security teams about AI in cybersecurity. Evaluate your current software supply chain and security tools, and begin exploring vendors or services that are incorporating advanced AI for proactive threat detection.
