Claude Mythos: AI Breaches Enterprise Networks Autonomously

Claude Mythos breaches weak networks, scores 93% practitioner, 73% expert

April 14, 2026 • 2 min read

Enterprise security teams have long worried about AI tools that can navigate poorly defended networks without human guidance. According to recent testing, Claude Mythos, Anthropic’s latest model, can do just that, moving from initial foothold to full compromise in a single autonomous run. The test suite used to gauge its abilities separates practitioner‑level tasks from the expert‑level challenges that only seasoned specialists typically handle.

Scores of about 93 percent on the former and 73 percent on the latter suggest a shift in what automated agents can achieve. The expert‑level figure stands out because, according to the AI Security Institute (AISI), no model could solve those higher‑tier challenges before April 2025. The results also depend on a substantial budget of roughly 50 million tokens, which raises questions about the resources required to replicate them.

As organizations scramble to shore up defenses, these numbers provide a concrete reference point for what an AI‑driven adversary might look like in practice.

With a larger compute budget (50 million tokens), Mythos Preview scores around 93 percent on practitioner tasks and 73 percent on expert-level challenges. That expert-level number is particularly notable: according to AISI, no model could solve expert-level tasks before April 2025.

CTF challenges only test individual skills in isolation, but real cyberattacks require chaining dozens of steps across multiple hosts and network segments, the AISI says. To measure that kind of complexity, the institute developed a simulation called "The Last Ones" (TLO): a 32-step attack against a simulated corporate network, from initial reconnaissance to full network takeover.

Claude Mythos can autonomously compromise weakly defended enterprise networks end-to-end - THE DECODER

Claude Mythos has now demonstrated the ability to run a full 32‑step attack on a simulated corporate network, taking control of the environment in three minutes. The British AI Security Institute reports a 73 percent success rate on expert‑level capture‑the‑flag challenges and about 93 percent on practitioner tasks when the model is given a larger compute budget of 50 million tokens. According to the institute, no model could solve expert‑level tasks before April 2025.

Yet the tests were conducted on a simulated network, not on live enterprise infrastructure. It is unclear how the model would fare against hardened defenses, multi‑factor authentication, or real‑time monitoring. The evaluation also leaves open questions about the scalability of the approach beyond the preview version.

While the results suggest that AI can automate parts of a cyber‑attack chain, the practical implications for security teams remain to be clarified. Further independent testing will be needed to gauge the true risk posed by such capabilities.

Common Questions Answered

How does Claude Mythos perform on cybersecurity challenges across different skill levels?

Claude Mythos scores about 93% on practitioner-level tasks and 73% on expert-level cybersecurity challenges. The expert-level score is the more significant of the two: the British AI Security Institute notes that no model could solve expert-level tasks before April 2025.

What specific network penetration capabilities has Claude Mythos demonstrated?

Claude Mythos executed a full 32-step attack on a simulated corporate environment in three minutes. The model chained the steps together on its own, moving from an initial network foothold to complete compromise without human intervention. The test ran against a simulated network, not live enterprise infrastructure.

What compute resources are required for Claude Mythos to achieve its high performance?

Claude Mythos reaches these scores with a larger compute budget of 50 million tokens. With that budget, it scores about 93% on practitioner tasks and 73% on expert-level security tasks.
