AI achieves new benchmark for autonomous cyber capability.

Two leading artificial intelligence models, Anthropic’s Claude Mythos Preview and OpenAI’s GPT-5.5, have shown a jump in autonomous cybersecurity capability that outpaces the doubling trend previously observed in the length of cyber tasks AI can complete, according to new findings.

The United Kingdom’s AI Security Institute (AISI) and cybersecurity firm Palo Alto Networks released separate reports Wednesday detailing these advancements. The AISI’s evaluation indicates that these frontier models have exceeded its previous projections for the length of cyber tasks they can complete autonomously, suggesting a potential shift in the trajectory of AI development in this domain.

Rapid Advancement in Autonomous Cybersecurity

The AISI has been tracking how well AI models complete cybersecurity tasks autonomously. Previously, the length of cyber tasks these models could reliably complete was estimated to be doubling every five months, itself a substantial acceleration from an earlier estimate of eight months. The latest assessments of Claude Mythos Preview and GPT-5.5, however, show performance that significantly exceeds both trends.

“Frontier AI’s autonomous cyber and software capability is advancing quickly: the length of cyber tasks that frontier models can complete autonomously has doubled on the order of months, not years,” the AISI stated in its findings. This statement underscores the profound speed at which AI systems are evolving within the cybersecurity landscape.
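To make the doubling figures concrete, the short sketch below converts a doubling period into a yearly growth factor. It assumes clean exponential growth, which is a simplification for illustration and not the AISI’s own methodology; the function name `growth_factor` is hypothetical.

```python
def growth_factor(months: float, doubling_months: float) -> float:
    """Multiplicative growth in task length after `months`,
    assuming (for illustration only) steady exponential doubling."""
    return 2 ** (months / doubling_months)


if __name__ == "__main__":
    # Doubling periods cited in the article: the earlier eight-month
    # estimate and the more recent five-month estimate.
    for doubling in (8, 5):
        factor = growth_factor(12, doubling)
        print(f"{doubling}-month doubling: about {factor:.1f}x per year")
```

Under this assumption, an eight-month doubling period corresponds to roughly 2.8x growth in task length per year, while a five-month period corresponds to roughly 5.3x.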

Key Performance Milestones Achieved

The AISI’s structured simulations, conducted in advanced cyber ranges, provided clear evidence of the recent capability jump. Claude Mythos Preview became the first AI model to successfully complete both of the institute’s challenging multi-stage attack simulations. These simulations mimic intricate attacks against networked enterprise systems. Specifically, the model solved “The Last Ones,” a 32-step corporate network attack, in 6 out of 10 attempts.

Claude Mythos Preview also succeeded in 3 out of 10 attempts on “Cooling Tower,” a simulation that no previously tested AI model had solved. GPT-5.5 solved “The Last Ones” in 3 out of 10 attempts, indicating that the advance is not limited to a single model.

Palo Alto Networks Corroborates Findings

Palo Alto Networks’ independent testing has corroborated the AISI’s conclusions about the growing proficiency of advanced AI models. The company, which has been testing Claude Mythos since April as part of Anthropic’s Project Glasswing, also evaluated Claude Opus 4.7 and OpenAI’s GPT-5.5-Cyber. The GPT-5.5-Cyber testing was conducted as part of OpenAI’s Trusted Access for Cyber program.

“The latest models are extraordinarily capable at finding vulnerabilities and changing them into critical exploit paths in near-real-time,” Palo Alto Networks reported. This capability has led to the identification of a significant number of security issues. Palo Alto Networks released advisories for 26 CVEs, detailing 75 unique issues discovered through AI model scanning across over 130 products, a volume far exceeding typical monthly findings.

Implications for Cybersecurity Enterprises

The rapid progress in autonomous AI for cybersecurity presents both opportunities and challenges for enterprises. Palo Alto Networks has outlined four immediate strategic priorities for businesses. The first is to proactively identify and remediate vulnerabilities in code and applications before they can be exploited by malicious actors. The second involves reducing the attack surface by leveraging AI to detect security misconfigurations.

The third is to deploy robust detection and response tools across all systems, using machine learning for real-time threat identification. The fourth is to cut security operations response times to minutes, in anticipation of AI-powered attacks unfolding at unprecedented speed. Together, the priorities reflect the need for organizations to adapt their security postures quickly.

Future Outlook and Uncertainties

The AISI acknowledges the limitations of its current data, noting that the evaluations are based on a limited number of models and that the most challenging tasks have fewer human comparison data points. Despite these caveats, the institute maintains that the overall trend of rapid growth remains consistent across various models, methodologies, and independent data sources. Separate research from METR, a nonprofit that monitors AI’s handling of software tasks, has reported a similar trend, indicating a doubling time of approximately four months.

The AISI is actively developing more rigorous evaluations, including new cyber ranges and the incorporation of active cyber defenses, to better reflect the evolving complexities of real-world cybersecurity threats. The next key step will be the AISI’s continued monitoring and reporting on further AI advancements, with its next comprehensive update anticipated in the coming months. The broader implications for national security and critical infrastructure defenses remain a significant area of ongoing observation.
