Corporate policyGlobal2026-04-19

Anthropic Restricts Deployment of Claude Mythos Preview Citing Offensive Cybersecurity Capabilities

Anthropic

Anthropic has announced Claude Mythos Preview, a frontier general-purpose language model that demonstrates advanced performance on cybersecurity tasks including vulnerability discovery and exploitation. The model's capabilities prompted Anthropic to restrict its deployment and launch Project Glasswing, an initiative aimed at securing critical software in anticipation of advanced AI-enabled threats. Due to benchmark saturation, Anthropic supplemented standard evaluations with novel real-world security tasks to assess the model's risk profile before any broader release. The decision to limit availability based on cybersecurity risk signals a maturing approach to capability-based deployment controls at the frontier model level. For enterprise compliance and risk teams, the case illustrates how AI developers may unilaterally restrict access to high-capability models and establishes a precedent for treating offensive security capability as a gating criterion for deployment. Organizations relying on Anthropic models or building AI-enabled security tooling should monitor access policy changes and assess supply chain dependencies accordingly.

cybersecurityfrontier-modelsdeployment-restrictionsred-teamingAI-risk-management

Source →

Related directory entries

Bletchley Declaration on AI Safety →NIST AI 600-1 Generative AI Profile →Singapore Consensus on Global AI Safety Research Priorities →