Anthropic withholds powerful new AI model over cyber risks

April 8, 2026

Anthropic withholds powerful new AI model over cyber risks.

Anthropic announced Project Glasswing on Tuesday, a defensive effort to secure critical software using its unreleased frontier model, Claude Mythos Preview.

The company said the model identified thousands of zero-day vulnerabilities in major operating systems, web browsers, and other foundational code. Many flaws had gone undetected for years, including a 27-year-old bug in OpenBSD.

Examples cited include finding the OpenBSD issue for around $50 and building full remote code execution exploits on systems like FreeBSD for under $1,000. Anthropic noted these tasks previously demanded weeks of work by top human experts. Engineers without security backgrounds reportedly obtained working exploits overnight by directing the model.

Anthropic formed Project Glasswing with partners including Amazon, Apple, Google, Microsoft, NVIDIA, Broadcom, Cisco, CrowdStrike, JPMorganChase, Palo Alto Networks, and the Linux Foundation. The group committed up to $100 million in model usage credits, plus donations to open-source security projects.

The initiative focuses on using the model to scan and patch vulnerabilities before wider access to similar AI capabilities spreads. Anthropic stated it has no plans for general public release of Mythos Preview at this time, citing the need for stronger safeguards against misuse.

The company described the model as a general-purpose system with strong agentic coding and reasoning skills that emerged during training. It has shared some vulnerability details with affected maintainers, and patches for tested cases are already in place.

This move highlights growing concerns that advancing AI could shift the balance in cybersecurity, enabling faster discovery of flaws by both defenders and potential attackers. Project Glasswing aims to give key infrastructure a head start on repairs.

LEAVE A REPLY Cancel reply