Project Glasswing Goes Wide: 150 More Organizations, 10,000+ Flaws, and the AI Security Audit the World Depends On

Table of Contents

When Anthropic launched Project Glasswing in April, the program looked like a controlled experiment. Nine organizations — AWS, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorgan Chase, Microsoft, NVIDIA — received restricted access to Claude Mythos to hunt vulnerabilities in their own infrastructure. The framing was cautious: a narrow pilot with a short list of companies large enough to absorb and act on what Mythos found.

On June 2, Anthropic announced the program is scaling. 150 additional organizations across 15+ countries now have access. New sectors added: power generation and distribution, municipal water systems, healthcare networks, communications infrastructure, and semiconductor hardware. The criterion for admission is structural: these are operators of systems where “a major attack could affect more than 100 million people.”

The numbers have moved accordingly. Glasswing partners have now collectively identified 10,000+ high and critical-severity security vulnerabilities using Mythos — roughly triple the initial “thousands of zero-days” figure from April. Responsible disclosure pipelines are running at a scale the security industry has never processed before.

What 10,000 Flaws Actually Means
#

The security industry’s CVE system published approximately 40,000 vulnerability disclosures in all of 2024, across every reported vendor and researcher. Glasswing partners, operating under NDA with a single AI model, have surfaced 10,000+ high-severity findings in two months. Across nine organizations in the original cohort. At an average severity that, by Anthropic’s own characterization, skews heavily toward critical infrastructure exposure.

That ratio is hard to interpret without knowing the base rate — how many of these were previously known, how many are genuinely novel, how many will produce public CVEs with Glasswing attribution. But the scale suggests something structural: Mythos is not doing what a skilled human security researcher does. It is doing what a skilled security research team of 500 people would do, running continuously, without context resets.

The 27-year-old OpenBSD bug found in April was the signature example of what that looks like. Long-lived codebases have had their attack surface analyzed continuously for years; Mythos found something that had survived that analysis because it required holding a broad architectural context while recognizing a narrow class of exploitation condition. That is the kind of work that accumulates into a 10,000-flaw dataset across just nine organizations.

With 150 new organizations across power, water, and healthcare — domains with significantly older and less-audited codebases than the original tech-company cohort — the pace of discovery is likely to accelerate, not slow.

Who the New Partners Are
#

Anthropic has not published the full list. The sector expansion tells you more than company names would. Power grid operators and water utilities typically run SCADA systems and industrial control equipment that is decades old, rarely patched, and increasingly internet-adjacent. Healthcare networks run electronic health record systems that aggregate protected data at scale. Communications infrastructure — the physical backbone of internet routing — has known vulnerabilities that go unpatched because the operational disruption of patching exceeds the perceived risk.

These systems share a property: they are high-consequence but chronically under-resourced for security. Most water utilities don’t have a red team. Power grid operators have compliance-driven security programs, not offensive research programs. The patch cycles are measured in years.

Glasswing gives these organizations access to a capability that, until April 2026, didn’t exist at any price: a fully autonomous system that can audit an entire attack surface at speed and produce actionable findings with remediation guidance. The gap between what Mythos finds and what these organizations could find on their own is not measured in months of staffing. It is measured in capabilities that don’t exist in their budgets.

The EU ENISA Addition
#

One day before the 150-org expansion, Anthropic announced Glasswing access for ENISA — the EU Agency for Cybersecurity. ENISA is the coordinating body for cybersecurity across the European Union’s 27 member states. Its inclusion signals that Anthropic is treating Glasswing as a policy instrument as well as a product.

That framing matters. A private company handing a restricted model to a governmental cybersecurity regulator is an unusual move in an industry that typically resists regulatory access to proprietary systems. For Anthropic, ENISA access is likely both a genuine commitment to collaborative defense and a strategic bet: being seen as a willing partner to regulators before the EU AI Act’s enforcement phase adds up, especially for a company that just filed an S-1.

The Two-Tier Architecture
#

An important distinction: Mythos under Project Glasswing and Claude Security (the commercial product) are not the same thing.

Claude Security, launched in public beta in May and powered by Claude Opus 4.8, is what enterprises can actually buy. It provides reasoning-based vulnerability scanning — not pattern-matching against CVE databases, but genuine contextual analysis of whether a code path is exploitable. The launch partners include CrowdStrike, Palo Alto Networks, SentinelOne, Wiz, and TrendAI. Pricing is usage-based; integration is via API or direct agent workflow. Covered in detail when it launched.

Mythos is the model above Opus 4.8. Access requires a bilateral arrangement with Anthropic. There is no sign-up form, no pricing page, no API endpoint. The qualification criterion — that your infrastructure, if compromised, could affect more than 100 million people — is not something most enterprises meet.

That two-tier structure is deliberate. Claude Security is the product that any organization with a security team can use. Glasswing is the program for when the stakes of a breach are catastrophic at a societal scale. The architecture matches the risk profile.

When Does Mythos Become Available for Everyone Else?
#

The honest answer remains: not soon.

Anthropic’s “coming weeks” language about Mythos GA from the May 28 Opus 4.8 announcement described general availability for all customers. That framing is still active. But the expansion of Glasswing suggests that Mythos’s primary near-term deployment path is expansion of the controlled-access program, not a broad API rollout.

The pattern at Anthropic has been: restricted access → research preview → general availability, with the cadence tied to safety confidence rather than commercial demand. Computer Use went through a multi-month preview before GA. Claude Security launched as a public beta before GA. Glasswing’s expansion from 9 to 150+ organizations is the “expand restricted access” step on that ladder.

What the 10,000-flaw dataset produces in terms of safety confidence is harder to predict. Finding thousands of critical vulnerabilities in friendly infrastructure, under controlled conditions, with responsible disclosure pipelines, is actually the ideal environment for testing whether Mythos produces findings that are accurate, actionable, and not prone to hallucinated exploitation paths. That is the safety work. When Anthropic is confident in that quality bar at scale, the next step becomes plausible.

For Claude Code users waiting for Mythos capabilities in their agentic coding workflows: the coding benchmarks still haven’t been officially published. The April announcement focused entirely on security capability. The Opus 4.8 release notes referenced a “coming weeks” GA window. The June 2 Glasswing expansion reinforces that Anthropic is managing the release carefully. July or August GA for general customers remains a plausible read. Announce-day coverage is warranted.

The Template Question
#

Project Glasswing is an answer to a question the AI industry had not yet decided how to pose: what do you do when a model is capable enough to be genuinely dangerous at scale, and you’ve already built it?

Anthropic’s answer was to deploy it defensively, under restriction, to the organizations whose security most affects everyone else. Two months in, the program has found more vulnerabilities in critical infrastructure than the entire CVE ecosystem typically processes in a quarter. Expanding to 150 new organizations means the experiment has moved into generalization — the question is no longer whether Mythos works on a curated set of tech giants, but whether the model holds up across the full breadth of critical infrastructure security posture.

Other labs will build comparable capabilities. When they do, Project Glasswing will be the reference architecture for how to deploy them.

Sources: Bloomberg — Project Glasswing expansion · CNBC — Anthropic Mythos · Anthropic — Expanding Project Glasswing · Prior coverage: Project Glasswing launch · Claude Security launch

What 10,000 Flaws Actually Means#

Who the New Partners Are#

The EU ENISA Addition#

The Two-Tier Architecture#

When Does Mythos Become Available for Everyone Else?#

The Template Question#

Related