A large-scale, AI-driven security initiative has produced results that have startled the security community. In Project Glasswing, artificial intelligence was used to analyze critical, widely used software, and the scope of the findings is remarkable.

More Than 10,000 Serious Vulnerabilities Exposed

According to Digi.no, the project has identified more than 10,000 vulnerabilities rated high or critical in severity across popular software, including major operating systems. The project used Anthropic's AI model Claude Mythos Preview as the core engine of its analysis.

Cloudflare, one of the project's partners, independently reported finding around 2,000 flaws — of which 400 were rated high- or critical-severity. The company also claimed that its false-positive rate was lower than what human testers typically achieve.

10,000+
High- and critical-severity vulnerabilities found
90.6%
Share of reviewed findings confirmed as genuine

Of the 1,752 high- and critical-rated vulnerabilities that were thoroughly reviewed, 1,587 — equivalent to 90.6 percent — were confirmed as genuine security issues rather than false alarms. That level of precision surpasses many traditional tools and processes.
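
The percentage follows directly from the two counts given above. As a quick check, here is a minimal Python sketch of the arithmetic; the variable names are illustrative and do not come from the project itself.

```python
# Figures as reported in the article for the thoroughly reviewed findings.
reviewed = 1752    # high- and critical-rated findings that were reviewed
confirmed = 1587   # of those, confirmed as genuine security issues

precision = confirmed / reviewed      # share of reviewed findings that were real
unconfirmed = reviewed - confirmed    # reviewed findings that were not confirmed

print(f"precision: {precision:.1%}")
print(f"not confirmed: {unconfirmed} ({unconfirmed / reviewed:.1%})")
```

In other words, roughly one in ten of the reviewed findings did not hold up under review.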


AI Outperforms Nine Out of Ten Human Testers

The findings from Project Glasswing are not an isolated case. The ARTEMIS research study, which pitted ten OSCP-certified human penetration testers against six commercial AI agents, found that the AI systems outperformed nine out of ten humans in the overall ranking. AI identified nine valid vulnerabilities with a success rate of 82 percent, and excelled in particular at large-scale network mapping and parallel scanning.

DARPA has also conducted comparable tests through its Artificial Intelligence Cyber Challenge, where AI systems completed security assessments in a matter of hours — tasks that previously required multiple specialists and up to six months.

AI can accomplish in hours what previously took an entire team six months

Human Expertise Remains Irreplaceable

Despite the impressive numbers, the results do not make the case for replacing security experts with algorithms. According to the research review, human testers are far better at uncovering complex, multi-stage attacks and so-called business logic flaws: errors that arise from the interplay between systems and usage patterns, not from the code in isolation.

Analyst firm Gartner estimates that by 2027, AI agents will halve the time it takes to exploit exposed account credentials. The forecast illustrates that the technology is evolving rapidly in both directions: as a defense tool and as a potential threat in the hands of attackers.

The Hybrid Model Emerges as the Clear Path Forward

The prevailing expert consensus, reflected in the research review underpinning this article, is that neither AI nor humans alone are sufficient for modern cybersecurity. AI handles volume and speed; human experts contribute contextual understanding and creative problem-solving. The results from Project Glasswing underscore this point: even with a precision rate above 90 percent, the remaining cases still require human judgment for accurate assessment.

For organizations that depend on the affected software, the message from Digi.no is unambiguous: the scale of undiscovered security vulnerabilities in widely used systems is far greater than most have assumed — and AI has now demonstrated that it can find them at a pace no human team can match alone.