AI Writes Code, Finds Hidden Flaws

by John Digweed · 3 hours ago · 6 mins read

AI Cracks Code, Uncovers Deep Security Flaws

A new artificial intelligence model named Claude Mythos, created by Anthropic, has shown an alarming ability to find and exploit software weaknesses. This powerful AI can discover security holes that have remained hidden for years, even in highly secure systems. The discovery has sent ripples through the tech world, raising questions about the future of cybersecurity.

Anthropic stated that Mythos is a “frontier model” capable of surpassing most human experts in finding and exploiting software bugs. In tests, Mythos Preview found thousands of serious flaws, including some in every major operating system and web browser. It even found a 27-year-old vulnerability in OpenBSD, a system known for its strong security, and a 16-year-old flaw in FFmpeg, a widely used video tool. The AI also identified multiple vulnerabilities in the Linux kernel, which powers most of the world’s servers.

Mythos: A Double-Edged Sword

The capabilities of Mythos are so advanced that Anthropic has decided not to release it to the public. The company fears that malicious actors could use the model to launch widespread cyberattacks. Anthropic noted that these cybersecurity talents emerged as a side effect of coding skills that were developed to be generally powerful. This means that as AI gets better at writing code, it also becomes better at breaking it.

To address this, Anthropic has launched Project Glasswing. This initiative gives specialized cybersecurity experts at a select group of companies early access to Mythos. The goal is to allow these experts to find and fix vulnerabilities in their own software before more powerful AI models become widely available. Anthropic believes that models just as capable, or even more so, are on the horizon, and that proactive defense is crucial.

A History of Powerful, Restricted AI

This situation echoes past AI developments. In 2019, OpenAI withheld its powerful GPT-2 model due to concerns about its ability to generate fake news and propaganda. At the time, headlines warned of a “robot apocalypse” and AI that was “too scary” to release. More recently, a Google engineer claimed a chatbot had become sentient, sparking public debate. These past events sometimes lead to skepticism when companies announce highly powerful, restricted AI models.

Some experts suggest that these announcements can also serve a marketing purpose, building hype and positioning companies as leaders in AI. However, the creators of Mythos seem genuinely concerned about the potential misuse of their model. By working with cybersecurity teams at major tech firms, Anthropic aims to strengthen defenses before widespread AI exploitation becomes a reality.

Meta and Z.ai Release New AI Models

Beyond the security concerns, the past week also saw the release of new, accessible AI models. Meta introduced Muse Spark, the first significant model from its Superintelligence Labs. While not open-source like Meta’s previous Llama models, Muse Spark shows strong performance in certain areas. It particularly excels in understanding figures and data, outperforming models like GPT-4.5 and Gemini 3.1 in this regard.

However, Muse Spark is not a top performer in coding tasks, scoring below models like Opus and Gemini on coding benchmarks. It also performs moderately on multimodal understanding. Despite these limitations, Muse Spark represents a significant step for Meta’s new AI division, placing it among the top AI contenders. The model is also noted for its efficiency, suggesting it could be less expensive to run than other high-performance models.

GLM 5.1: An Open-Source Powerhouse

Perhaps more exciting for developers is the release of GLM 5.1 by Z.ai. This model is open-source, meaning its code and weights are freely available under an MIT license. GLM 5.1 achieves state-of-the-art performance in software engineering tasks, even surpassing GPT-4.5 and Claude Opus 4.6 on some coding benchmarks like SWE-Bench Pro. It also performs well in mathematical tasks, placing it near the top alongside Gemini 3.1 and GPT-4.5.

The availability of GLM 5.1 allows developers to download, modify, and run the model locally. This open approach fosters innovation and allows for custom AI applications. The rapid progress of open-source models to match proprietary ones is a notable trend in the AI field.

Google Enhances Gemini with New Features

Google has also updated its Gemini AI. The Gemini app now features interactive simulations and models, similar to features introduced by OpenAI and Anthropic. Users can create dynamic visualizations with adjustable sliders that change in real-time. For example, a simulation of compound interest can be adjusted to show how different rates and timeframes affect the outcome.
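To make the compound interest example concrete, here is a minimal sketch of the arithmetic such a simulation would recompute each time a slider moves. It uses the standard formula A = P(1 + r/n)^(nt) and is not Gemini's actual code; the function names, the starting balance, and the rate and timeframe values are all illustrative assumptions.

```python
from typing import Dict, Iterable, Tuple


def compound_balance(principal: float, annual_rate: float,
                     years: int, periods_per_year: int = 12) -> float:
    """Balance after compounding: P * (1 + r/n) ** (n * t)."""
    growth_per_period = 1 + annual_rate / periods_per_year
    return principal * growth_per_period ** (periods_per_year * years)


def balance_table(principal: float, rates: Iterable[float],
                  horizons: Iterable[int]) -> Dict[Tuple[float, int], float]:
    """Evaluate every (rate, years) pair, like sweeping two sliders."""
    horizons = list(horizons)
    return {
        (rate, years): compound_balance(principal, rate, years)
        for rate in rates
        for years in horizons
    }


if __name__ == "__main__":
    # Hypothetical inputs: a 1,000 starting balance, three rates, three timeframes.
    table = balance_table(1000.0, [0.03, 0.05, 0.07], [10, 20, 30])
    for (rate, years), balance in table.items():
        print(f"rate={rate:.0%} years={years:>2} balance={balance:,.2f}")
```

Running the script prints one line per rate and timeframe combination, which shows how longer horizons magnify the gap between rates.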

Additionally, Gemini has introduced a “notebooks” feature. This allows users to organize conversations, files, and custom instructions in dedicated spaces, similar to projects in other AI platforms. These notebooks can sync with NotebookLM, offering more advanced research and organization tools. This feature is currently rolling out to paid Gemini users.

AI Video and Avatar Tools Advance

The AI video generation tool Seedance has finally become available in the U.S. through platforms like Runway and ByteDance’s CapCut app. While some of its most viral features, like generating celebrity likenesses or trademarked content, have been restricted, Seedance 2.0 remains a powerful video model. It can generate complex scenes quickly and efficiently, offering a strong alternative as tools like OpenAI’s Sora become less accessible.

HeyGen has also launched its Avatar 5 model, which can create a digital avatar from a 15-second video clip. While voice and lip-syncing still need improvement, the technology allows for realistic avatars that can be used in various content. Users can even choose different visuals for their avatar, such as different clothing or backgrounds, and generate personalized videos by typing in scripts.

Other AI Updates

OpenAI has introduced a new pricing tier for its services at $100 per month, offering increased usage for its Codex model. This tier sits between its existing Plus and Enterprise plans, providing more capacity for demanding tasks.

Anthropic has launched a “managed agents” feature for Claude. This allows users to connect AI agents to tools like Notion, ClickUp, and Slack, enabling automated workflows based on actions within those platforms. However, Anthropic also announced that Claude subscriptions will no longer cover usage on third-party tools like OpenClaw, leading to frustration among users who relied on this integration.

Perplexity, an AI-powered search engine, now allows users to connect their financial accounts through Plaid. This integration provides read-only access to financial data, enabling users to track spending, monitor loans, and get a consolidated view of their net worth.

Finally, Factory AI has released a desktop application, making its AI tools more accessible beyond the command line interface.

Source: AI News: The Scariest AI Model Ever! (YouTube)
