AI Progress Accelerates: New Models Shatter Benchmarks

by John Digweed

AI’s Breakneck Pace: A Chart Rewrites the Future of Work

A striking chart, originally published by the nonprofit organization METR (Model Evaluation and Threat Research), is prompting both excitement and apprehension in the AI community. It visualizes the rapid advancement of AI agents by measuring how long, in human expert time, the tasks they can complete are. The latest data, particularly from Anthropic’s Claude Opus 4.6, suggests an acceleration in AI capabilities that is outpacing even recent, aggressive predictions.

Decoding the “Scary” Chart: What It Actually Measures

The core of the METR chart is not how long an AI takes to complete a task, but how much human labor it replaces. Human experts are given a range of tasks across fields like engineering, coding, cybersecurity, and machine learning, and the time they take to complete them is recorded. The AI’s performance is then measured against this human baseline. The chart typically displays two metrics: the human task length at which the AI succeeds 50% of the time, and the shorter task length at which it succeeds 80% of the time. A higher number on the Y-axis means the AI can handle tasks that take humans longer, effectively replacing more human hours.
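As a rough illustration of how a 50% or 80% figure can be read off such data, the sketch below assumes that an agent’s success probability follows a logistic curve in the logarithm of the human task time. The curve parameters (`INTERCEPT`, `SLOPE`) and the function names are hypothetical values chosen for the example, not figures from the chart.

```python
import math


def logit(p: float) -> float:
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def success_probability(human_minutes: float, intercept: float, slope: float) -> float:
    """Assumed model: success falls off logistically as log2(task length) grows."""
    z = intercept - slope * math.log2(human_minutes)
    return 1.0 / (1.0 + math.exp(-z))


def time_horizon(p: float, intercept: float, slope: float) -> float:
    """Human task length (minutes) at which the modeled success rate equals p."""
    return 2.0 ** ((intercept - logit(p)) / slope)


# Hypothetical fitted parameters, for illustration only.
INTERCEPT = 6.0
SLOPE = 0.75

h50 = time_horizon(0.5, INTERCEPT, SLOPE)
h80 = time_horizon(0.8, INTERCEPT, SLOPE)
print(f"50% horizon: {h50:.0f} human-minutes")
print(f"80% horizon: {h80:.0f} human-minutes")
```

Under this assumed model the 80% horizon is always shorter than the 50% horizon, because demanding a higher success rate restricts the agent to easier, shorter tasks.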

Early iterations of this chart showed AI capabilities doubling roughly every seven months. However, the release of models like Claude Opus 4.5 and subsequently Opus 4.6 has dramatically altered this trajectory. Claude Opus 4.5, for instance, could handle tasks that would take a human expert just over five hours (at a 50% success rate). The latest data point from Claude Opus 4.6 pushes this to approximately 14.5 hours, nearly two full workdays, for the same 50% success rate. This is a significant leap: a steady doubling is already exponential growth, and the new data suggests the doubling period itself may be shrinking.

The Accelerating Timeline: From Seven Months to Four

The implications of this accelerated progress are profound. If the trend observed since late 2023 holds, the time it takes for AI capabilities to double has shrunk to approximately 123 days, or about four months, down from the earlier seven-month doubling period. This acceleration has led prominent figures in the AI field to express concerns about global preparedness.
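To make the difference between the two doubling periods concrete, here is a minimal sketch that extrapolates the 14.5-hour figure under each one. It assumes clean exponential growth from a single data point; the function names, the 40-hour target, and the 30.44-day month are illustrative choices, not part of the original analysis.

```python
import math

DAYS_PER_MONTH = 30.44  # average calendar month, used only to convert "seven months"


def project_horizon(current_hours: float, days_ahead: float, doubling_days: float) -> float:
    """Task horizon after days_ahead, assuming it doubles every doubling_days."""
    return current_hours * 2.0 ** (days_ahead / doubling_days)


def days_to_reach(current_hours: float, target_hours: float, doubling_days: float) -> float:
    """Days until the horizon grows from current_hours to target_hours."""
    return doubling_days * math.log2(target_hours / current_hours)


current = 14.5                # hours, the Opus 4.6 figure at a 50% success rate
fast = 123.0                  # days, the shorter doubling period
slow = 7 * DAYS_PER_MONTH     # days, the earlier seven-month doubling period

for label, doubling in (("123-day doubling", fast), ("seven-month doubling", slow)):
    after_year = project_horizon(current, 365, doubling)
    to_work_week = days_to_reach(current, 40.0, doubling)
    print(f"{label}: {after_year:.0f} h horizon after one year, "
          f"40 h horizon in {to_work_week:.0f} days")
```

Under these assumptions, a horizon equal to a 40-hour work week arrives in about 180 days at the 123-day pace, compared with about 312 days at the seven-month pace. The projection is only as reliable as the single starting estimate it extrapolates from.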

Industry Leaders Sound the Alarm

Sam Altman, CEO of OpenAI, has voiced his unease, stating, “The world is not prepared.” He anticipates a much faster “takeoff” of highly capable AI models than he and others initially expected, a prospect that is causing stress and anxiety. Others echo this view. The creator of Claude Code, for example, has declared that “coding is solved,” suggesting that traditional methods of learning and practicing coding are becoming obsolete. Altman has similarly stated that his own methods of software development are now “effectively completely irrelevant,” and that writing C++ code by hand is a practice of the past. The implication is that Artificial General Intelligence (AGI) may be closer than anticipated, with superintelligence not far behind, and that the gap between the two could be very short given this rapid development cycle.

Further evidence comes from the fact that models are not only performing tasks but also contributing to their own development. GPT-5.3 Codex was reportedly involved in its own code development during training. Similarly, the creator of Claude Code claims that most, if not all, new additions to Claude Code are authored by the AI models themselves, including Opus 4.6. Elon Musk has declared that we have entered the singularity, and Dario Amodei, CEO of Anthropic, has indicated that we are nearing the “end of the exponential” phase, implying a rapid approach to a new plateau of capabilities.

Real-World Impact: Automation Beyond Imagination

The impact of these advancements is already being felt. The article’s author shares a personal anecdote of using Opus 4.6 to rebuild their website, Natural20.com, an AI-powered news aggregator. The AI completed the deployment, setup, and initial project build in just four hours, a task that would typically take a human expert one to two days. More impressively, the AI not only completed the task but also automated the entire process for ongoing content aggregation and ranking, creating its own SQL database and setting up a system for continuous operation. This demonstrates a key aspect often missed by the charts: AI agents are not just performing one-off tasks but are increasingly capable of automating complex, ongoing processes.

Another compelling example involves an accounting task that the author had been dreading. The author fed financial data into Opus 4.6, and the entire project, which involved balancing payments and invoices and identifying complex relationships between transactions, was completed in 30 to 40 minutes while the author played a video game. This task, which would have cost hundreds of dollars in human expert time and demanded significant focused effort, was not only completed but also automated for future use. The AI understood the author’s custom notations without explanation and created a system for ongoing financial management.

The “Scribe” Analogy: Coding as the New Literacy

The shift in the coding landscape is frequently compared to the invention of the printing press. Before the printing press, only a select few (scribes) could write. After its invention, literacy became widespread, transforming society. Not everyone became a master writer, but the ability to read and write became a fundamental skill. Coding may become similarly accessible. While not everyone will be a master software architect, the ability to use AI agents to create sophisticated software will become a new form of literacy. The distinction will likely shift from “coder” to “great builder,” with the ability to effectively prompt, guide, and iterate with AI agents being the key differentiator.

Caveats and The Road Ahead

Despite the impressive advancements, the data comes with important caveats. The error bars on the METR chart are wide: Opus 4.6’s 14.5-hour estimate has a potential range from 6 to 98 hours. Critics also point out that human perception of task difficulty doesn’t always align with AI capabilities, and that improvements in one domain don’t automatically translate to others. However, researchers are observing crossover effects, where training models on one skill, like coding, can lead to improvements in others, such as math, suggesting a more generalizable intelligence is emerging.

Furthermore, Anthropic’s research indicates that users are increasingly trusting AI agents, allowing them to run for longer autonomous sessions. This growing trust, especially among advanced users who actively guide and interrupt the AI when necessary, suggests a symbiotic relationship is forming between humans and AI. The “Bugatti superbike” analogy is apt: AI models possess immense power, but human users are still learning to harness it effectively.

METR itself predicts that 99% of AI research and development could be automated by 2032, potentially leading to a thousandfold to ten millionfold increase in AI efficiency by 2035. While the exact trajectory and timeline remain subjects of debate, the trend is one of rapid acceleration. The question is no longer *if* AI will change everything, but *when* and *how fast*.

Why This Matters

The implications of this accelerating AI progress are far-reaching. For individuals, it suggests a fundamental shift in the skills valued in the workforce. Tasks that are repetitive, data-intensive, or require significant time for completion are increasingly becoming automatable. This necessitates a focus on uniquely human skills like creativity, critical thinking, complex problem-solving, and emotional intelligence. For businesses, it presents an opportunity for unprecedented productivity gains and the automation of complex workflows. However, it also demands adaptation and strategic planning to integrate AI effectively and ethically. For society, the rapid advancement raises critical questions about economic disruption, the future of work, and the need for robust ethical frameworks and governance to navigate the transformative power of advanced AI.

Source: the SCARIEST chart in AI (YouTube)
