
Google Proposes New AI Test for Human-Level Intelligence

Google DeepMind Introduces Framework to Measure AI Progress

Google DeepMind has unveiled a new approach to gauging how close artificial intelligence systems are to artificial general intelligence (AGI). The method, detailed in a recent paper, aims to settle debates about AI progress by offering a standardized way to test AI capabilities against human performance. Rather than reducing intelligence to a single score, it builds a detailed cognitive profile.

The core of this proposal is a “cognitive taxonomy.” This taxonomy breaks down intelligence into 10 key areas, drawing from decades of research in psychology and neuroscience. These areas are designed to mirror how scientists study the human mind. The goal is to create a comprehensive picture of an AI’s strengths and weaknesses, rather than relying on easily manipulated benchmarks.

Understanding the 10 Cognitive Faculties

The 10 cognitive faculties identified by Google DeepMind are:

  • Perception: The ability to see, hear, and read, going beyond recognizing pixels to genuinely understanding scenes, speech, and text.
  • Generation: The capacity to produce useful outputs like text, speech, or actions.
  • Attention: The skill to focus on important information and ignore distractions, a capability current AI models often lack.
  • Learning: The ability to acquire new knowledge after deployment, similar to how humans learn in real-time.
  • Memory: The capacity to store, retrieve, and, importantly, forget outdated information over time.
  • Reasoning: The power to draw logical conclusions through deduction, induction, and other forms of reasoning.
  • Meta-cognition: Awareness of what the system knows and where its limits lie, helping it recognize uncertainty.
  • Executive Functions: The abilities needed to plan, control impulses, and adapt strategies to achieve goals.
  • Problem-Solving: The skill to combine perception, reasoning, and planning to tackle new, real-world challenges.
  • Social Cognition: The capacity to understand social cues, infer others’ thoughts, and interact appropriately in social settings.

The framework emphasizes what an AI system can achieve, not the specific technology it uses. Whether it’s a transformer or a newer model, the focus is on the results of its cognitive abilities.
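To make the taxonomy concrete, the sketch below shows one way the faculties could be encoded as an evaluation schema. The faculty names come from the framework as described above; the Python structure, field names, and one-line descriptions are illustrative assumptions, not the paper’s actual format.

```python
from dataclasses import dataclass, field

# Hypothetical evaluation schema for the 10-faculty taxonomy. Faculty names
# follow the article; fields and descriptions are illustrative assumptions.
@dataclass
class Faculty:
    name: str
    description: str
    tasks: list[str] = field(default_factory=list)  # held-out, private tasks

TAXONOMY = [
    Faculty("Perception", "Understand scenes, speech, and text"),
    Faculty("Generation", "Produce useful text, speech, or actions"),
    Faculty("Attention", "Focus on relevant input, ignore distractions"),
    Faculty("Learning", "Acquire new knowledge after deployment"),
    Faculty("Memory", "Store, retrieve, and forget information over time"),
    Faculty("Reasoning", "Draw conclusions by deduction and induction"),
    Faculty("Meta-cognition", "Track what the system knows and its limits"),
    Faculty("Executive Functions", "Plan, inhibit impulses, adapt strategies"),
    Faculty("Problem-Solving", "Combine faculties on novel, real-world tasks"),
    Faculty("Social Cognition", "Infer others' mental states and social cues"),
]
```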

A Three-Stage Testing Process

Google DeepMind proposes a three-stage evaluation process to implement this framework:

  1. Cognitive Assessment: AI systems are put through a series of targeted tasks designed to test each of the 10 faculties. These tasks must be kept private to prevent AI models from memorizing answers during training and should be verified by independent parties.
  2. Human Baselines: The exact same tasks are given to a large, representative group of humans with at least a high school education. This provides a real-world range of human performance for comparison.
  3. Cognitive Profiles: The AI’s performance on each task is plotted against the human performance data. This produces a visual “radar chart” that shows at a glance where the AI excels and where it falls short compared to humans (a minimal plotting sketch follows this list).
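Once per-faculty scores exist, producing the stage-3 radar chart is straightforward. Below is a minimal plotting sketch in Python, assuming scores are normalized to the [0, 1] fraction of tasks solved; every number is invented for illustration, and “Executive Functions” is shortened to fit the chart.

```python
import numpy as np
import matplotlib.pyplot as plt

# Per-faculty scores in [0, 1], e.g. the fraction of tasks solved.
# All numbers here are invented purely for illustration.
faculties = ["Perception", "Generation", "Attention", "Learning", "Memory",
             "Reasoning", "Meta-cognition", "Executive", "Problem-Solving",
             "Social Cognition"]
ai_scores = np.array([0.90, 0.95, 0.40, 0.30, 0.50, 0.70, 0.35, 0.45, 0.60, 0.50])
human_median = np.array([0.80, 0.75, 0.70, 0.70, 0.65, 0.60, 0.60, 0.65, 0.60, 0.70])

# Spread the faculties around a circle; repeat the first point to close
# each polygon.
angles = np.linspace(0, 2 * np.pi, len(faculties), endpoint=False)
angles = np.concatenate([angles, angles[:1]])

fig, ax = plt.subplots(subplot_kw={"projection": "polar"})
for scores, label in [(ai_scores, "AI system"), (human_median, "Human median")]:
    values = np.concatenate([scores, scores[:1]])
    ax.plot(angles, values, label=label)
    ax.fill(angles, values, alpha=0.15)
ax.set_xticks(angles[:-1])
ax.set_xticklabels(faculties, fontsize=8)
ax.set_ylim(0, 1)
ax.legend(loc="lower right")
ax.set_title("Cognitive profile (illustrative data)")
plt.show()
```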

This method aims to overcome a major issue with current AI testing: data contamination. When AI models are trained on test data, their high scores reflect memorization, not true understanding or intelligence.
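A common first-line screen for this kind of contamination is checking n-gram overlap between test items and training text. The sketch below illustrates that general idea; it is not the paper’s method, and the n-gram length and overlap threshold are arbitrary assumptions.

```python
def ngrams(text: str, n: int = 8) -> set[tuple[str, ...]]:
    """Return the set of word-level n-grams in a text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def is_contaminated(test_item: str, training_corpus: list[str],
                    n: int = 8, threshold: float = 0.5) -> bool:
    """Flag a test item whose n-grams overlap heavily with training text.

    The n-gram length and threshold are arbitrary illustrative choices,
    not values from the paper.
    """
    test_grams = ngrams(test_item, n)
    if not test_grams:
        return False  # item shorter than n words; nothing to compare
    train_grams = set().union(*(ngrams(doc, n) for doc in training_corpus))
    overlap = len(test_grams & train_grams) / len(test_grams)
    return overlap >= threshold
```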

Why This Matters: Beyond the Benchmark

This new framework addresses a fundamental problem in AI development: the lack of a clear, agreed-upon definition and measurement of AGI. Different AI labs have varying definitions, making it difficult to track progress objectively. Google’s approach provides a concrete way to measure advancement across multiple dimensions of intelligence.

The paper acknowledges limitations, noting that the taxonomy doesn’t measure speed or “system propensities”—whether an AI is risk-averse or reckless, or if it aligns with human values. Creativity is also noted as difficult to measure directly, though the underlying cognitive processes are covered.

A key challenge remains evaluating complex AI systems that use tools or external resources. Google suggests testing the entire system but designing tasks so that tools don’t unfairly inflate performance, comparing it to giving a human a calculator during an IQ test.

Putting the Framework into Practice

Google isn’t just proposing this as theory; it is backing the framework with action. The company has launched a $200,000 Kaggle hackathon, inviting researchers worldwide to help build the framework’s actual evaluation tasks. The competition focuses on areas such as learning, meta-cognition, attention, executive functions, and social cognition, with prize money awarded to the best submissions.

This initiative seeks to move beyond the current “vibes” of AI progress, where claims about AGI timelines are hard to verify. By establishing a standardized cognitive testing method, Google aims to bring scientific rigor to the pursuit of artificial general intelligence, yielding a clearer understanding of AI capabilities and guiding future development responsibly.


Source: Google Just Changed the Definition of AGI (YouTube)
