Back to Blog List

Google breaks through the bottleneck of AI forgetting! "Nested learning" allows models to continuously evolve like the human brain

11/10/2025
Author: Lydia
Category: News
Google breaks through the bottleneck of AI forgetting! "Nested learning" allows models to continuously evolve like the human brain

Recently, Google published a paper titled "Nested Learning: The Illusion of Deep Learning Architectures" at NeurIPS 2025, proposing a novel "nested learning" paradigm that fundamentally rethinks how AI learns. This breakthrough may mark a crucial step towards AI "truly evolving like the brain."

01 Solving the AI ​​"Catastrophic Forgetting" Problem

Current large language models perform impressively, but their knowledge accumulation has a clear ceiling. Their knowledge is either limited to pre-training data or constrained by a limited context window, unable to continuously learn new skills without forgetting old knowledge through neuroplasticity like the human brain.

The most direct solution—continuously updating model parameters with new data—often leads to "catastrophic forgetting," where the model's performance on old tasks deteriorates significantly after learning new tasks.

Traditionally, researchers have mitigated this problem through two approaches: improving model architecture or optimizing training algorithms; however, these two methods have been viewed as separate parts, and this fragmented perspective has hindered the establishment of a unified and efficient learning system.

Google's nested learning paradigm breaks this conventional thinking, viewing model architecture and optimization algorithms as a unified, nested system, addressing the forgetting problem at a more fundamental level.

02 Nested Learning: Enabling AI to Learn at Multiple Scales Like the Human Brain

The core idea of ​​nested learning is quite simple: a complex machine learning model is essentially a set of nested or parallel optimization problems, each with its own context flow and update frequency. This is similar to the human brain's learning mechanism. Our instantaneous memory of immediate information updates extremely quickly, short-term memory for exam preparation updates at a slightly slower pace, while long-term knowledge that constitutes our worldview updates very slowly. Nested learning applies this insight to AI, introducing the key concept: update frequency. Each component in the model, whether it's the weight parameters or the momentum term in the optimizer, has its own unique update frequency. The slowest-updating component is responsible for accumulating long-term, stable, and abstract knowledge into the parameters, forming the model's "worldview." This multi-rate learning system allows the model to flexibly absorb new information without interfering with core knowledge.

03 Two Major Technological Innovations: Deep Optimizer and Continuous Memory System

Based on the nested learning paradigm, the Google research team proposed two core technological improvements, providing a practical path for theory.

  • Deep Optimizer: Enabling Learning Capabilities in the Optimization Process

Nested learning treats the optimizer itself as a learnable "associative memory module." Traditional optimizers rely on simple dot product similarity, failing to consider the complex relationships between different data samples. By adjusting the optimization objective to a more standardized loss metric, a new momentum formula can be derived, making the optimizer more robust to noisy data. This "deep optimizer" can more intelligently guide the entire model's learning process.

  • Continuous Memory System: Achieving Dynamic Memory Management

In the traditional Transformer, the sequence model acts as short-term memory, preserving immediate context; the feedforward network acts as long-term memory, storing pre-trained knowledge. Nested learning extends this concept to a continuous memory system, where memory is viewed as a spectrum composed of a series of modules, each updated at a specific frequency, creating a richer and more efficient memory system for continuous learning. This design enables the model to dynamically allocate and manage knowledge across memory modules at different time scales based on the importance and stability of information, fundamentally solving the problem of catastrophic forgetting.

04 Hope Validation Model: Significantly Outperforming Existing Architectures

To validate nested learning theory, Google developed the Hope validation model—a self-modifying recurrent network based on the Titans architecture, integrating a continuum memory system.

Experimental results show that Hope exhibits lower perplexity and higher accuracy in language modeling and commonsense reasoning tasks, outperforming modern recurrent models and the standard Transformer.

Especially in the long-context "needle-in-a-haystack" task, Hope demonstrates superior memory management capabilities, proving that CMS provides a more efficient method for processing extended information sequences.

These results not only validate the effectiveness of the nested learning paradigm but also demonstrate its enormous potential in handling complex tasks.

05 Implications of Nested Learning for AI Development

The continuous learning capability brought by nested learning enables AI systems to quickly adapt to the localized needs of different markets without forgetting core knowledge. For example, an AI customer service system that performs well in the European and American markets can quickly learn the more polite and conversational communication styles of the East Asian market through continuous learning.

From a technical perspective, the nested learning paradigm aligns perfectly with the "scenario-centric" development path of domestic AI companies. Whether it's the overseas returnees in Guizhou's big data industry attempting to combine international technology with local needs, or the development trend of AI companies in various fields delving into vertical industries, nested learning provides strong technical support.

For domestic AI hardware companies, the efficient and continuous learning capabilities under the nested learning framework help better balance global standards and localized needs when exporting products, creating truly globally competitive products in areas such as smart homes and consumer electronics.

From academic research to industrial application, nested learning may take several years. However, it is certain that AI is transforming from a "static knowledge base" that can only passively reflect the statistical patterns of training data into an organism capable of continuous self-improvement and adaptation to the environment.

Google's breakthrough in nested learning is not only a technological milestone but also a significant step towards the development of AI towards general artificial intelligence. As this paradigm matures, application areas requiring lifelong learning, such as autonomous driving, personalized medical assistants, and adaptive education systems, will usher in entirely new possibilities.

References:

https://openreview.net/forum?id=nbMeRvNb7A;

https://research.google/blog/introducing-nested-learning-a-new-ml-paradigm-for-continual-learning/

Share this article

Leave your comment

  • No comments yet.
Ad
Ad not loaded or not displayed

Recommended AI Tools

Carefully selected AI tools to improve your work, study, and live efficiency.

Virtual Try On

AI-powered virtual try-on for clothes, hairstyles, and accessories.

SPONSORED
Image to Image AI

AI-powered image transformation for professional creative workflows.

SPONSORED
 Lipsync Studio

Transform your videos with advanced lip sync technology.

61.2K
SPONSORED
OpenArt

OpenArt is a versatile AI image and video generator.

SPONSORED
Grayscale Image

Grayscale Image is a free online tool for converting color photos to black and white with professional controls.

SPONSORED
Circle Crop Image

Circle Crop Image is a free online tool for creating round images.

SPONSORED
SAM TTS

Experience the nostalgic Microsoft SAM voice from Windows XP in your browser.

23.2K
SPONSORED

Related Articles

Kimi Linear emerges: revolutionizing the attention architecture of Transformer, boosting long text processing efficiency by 6 times.
News
10/31/2025
Kimi Linear emerges: revolutionizing the attention architecture of Transformer, boosting long text processing efficiency by 6 times.
Author: Kimi Lv

A major breakthrough has been achieved in the core architecture of large-scale models! The release of Kimi Linear marks the first time that linear attention technology has comprehensively surpassed and significantly outperformed the traditional Transformer full-attention model in both performance and efficiency. This "win-win" achievement is expected to significantly reduce the computational barriers and costs for long text processing, complex reasoning, and AI agent applications, potentially changing the competitive landscape of underlying technologies for large-scale models.

In-depth analysis of OpenAI Polaris Alpha technology: A key sequel to the GPT-5.1 leak incident
News
11/12/2025
In-depth analysis of OpenAI Polaris Alpha technology: A key sequel to the GPT-5.1 leak incident
Author: Lydia

Over the past week, the AI ​​community's attention has been drawn to a mysterious model that quietly emerged on the OpenRouter platform—Polaris Alpha. As a direct continuation of yesterday's discussion of the GPT-5.1 leak, this suddenly appearing model brings more technical details and strategic signals worthy of in-depth exploration.

Grokipedia - xAI Launches New AI Knowledge Platform to Challenge Traditional Encyclopedias with AI Revolution
AI
10/28/2025
Grokipedia - xAI Launches New AI Knowledge Platform to Challenge Traditional Encyclopedias with AI Revolution
Author: Lucas

A new paradigm in knowledge acquisition has arrived, this time powered by AI.

2025, looking at the evolution of artificial intelligence
AI
4/24/2025
2025, looking at the evolution of artificial intelligence
Author: Q Yang

Standing at this moment in 2025, when we look back at the development journey of artificial intelligence, we witness how this revolutionary technology has reshaped every aspect of human society. From initial theoretical concepts to today's practical applications, each step forward in AI technology has changed the way we live. Let's revisit this fascinating journey together.

Most Popular AI Tools

Pollo AI

Pollo AI is a versatile AI image and video generator.

FLUX API - PiAPI
5% offCode:AIWITHME

FLUX API by PiAPI offers advanced image generation capabilities.

Typeless

Speak naturally, and Typeless will turn your words into polished messages, emails, and documents that read like you carefully typed them.

627.7K
Midjourney API by PiAPI
5% offCode:AIWITHME

Transform text into stunning images with Midjourney API.

Klap
30% offCode:AIWITHME

Klap transforms long videos into engaging shorts effortlessly.

458.4K
Magic Patterns

Magic Patterns is an AI design tool for product teams.

Base44

Base44 is an AI-powered platform for building fully-functional apps with no code required.

105.8K
LogoAi
30% offCode:aiwithme

Create a stunning logo effortlessly with LogoAi.