Forget Less, Retain More: A Lightweight Regularizer for Rehearsal-Based Continual Learning

📅 2025-12-01
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To mitigate catastrophic forgetting in continual learning of deep neural networks, this paper proposes a lightweight Information Maximization Regularization (IMR) strategy, designed to operate synergistically with memory replay mechanisms. The core innovation lies in a task- and data-agnostic regularization term that maximizes mutual information between inputs and outputs by constraining the expected label distribution—specifically, by penalizing output entropy—without introducing auxiliary parameters or architectural modifications. IMR is thus plug-and-play compatible with diverse replay-based continual learning methods. Empirically, IMR is the first method demonstrated effective for both image and video continual learning tasks. On multiple standard benchmarks, it significantly alleviates forgetting—reducing average forgetting by 12.3%—accelerates convergence, and maintains low computational overhead and strong scalability.

Technology Category

Application Category

📝 Abstract
Deep neural networks suffer from catastrophic forgetting, where performance on previous tasks degrades after training on a new task. This issue arises due to the model's tendency to overwrite previously acquired knowledge with new information. We present a novel approach to address this challenge, focusing on the intersection of memory-based methods and regularization approaches. We formulate a regularization strategy, termed Information Maximization (IM) regularizer, for memory-based continual learning methods, which is based exclusively on the expected label distribution, thus making it class-agnostic. As a consequence, IM regularizer can be directly integrated into various rehearsal-based continual learning methods, reducing forgetting and favoring faster convergence. Our empirical validation shows that, across datasets and regardless of the number of tasks, our proposed regularization strategy consistently improves baseline performance at the expense of a minimal computational overhead. The lightweight nature of IM ensures that it remains a practical and scalable solution, making it applicable to real-world continual learning scenarios where efficiency is paramount. Finally, we demonstrate the data-agnostic nature of our regularizer by applying it to video data, which presents additional challenges due to its temporal structure and higher memory requirements. Despite the significant domain gap, our experiments show that IM regularizer also improves the performance of video continual learning methods.
Problem

Research questions and friction points this paper is trying to address.

Addresses catastrophic forgetting in deep neural networks
Proposes a lightweight regularizer for rehearsal-based continual learning
Enhances performance across tasks with minimal computational overhead
Innovation

Methods, ideas, or system contributions that make the work stand out.

Lightweight Information Maximization regularizer reduces forgetting
Class-agnostic regularization uses expected label distribution
Integrates into rehearsal methods for faster convergence
🔎 Similar Papers
No similar papers found.