🤖 AI Summary
Concurrent rehearsal, i.e., jointly training on stored past data and new data, is widely used in continual learning to mitigate catastrophic forgetting, yet it remains unclear whether this strategy is optimal.
Method: Inspired by how humans mitigate forgetting by sequentially reviewing material, we establish the first theoretical framework for analyzing replay in overparameterized linear models, systematically comparing concurrent and sequential rehearsal in terms of forgetting and generalization error. Building on these theoretical insights, we propose an adaptive hybrid rehearsal strategy that dynamically selects between sequential and concurrent replay based on task similarity.
Contribution/Results: Our theory explicitly characterizes forgetting and generalization error, showing that sequential rehearsal outperforms concurrent rehearsal when tasks diverge substantially. Experiments with deep neural networks on multiple benchmarks confirm that the adaptive hybrid method consistently outperforms standard concurrent rehearsal, validating both the theoretical predictions and the practical efficacy of theory-driven design.
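To make the comparison concrete, below is a minimal sketch of the two replay strategies on a toy overparameterized linear model (dimension d > sample count n). The function names, single-pool replay buffer, learning rate, and step counts are illustrative assumptions, not the paper's implementation:

```python
import torch

def sgd_steps(w, X, y, lr=0.01, steps=200):
    """Plain gradient descent on the squared loss of a linear predictor."""
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

def rehearsal_concurrent(w, new_data, memory):
    # Concurrent rehearsal: merge the replay buffer with the new task's
    # data and train on the combined set in a single phase.
    X = torch.cat([new_data[0], memory[0]])
    y = torch.cat([new_data[1], memory[1]])
    return sgd_steps(w, X, y)

def rehearsal_sequential(w, new_data, memory):
    # Sequential rehearsal: train on the new task first, then revisit
    # the stored past data in a separate, later phase.
    w = sgd_steps(w, *new_data)
    return sgd_steps(w, *memory)

# Toy setup with more parameters than samples (overparameterized regime).
torch.manual_seed(0)
d, n = 50, 10
w0 = torch.zeros(d)
new_data = (torch.randn(n, d), torch.randn(n))
memory = (torch.randn(n, d), torch.randn(n))  # replay buffer from past tasks

w_con = rehearsal_concurrent(w0.clone(), new_data, memory)
w_seq = rehearsal_sequential(w0.clone(), new_data, memory)
```

The paper additionally revisits past tasks one by one in the sequential case; the single memory pool above is a simplification.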
📝 Abstract
Rehearsal-based methods have shown superior performance in addressing catastrophic forgetting in continual learning (CL) by storing a subset of past data and training on it alongside new data in the current task. While such a concurrent rehearsal strategy is widely used, it remains unclear whether this approach is always optimal. Inspired by human learning, where sequentially revisiting tasks helps mitigate forgetting, we explore whether sequential rehearsal can offer greater benefits for CL than standard concurrent rehearsal. To address this question, we conduct a theoretical analysis of rehearsal-based CL in overparameterized linear models, comparing two strategies: 1) Concurrent Rehearsal, where past and new data are trained on together, and 2) Sequential Rehearsal, where the model is trained on new data first and then revisits past data sequentially. By explicitly characterizing forgetting and generalization error, we show that sequential rehearsal performs better when tasks are less similar. These insights motivate a novel Hybrid Rehearsal method, which trains on similar tasks concurrently and revisits dissimilar tasks sequentially. We characterize its forgetting and generalization performance, and our experiments with deep neural networks confirm that the hybrid approach outperforms standard concurrent rehearsal. This work provides the first comprehensive theoretical analysis of rehearsal-based CL.
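The abstract does not spell out how task similarity is measured. As one plausible proxy, the hypothetical sketch below scores similarity by the cosine between loss gradients on new versus stored data and picks the replay mode accordingly; the gradient-cosine proxy and the 0.5 threshold are illustrative assumptions, and the paper's actual criterion may differ:

```python
import torch

def grad(w, X, y):
    """Gradient of the squared loss for a linear predictor w."""
    return X.T @ (X @ w - y) / len(y)

def choose_replay_mode(w, new_data, memory, threshold=0.5):
    g_new = grad(w, *new_data)
    g_mem = grad(w, *memory)
    cos = (g_new @ g_mem) / (g_new.norm() * g_mem.norm() + 1e-12)
    # High gradient alignment ~ similar tasks -> concurrent replay;
    # low alignment ~ dissimilar tasks -> sequential replay.
    return "concurrent" if cos > threshold else "sequential"

# Usage with the same toy tensors as the earlier sketch.
torch.manual_seed(0)
d, n = 50, 10
w0 = torch.zeros(d)
new_data = (torch.randn(n, d), torch.randn(n))
memory = (torch.randn(n, d), torch.randn(n))
print(choose_replay_mode(w0, new_data, memory))  # prints "sequential" here
```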