Advancing Prompt-Based Methods for Replay-Independent General Continual Learning

📅 2025-03-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
In the challenging setting of general continual learning (GCL), where online data streams lack clear task boundaries and no replay buffer is available, existing prompt-tuning methods suffer from poor initial performance, weak generalization, and severe catastrophic forgetting. To address these issues, we propose MISA (Mask and Initial Session Adaption), a framework that (i) performs a forgetting-aware initial session adaptation, driven by pretraining data, to initialize prompt parameters, and (ii) applies a non-parametric logit mask over the output layer to suppress forgetting of previously learned classes. MISA requires no rehearsal, introduces no continual-learning-specific hyperparameters, and is plug-and-play for prompt-based methods. Evaluated on CIFAR-100, Tiny-ImageNet, and ImageNet-R, MISA outperforms state-of-the-art methods by up to 18.39%, 22.06%, and 11.96%, respectively, demonstrating substantial gains in both generalization and learning stability under GCL constraints.

📝 Abstract
General continual learning (GCL) is a broad concept to describe real-world continual learning (CL) problems, which are often characterized by online data streams without distinct transitions between tasks, i.e., blurry task boundaries. Such requirements result in poor initial performance, limited generalizability, and severe catastrophic forgetting, heavily impacting the effectiveness of mainstream GCL models trained from scratch. While the use of a frozen pretrained backbone with appropriate prompt tuning can partially address these challenges, such prompt-based methods remain suboptimal for CL of remaining tunable parameters on the fly. In this regard, we propose an innovative approach named MISA (Mask and Initial Session Adaption) to advance prompt-based methods in GCL. It includes a forgetting-aware initial session adaption that employs pretraining data to initialize prompt parameters and improve generalizability, as well as a non-parametric logit mask of the output layers to mitigate catastrophic forgetting. Empirical results demonstrate substantial performance gains of our approach compared to recent competitors, especially without a replay buffer (e.g., up to 18.39%, 22.06%, and 11.96% performance lead on CIFAR-100, Tiny-ImageNet, and ImageNet-R, respectively). Moreover, our approach features the plug-in nature for prompt-based methods, independence of replay, ease of implementation, and avoidance of CL-relevant hyperparameters, serving as a strong baseline for GCL research. Our source code is publicly available at https://github.com/kangzhiq/MISA
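The initial session adaptation described above warms up the prompt parameters on pretraining data before the continual stream begins, while the backbone stays frozen. A minimal sketch of that idea follows; `init_session_adaptation`, the additive prompt injection (`x + prompt`), and the `steps`/`lr` defaults are illustrative assumptions, not the paper's implementation (real prompt-based methods typically prepend prompt tokens inside a ViT, and the paper's objective is additionally forgetting-aware).

```python
import torch
import torch.nn as nn


def init_session_adaptation(prompt: nn.Parameter,
                            backbone: nn.Module,
                            head: nn.Module,
                            pretrain_loader,
                            steps: int = 100,
                            lr: float = 1e-3) -> None:
    """Hypothetical warm-up of prompt parameters on pretraining data.

    Only the prompt is passed to the optimizer, so the frozen backbone and
    head receive no updates even though gradients flow through them.
    """
    opt = torch.optim.Adam([prompt], lr=lr)
    backbone.eval()
    it = iter(pretrain_loader)
    for _ in range(steps):
        x, y = next(it)
        # Simplistic prompt injection: add the prompt to the input features.
        feats = backbone(x + prompt)
        loss = nn.functional.cross_entropy(head(feats), y)
        opt.zero_grad()
        loss.backward()
        opt.step()
```

After this warm-up, the adapted prompt serves as the starting point for the continual stream, which is what the paper credits for the improved initial performance and generalizability.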
Problem

Research questions and friction points this paper is trying to address.

Blurry task boundaries in general continual learning undermine models trained from scratch.
Online data streams cause severe catastrophic forgetting.
Prompt-based methods underperform when no replay buffer is available.
Innovation

Methods, ideas, or system contributions that make the work stand out.

MISA enhances prompt-based GCL methods.
Uses pretraining data for prompt initialization.
Implements a non-parametric logit mask on the output layer.
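The non-parametric logit mask can be pictured as follows: logits of classes not yet seen in the stream are set to negative infinity, so they receive zero probability mass under softmax and contribute no gradient. This is a minimal sketch assuming the common masking recipe; `masked_logits` and the `seen_classes` bookkeeping are illustrative names, not the paper's code.

```python
import torch


def masked_logits(logits: torch.Tensor,
                  seen_classes: torch.Tensor) -> torch.Tensor:
    """Suppress logits of classes not yet observed in the stream.

    logits: (batch, num_classes) raw outputs of the classification head.
    seen_classes: 1-D tensor of class indices observed so far.
    Unseen classes get -inf, so softmax assigns them zero probability and
    their output weights receive no gradient, mitigating forgetting.
    """
    mask = torch.full_like(logits, float("-inf"))
    mask[:, seen_classes] = 0.0
    return logits + mask
```

Because the mask is built from the running set of observed labels rather than learned parameters, it adds no trainable state and no CL-specific hyperparameters, consistent with the plug-and-play claim.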