CL-LoRA: Continual Low-Rank Adaptation for Rehearsal-Free Class-Incremental Learning

📅 2025-05-30
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address parameter redundancy and catastrophic forgetting across tasks in rehearsal-free continual learning with pretrained models, this paper proposes a dual-adapter collaborative architecture. First, a task-shared adapter employs orthogonal initialization and knowledge distillation with gradient reassignment to enhance cross-task knowledge reuse. Second, a task-specific adapter leverages learnable block-wise weights for fine-grained task discrimination. Built on LoRA within the parameter-efficient fine-tuning (PEFT) paradigm, the method avoids storing historical samples and avoids introducing a full-sized adapter per task. Evaluated on multiple benchmarks, it achieves significant improvements in classification accuracy while reducing both training and inference overhead. The approach thus balances model efficiency, scalability, and sustained generalization without requiring additional memory for past data.

📝 Abstract
Class-Incremental Learning (CIL) aims to learn new classes sequentially while retaining the knowledge of previously learned classes. Recently, pre-trained models (PTMs) combined with parameter-efficient fine-tuning (PEFT) have shown remarkable performance in rehearsal-free CIL without requiring exemplars from previous tasks. However, existing adapter-based methods, which incorporate lightweight learnable modules into PTMs for CIL, create new adapters for each new task, leading to both parameter redundancy and failure to leverage shared knowledge across tasks. In this work, we propose ContinuaL Low-Rank Adaptation (CL-LoRA), which introduces a novel dual-adapter architecture combining task-shared adapters to learn cross-task knowledge and task-specific adapters to capture unique features of each new task. Specifically, the shared adapters utilize random orthogonal matrices and leverage knowledge distillation with gradient reassignment to preserve essential shared knowledge. In addition, we introduce learnable block-wise weights for task-specific adapters, which mitigate inter-task interference while maintaining the model's plasticity. We demonstrate CL-LoRA consistently achieves promising performance under multiple benchmarks with reduced training and inference computation, establishing a more efficient and scalable paradigm for continual learning with pre-trained models.
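The abstract's dual-adapter idea can be sketched roughly as follows. This is a minimal illustrative reconstruction, not the paper's exact formulation: the dimensions, variable names, and the choice to fix the shared adapter's down-projection as a random orthogonal basis are my assumptions based on the description above.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 16, 4  # feature dimension and LoRA rank (illustrative sizes)

W = rng.standard_normal((d, d))  # frozen pretrained weight

# Task-shared adapter: random orthogonal down-projection A_s (kept fixed),
# learnable up-projection B_s reused across all tasks.
A_s, _ = np.linalg.qr(rng.standard_normal((d, r)))  # orthonormal columns
B_s = np.zeros((d, r))                               # learnable, zero-init

# Task-specific adapter for the current task t (both factors learnable,
# a fresh pair is added per task).
A_t = rng.standard_normal((r, d)) * 0.01
B_t = np.zeros((d, r))

def forward(x):
    """Frozen backbone output plus shared and task-specific low-rank updates."""
    return W @ x + B_s @ (A_s.T @ x) + B_t @ (A_t @ x)

x = rng.standard_normal(d)
y = forward(x)

# Zero-initialized up-projections mean the adapters start as a no-op delta,
# so the pretrained behavior is preserved at the start of each task.
assert np.allclose(y, W @ x)
# The shared down-projection really is orthonormal: A_s^T A_s = I_r.
assert np.allclose(A_s.T @ A_s, np.eye(r))
```

Keeping the shared down-projection orthogonal gives every task a common, well-conditioned subspace to write into, which is one plausible reading of how "random orthogonal matrices" support cross-task knowledge reuse here.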
Problem

Research questions and friction points this paper is trying to address.

Addresses parameter redundancy in adapter-based CIL methods
Enhances cross-task knowledge sharing in continual learning
Reduces inter-task interference while maintaining model plasticity
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dual-adapter architecture for task-shared and task-specific learning
Random orthogonal matrices for shared knowledge distillation
Learnable block-wise weights to reduce inter-task interference
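The learnable block-wise weights can be pictured as a per-transformer-block scalar gate on each task-specific adapter's output. The sigmoid parameterization and all names below are my illustrative assumptions; the paper's exact gating form may differ.

```python
import numpy as np

rng = np.random.default_rng(1)
num_blocks, d, r = 6, 8, 2  # transformer depth, feature dim, LoRA rank

# One task-specific LoRA pair per transformer block, plus one learnable
# scalar weight per block (squashed through a sigmoid to stay in (0, 1)).
A = [rng.standard_normal((r, d)) * 0.01 for _ in range(num_blocks)]
B = [np.zeros((d, r)) for _ in range(num_blocks)]
alpha = np.zeros(num_blocks)  # learnable block-wise gate logits

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def adapter_delta(block_idx, x):
    """Gated task-specific update added to block `block_idx`'s output."""
    return sigmoid(alpha[block_idx]) * (B[block_idx] @ (A[block_idx] @ x))

x = rng.standard_normal(d)
deltas = [adapter_delta(i, x) for i in range(num_blocks)]

# Zero-initialized up-projections mean every block's delta starts at zero,
# so a new task initially cannot interfere with earlier tasks' features.
assert all(np.allclose(dlt, 0.0) for dlt in deltas)
```

Letting each block learn its own gate weight is one way to reconcile plasticity with interference control: blocks whose features transfer poorly to the new task can be down-weighted without freezing the adapter entirely.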