Sculpting [CLS] Features for Pre-Trained Model-Based Class-Incremental Learning

📅 2025-02-20
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This paper addresses catastrophic forgetting and the stability–plasticity trade-off in class-incremental learning with pre-trained models. The authors make two coupled contributions: (i) 'Learn and Calibrate' (LuCA), a parameter-efficient fine-tuning module that pairs a lightweight adapter with a calibrator to refine feature representations, and (ii) 'Token-level Sparse Calibration and Adaptation' (TOSCA), which deploys one sparse LuCA module per learning session on top of the final [CLS] token just before the classifier, leaving the backbone network untouched. This strategic design improves orthogonality between session modules and preserves the pre-trained model's generalization while still adapting to new classes. Evaluated on standard incremental learning benchmarks, TOSCA achieves state-of-the-art performance with roughly 8 times fewer parameters than prior best methods, significantly reducing both training and inference overhead.

📝 Abstract
Class-incremental learning requires models to continually acquire knowledge of new classes without forgetting old ones. Although pre-trained models have demonstrated strong performance in class-incremental learning, they remain susceptible to catastrophic forgetting when learning new concepts. Excessive plasticity in the models breaks generalizability and causes forgetting, while strong stability results in insufficient adaptation to new classes. This necessitates effective adaptation with minimal modifications to preserve the general knowledge of pre-trained models. To address this challenge, we first introduce a new parameter-efficient fine-tuning module 'Learn and Calibrate', or LuCA, designed to acquire knowledge through an adapter-calibrator couple, enabling effective adaptation with well-refined feature representations. Second, for each learning session, we deploy a sparse LuCA module on top of the last token just before the classifier, which we refer to as 'Token-level Sparse Calibration and Adaptation', or TOSCA. This strategic design improves the orthogonality between the modules and significantly reduces both training and inference complexity. By leaving the generalization capabilities of the pre-trained models intact and adapting exclusively via the last token, our approach achieves a harmonious balance between stability and plasticity. Extensive experiments demonstrate TOSCA's state-of-the-art performance while introducing ~8 times fewer parameters compared to prior methods.
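The abstract's adapter–calibrator couple on the last token can be sketched as follows. The concrete layer shapes, the bottleneck adapter form, and the sigmoid-gate calibrator are illustrative assumptions, not the paper's specification; the abstract only states that an adapter and a calibrator jointly refine the [CLS] representation while the backbone stays frozen.

```python
import numpy as np

rng = np.random.default_rng(0)

class LuCASketch:
    """Hypothetical adapter-calibrator couple applied to the final [CLS]
    feature of a frozen pre-trained backbone. Bottleneck width and the
    gating calibrator are assumptions for illustration."""

    def __init__(self, dim=768, bottleneck=64):
        # PEFT-style bottleneck adapter: down- and up-projection.
        self.W_down = rng.standard_normal((dim, bottleneck)) * 0.02
        self.W_up = rng.standard_normal((bottleneck, dim)) * 0.02
        # Calibrator: a learned per-dimension gate (initialized neutral).
        self.gate = np.zeros(dim)

    def __call__(self, cls_token):
        # Adapter: residual bottleneck transformation of the [CLS] feature.
        adapted = cls_token + np.maximum(cls_token @ self.W_down, 0.0) @ self.W_up
        # Calibrator: sigmoid gate rescales each feature dimension.
        return adapted * (1.0 / (1.0 + np.exp(-self.gate)))

cls_feature = rng.standard_normal(768)  # stand-in for the backbone's [CLS] output
module = LuCASketch()
refined = module(cls_feature)
print(refined.shape)  # (768,)
```

Because only this small module is trained per session, the generalization of the pre-trained backbone is left intact, matching the abstract's stability–plasticity argument.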
Problem

Research questions and friction points this paper is trying to address.

Address catastrophic forgetting in class-incremental learning
Balance model stability and plasticity in adaptation
Enhance feature representation with minimal parameter modifications
Innovation

Methods, ideas, or system contributions that make the work stand out.

Parameter-efficient 'Learn and Calibrate' (LuCA) adapter–calibrator module
Token-level Sparse Calibration and Adaptation (TOSCA) on the final [CLS] token
Per-session sparse modules that reduce training and inference complexity
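As a back-of-the-envelope illustration of the reduced complexity: a bottleneck module acting on a single 768-dimensional token is tiny next to a ViT-B/16 backbone (~86M parameters). The bottleneck width and session count below are assumptions, not figures from the paper.

```python
# Rough parameter accounting for per-session token-level modules versus a
# frozen ViT-B/16 backbone. Bottleneck width and session count are
# illustrative assumptions.
DIM = 768          # [CLS] feature dimension of ViT-B/16
BOTTLENECK = 64    # assumed adapter bottleneck width
SESSIONS = 10      # assumed number of incremental sessions
BACKBONE = 86_000_000  # approximate ViT-B/16 parameter count

# One module: down-projection + up-projection + a per-dimension calibrator.
per_module = DIM * BOTTLENECK + BOTTLENECK * DIM + DIM
total = per_module * SESSIONS
print(f"per-module params:    {per_module:,}")        # 99,072
print(f"all sessions:         {total:,}")             # 990,720
print(f"fraction of backbone: {total / BACKBONE:.3%}")
```

Under these assumptions, ten sessions of modules stay around 1% of the backbone's size, which is consistent in spirit with the abstract's claim of ~8 times fewer parameters than prior methods.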