Integrating Dual Prototypes for Task-Wise Adaption in Pre-Trained Model-Based Class-Incremental Learning

๐Ÿ“… 2024-11-26
๐Ÿ›๏ธ arXiv.org
๐Ÿ“ˆ Citations: 0
โœจ Influential: 0
๐Ÿ“„ PDF
๐Ÿค– AI Summary
To mitigate catastrophic forgetting in pre-trained models under class-incremental learning, this paper proposes the Dual Prototype network for Task-wise Adaption (DPTA). Methodologically: (1) it builds lightweight task-wise adapter modules to avoid full-parameter fine-tuning; (2) it establishes a dual-prototype mechanism, where raw prototypes infer candidate task indexes for dynamic adapter selection and augmented prototypes improve discriminability among highly correlated classes; (3) it incorporates a center-adapt loss that jointly optimizes intra-class compactness and inter-class separability. The framework supports test-time adapter selection and achieves state-of-the-art performance across multiple benchmarks, significantly alleviating forgetting while preserving strong transferability and incremental generalization capability.

๐Ÿ“ Abstract
Class-incremental learning (CIL) aims to acquire new classes incrementally while conserving historical knowledge. Although existing pre-trained model (PTM) based methods perform excellently in CIL, it is still preferable to fine-tune PTMs on downstream incremental tasks that contain massive patterns unknown to them. However, fine-tuning on a stream of tasks can cause catastrophic forgetting that erases the knowledge in PTMs. This paper proposes the Dual Prototype network for Task-wise Adaption (DPTA) for PTM-based CIL. For each incremental learning task, a task-wise adapter module is built to fine-tune the PTM, where a center-adapt loss forces the representation to be more centrally clustered and class-separable. The dual prototype network improves the prediction process by enabling test-time adapter selection: the raw prototypes deduce several candidate task indexes for a test sample to select suitable adapter modules for the PTM, and the augmented prototypes, which can separate highly correlated classes, are used to determine the final result. Experiments on several benchmark datasets demonstrate the state-of-the-art performance of DPTA. The code will be open-sourced after the paper is published.
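The center-adapt loss is only described qualitatively here (representations pulled toward their class centers while classes stay separable). A minimal illustrative sketch of one plausible formulation, not the paper's exact loss; the `margin` and `lam` hyperparameters are hypothetical:

```python
import numpy as np

def center_adapt_loss(feats, labels, centers, margin=1.0, lam=0.5):
    """Illustrative center-style loss: a pull term draws each adapted
    feature toward its class center (intra-class compactness), and a
    hinge push term keeps distinct class centers at least `margin`
    apart (inter-class separability)."""
    # pull: mean squared distance from each feature to its class center
    pull = np.mean(np.sum((feats - centers[labels]) ** 2, axis=1))
    # push: penalize center pairs that are closer than the margin
    push, k = 0.0, len(centers)
    for i in range(k):
        for j in range(i + 1, k):
            d = np.linalg.norm(centers[i] - centers[j])
            push += max(0.0, margin - d) ** 2
    push /= max(1, k * (k - 1) // 2)
    return pull + lam * push
```

With features exactly at well-separated centers, both terms vanish and the loss is zero; tightening `margin` or shrinking `lam` trades separability pressure against compactness.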
Problem

Research questions and friction points this paper is trying to address.

Addresses catastrophic forgetting in class-incremental learning with pre-trained models.
Proposes a Dual Prototype network for task-wise adaptation in incremental tasks.
Enhances class separability and representation clustering in incremental learning.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dual Prototype network for task-wise adaption
Task-wise adapter module for fine-tuning PTMs
Test-time adapter selection for improved prediction
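The test-time adapter selection above can be sketched as follows. This is an illustrative reconstruction from the abstract, not the authors' code: `raw_protos`, `adapters`, and `aug_protos` are hypothetical names, and the task-wise adapters (inserted inside the PTM in the paper) are simplified here to callables on a pooled feature vector.

```python
import numpy as np

def select_tasks(feat, raw_protos, top_k=2):
    """Rank tasks by the distance from the frozen-PTM feature to each
    task's nearest raw prototype; return the top_k candidate task ids."""
    scores = {tid: np.linalg.norm(protos - feat, axis=1).min()
              for tid, protos in raw_protos.items()}
    return sorted(scores, key=scores.get)[:top_k]

def predict(feat, raw_protos, adapters, aug_protos, top_k=2):
    """Dual-prototype prediction: raw prototypes pick candidate adapters,
    augmented prototypes in the adapted space give the final label."""
    best_label, best_dist = None, np.inf
    for tid in select_tasks(feat, raw_protos, top_k):
        adapted = adapters[tid](feat)  # task-wise adapter, simplified to a callable
        dists = np.linalg.norm(aug_protos[tid] - adapted, axis=1)
        c = int(dists.argmin())
        if dists[c] < best_dist:
            best_dist, best_label = dists[c], (tid, c)
    return best_label

# toy example: two tasks with two classes each, 4-d features
raw = {0: np.eye(4)[:2], 1: np.eye(4)[2:]}
adapters = {0: lambda f: 2.0 * f, 1: lambda f: f + 1.0}
aug = {0: 2.0 * raw[0], 1: raw[1] + 1.0}
print(predict(np.array([0.9, 0.1, 0.0, 0.0]), raw, adapters, aug))  # (0, 0)
```

Checking all `top_k` candidate tasks in the augmented-prototype space, rather than trusting the single nearest raw prototype, is what lets the augmented prototypes recover from a wrong task-index guess.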
Zhiming Xu
University of Virginia
LLM inference, machine learning systems

Suorong Yang
Nanjing University
Computer Vision, Deep Learning, Multimodal Learning

Baile Xu
National Key Laboratory for Novel Software Technology, Nanjing University, China; School of Artificial Intelligence, Nanjing University, China

Jian Zhao
School of Electronic Science and Engineering, Nanjing University, China

Furao Shen
Department of Computer Science & Technology, Nanjing University
Neural Networks, Robotic Intelligence