LCA: Local Classifier Alignment for Continual Learning

📅 2026-03-10
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses catastrophic forgetting in continual learning, which arises from semantic misalignment between the evolving backbone network and task-specific classifiers. To mitigate this issue, the authors propose a Local Classifier Alignment (LCA) loss that dynamically aligns classifiers with the continuously updated feature representations of the backbone during training, thereby enhancing model consistency and stability. The method is seamlessly integrated into a pretrained-model-based continual learning framework and optimized end-to-end in conjunction with a model fusion strategy. Extensive experiments demonstrate that the proposed approach significantly outperforms existing state-of-the-art methods across multiple standard continual learning benchmarks, achieving notably superior performance on several key metrics.

📝 Abstract
A fundamental requirement for intelligent systems is the ability to learn continuously under changing environments. However, models trained in this regime often suffer from catastrophic forgetting. Leveraging pre-trained models has recently emerged as a promising solution, since their generalized feature extractors enable faster and more robust adaptation. While some earlier works mitigate forgetting by fine-tuning only on the first task, this approach quickly deteriorates as the number of tasks grows and the data distributions diverge. More recent research instead seeks to consolidate task knowledge into a unified backbone, or to adapt the backbone as new tasks arrive. However, such approaches may create a (potential) \textit{mismatch} between task-specific classifiers and the adapted backbone. To address this issue, we propose a novel \textit{Local Classifier Alignment} (LCA) loss to better align the classifier with the backbone. Theoretically, we show that the LCA loss enables the classifier not only to generalize well across all observed tasks, but also to improve robustness. Furthermore, we develop a complete solution for continual learning that follows the model-merging approach and uses LCA. Extensive experiments on several standard benchmarks demonstrate that our method often achieves leading performance, sometimes surpassing state-of-the-art methods by a large margin.
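To make the classifier-backbone mismatch concrete, the following is a minimal, hypothetical sketch of an alignment-style loss in the spirit the abstract describes: each class's classifier weight vector is pulled toward the mean feature (prototype) that the *current* backbone produces for that class, so the classifier tracks the drifting representation. The paper does not publish its exact formulation here; the function name `lca_loss`, the prototype-matching objective, and all shapes below are illustrative assumptions, not the authors' actual loss.

```python
import numpy as np

def lca_loss(classifier_weights, features, labels, num_classes):
    """Illustrative alignment loss (assumption, not the paper's formula):
    penalize the squared distance between each class's classifier weight
    vector and the mean feature the current backbone emits for that class.

    classifier_weights : (num_classes, d) array of per-class weight vectors
    features           : (n, d) array of backbone features for a batch
    labels             : (n,) integer class labels
    """
    loss = 0.0
    for c in range(num_classes):
        mask = labels == c
        if not mask.any():
            continue  # class absent from this batch: no alignment term
        prototype = features[mask].mean(axis=0)
        loss += np.sum((classifier_weights[c] - prototype) ** 2)
    return loss / num_classes

# Toy usage: 2 classes, 4-dim features from a (stand-in) backbone.
rng = np.random.default_rng(0)
feats = rng.normal(size=(8, 4))
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])
W = np.zeros((2, 4))          # an unaligned classifier
print(lca_loss(W, feats, labels, 2))
```

The loss is zero exactly when each weight vector coincides with its class prototype, which matches the intuition of "dynamically aligning classifiers with the continuously updated feature representations" from the summary; a real implementation would add this term to the task loss and optimize it end-to-end.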
Problem

Research questions and friction points this paper is trying to address.

continual learning
catastrophic forgetting
classifier-backbone mismatch
pre-trained models
task adaptation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Local Classifier Alignment
Continual Learning
Catastrophic Forgetting
Model Merging
Backbone-Classifier Mismatch