🤖 AI Summary
In online continual learning (CL), existing methods are hard to compare because they operate under heterogeneous computational (FLOPs) and memory (bytes) budgets, stemming from differing single-pass training constraints and replay buffer sizes; moreover, implicit overheads such as logit caching and model duplication are frequently overlooked. This work proposes a fair evaluation framework jointly constrained by total FLOPs and total memory capacity. Methodologically: (1) a gradient-sensitivity-driven adaptive layer freezing mechanism eliminates redundant computation on less informative batches; (2) a frequency-weighted replay retrieval strategy improves how much knowledge is reused per iteration. Evaluated on CIFAR-10/100, CLEAR-10/100, and ImageNet-1K, the approach consistently surpasses state-of-the-art methods under identical total resource budgets, improving classification accuracy and training efficiency simultaneously.
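The frequency-weighted retrieval idea in point (2) can be sketched as a buffer that tracks how often each stored sample has been replayed and down-weights frequently used ones. This is an illustrative simplification, not the paper's exact formulation: the class name, the `1 / (1 + use_count)` weighting, and all parameters below are assumptions.

```python
import random

class FrequencyWeightedBuffer:
    """Replay buffer that favors less-frequently replayed samples.

    Illustrative sketch: retrieval probability is proportional to
    1 / (1 + use_count), so rarely replayed samples surface more often.
    """

    def __init__(self):
        self.samples = []    # stored payloads, e.g. (x, y) pairs
        self.use_count = []  # how many times each sample was retrieved

    def add(self, sample):
        self.samples.append(sample)
        self.use_count.append(0)

    def retrieve(self, k):
        # Weight each stored sample inversely to its replay frequency.
        weights = [1.0 / (1 + c) for c in self.use_count]
        idx = random.choices(range(len(self.samples)), weights=weights, k=k)
        for i in idx:
            self.use_count[i] += 1
        return [self.samples[i] for i in idx]
```

With a fresh buffer all weights are equal, so the first retrieval reduces to uniform random sampling; the weights then diverge as training proceeds, steering replay toward under-used samples.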
📝 Abstract
Most online continual learning (CL) methods advocate single-epoch training and restrict the size of the replay memory. However, single-epoch training incurs a different amount of computation per CL algorithm, and the additional storage cost of keeping logits or model copies alongside the replay memory is largely ignored when calculating the storage budget. Arguing that these differing computational and storage budgets hinder fair comparison among CL algorithms in practice, we propose to use floating point operations (FLOPs) and total memory size in bytes as the metrics for computational and memory budgets, respectively, to compare and develop CL algorithms within the same 'total resource budget.' To improve a CL method under a limited total budget, we propose adaptive layer freezing, which does not update the layers for less informative batches, reducing computational cost with a negligible loss of accuracy. In addition, we propose a memory retrieval method that allows the model to learn the same amount of knowledge as random retrieval in fewer iterations. Empirical validations on the CIFAR-10/100, CLEAR-10/100, and ImageNet-1K datasets demonstrate that the proposed approach outperforms state-of-the-art methods within the same total budget.
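As a rough illustration of adaptive layer freezing, the sketch below picks how many leading layers to freeze for a given batch by trading estimated information gain against backward-pass FLOPs. The scoring rule `info − λ·FLOPs`, the per-layer information fractions, and all names are hypothetical simplifications of the batch-informativeness criterion described above, not the paper's actual algorithm.

```python
def select_freeze_depth(batch_info, layer_bwd_flops, layer_info_frac, lam):
    """Choose how many leading layers to freeze for one batch.

    batch_info:      scalar informativeness of the batch (assumed given,
                     e.g. derived from its loss or gradient magnitude).
    layer_bwd_flops: backward-pass FLOPs of each layer, input to output.
    layer_info_frac: fraction of the batch's information attributed to
                     each layer (assumed to sum to ~1).
    lam:             trade-off weight between information and FLOPs.
    """
    best_n, best_score = 0, float("-inf")
    for n in range(len(layer_bwd_flops) + 1):
        flops_spent = sum(layer_bwd_flops[n:])          # cost of updating layers n..end
        info_kept = batch_info * sum(layer_info_frac[n:])  # info still absorbed
        score = info_kept - lam * flops_spent
        if score > best_score:
            best_n, best_score = n, score
    return best_n
```

Under this toy rule the behavior is adaptive in the intended direction: a highly informative batch keeps all layers unfrozen, while a barely informative batch freezes everything, saving its backward FLOPs for future batches.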