An Adaptive Method Stabilizing Activations for Enhanced Generalization

📅 2024-12-09
🏛️ 2024 IEEE International Conference on Data Mining Workshops (ICDMW)
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Existing optimizers face a trade-off between convergence speed and generalization: Adam converges rapidly but generalizes relatively poorly, whereas SGD generalizes well yet converges slowly. This paper proposes ADAACT, a novel optimization algorithm that adjusts the learning rate at the neuron level based on activation variance. The adaptation mechanism is gradient-driven, lightweight, and compatible with standard backpropagation. ADAACT improves the stability of neuron outputs and achieves competitive generalization performance on CIFAR and ImageNet: its convergence speed approaches that of Adam, its generalization matches or exceeds that of SGD, and the additional training overhead is negligible. The core contribution is integrating activation variance into the learning-rate adaptation paradigm, bridging the long-standing trade-off between convergence efficiency and generalization ability.

📝 Abstract
We introduce ADAACT, a novel optimization algorithm that adjusts learning rates according to activation variance. Our method enhances the stability of neuron outputs by incorporating neuron-wise adaptivity during the training process, which subsequently leads to better generalization, a complementary approach to conventional activation regularization methods. Experimental results demonstrate ADAACT's competitive performance across standard image classification benchmarks. We evaluate ADAACT on CIFAR and ImageNet, comparing it with other state-of-the-art methods. Importantly, ADAACT effectively bridges the gap between the convergence speed of Adam and the strong generalization capabilities of SGD, all while maintaining competitive execution times.
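The abstract describes the core mechanism only at a high level: a per-neuron learning rate that shrinks when that neuron's activations are volatile. The sketch below illustrates that idea under stated assumptions; the function name `adaact_step`, the EMA decay `beta`, and the exact scaling rule are illustrative choices, not the paper's published update rule.

```python
import numpy as np

def adaact_step(w, grad, act, state, lr=1e-3, beta=0.999, eps=1e-8):
    """One hypothetical neuron-wise update, sketching the ADAACT idea:
    scale each output neuron's step by the running variance of that
    neuron's activations, so neurons with unstable outputs take smaller
    steps. Illustrative only; the paper's exact rule may differ.

    w:    weight matrix, shape (n_in, n_out)
    grad: gradient of the loss w.r.t. w, same shape
    act:  this layer's activations for the batch, shape (batch, n_out)
    state: dict with running per-neuron "mean" and "var", shape (n_out,)
    """
    # Exponential moving averages of per-neuron activation mean and variance.
    state["mean"] = beta * state["mean"] + (1 - beta) * act.mean(axis=0)
    dev = act - state["mean"]                                  # (batch, n_out)
    state["var"] = beta * state["var"] + (1 - beta) * (dev ** 2).mean(axis=0)
    # Per-neuron effective learning rate: shrink where activations vary a lot.
    neuron_lr = lr / (np.sqrt(state["var"]) + eps)             # (n_out,)
    # Each column of w (one output neuron's fan-in weights) shares a rate;
    # neuron_lr broadcasts across rows of the (n_in, n_out) weight matrix.
    return w - neuron_lr * grad
```

In this sketch the variance estimate plays the role that the squared-gradient second moment plays in Adam, which is one plausible reading of "integrating activation variance into the learning-rate adaptation paradigm"; in a full optimizer the state would be tracked per layer across training steps.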
Problem

Research questions and friction points this paper is trying to address.

Stabilizes neuron activations for better generalization
Bridges gap between Adam's speed and SGD's generalization
Adjusts learning rates based on activation variance
Innovation

Methods, ideas, or system contributions that make the work stand out.

AdaAct adjusts learning rates adaptively
Enhances neuron output stability
Bridges gap between Adam and SGD
Hyunseok Seung
Department of Statistics, University of Georgia
Jaewoo Lee
School of Computing, University of Georgia
Hyunsuk Ko
Associate Professor, School of Electrical Engineering, Hanyang University ERICA
Video Coding · Deep Learning · Computer Vision · Image Quality Assessment