Learning Regularizers: Learning Optimizers that can Regularize

📅 2025-10-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
Conventional optimization relies on explicit regularization terms (e.g., SAM, GAM, GSAM) embedded in the loss function to improve generalization and stability—raising questions about whether such mechanisms can be implicitly internalized by learned optimizers (LOs). Method: We train LOs via meta-learning across MNIST, FMNIST, and CIFAR datasets using MLP and CNN architectures, deliberately omitting explicit regularization from the objective. Contribution/Results: We provide the first empirical evidence that LOs autonomously acquire and transfer regularization-like behavior, significantly enhancing generalization performance and convergence stability. Trained LOs consistently outperform unregularized baselines in test accuracy and generalize their implicit regularization effect to unseen tasks. This challenges the long-standing paradigm that regularization must be explicitly encoded in the loss function, offering a novel pathway toward understanding and engineering implicit regularization mechanisms in deep learning.
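The meta-learning setup described in the summary can be sketched in miniature: a learned optimizer reduced to two meta-parameters (a log learning rate and a momentum coefficient), meta-trained with an evolution-strategies gradient estimate on a toy quadratic optimizee. This is an illustrative sketch only, not the paper's method: the paper meta-trains neural-network LOs on MNIST/FMNIST/CIFAR, and all function names, the two-parameter update rule, and the ES meta-training choice here are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def inner_loss(w):
    # Toy optimizee: a quadratic standing in for a network's training loss.
    return 0.5 * np.sum(w ** 2)

def inner_grad(w):
    return w

def unroll(theta, w0, steps=20):
    """Run the learned optimizer for `steps` inner updates.
    theta = (log learning rate, momentum coefficient) are the
    meta-parameters of this (deliberately tiny) learned update rule."""
    lr, mom = np.exp(theta[0]), theta[1]
    w, v = w0.copy(), np.zeros_like(w0)
    for _ in range(steps):
        v = mom * v + inner_grad(w)
        w = w - lr * v
    return inner_loss(w)  # meta-loss: final optimizee loss, no explicit regularizer

def meta_train(theta, iters=200, sigma=0.05, meta_lr=0.05):
    """Meta-train theta with a simple antithetic evolution-strategies
    gradient estimate (one common choice when unrolled gradients are
    unstable); a fresh optimizee init is sampled every meta-iteration."""
    for _ in range(iters):
        w0 = rng.normal(size=5)
        eps = rng.normal(size=theta.shape)
        f_plus = unroll(theta + sigma * eps, w0)
        f_minus = unroll(theta - sigma * eps, w0)
        g = (f_plus - f_minus) / (2 * sigma) * eps  # ES gradient estimate
        theta = theta - meta_lr * g
        # Clip to a region where the inner dynamics stay stable
        # (a safeguard chosen for this toy, not from the paper).
        theta = np.clip(theta, [-4.0, 0.0], [0.5, 0.9])
    return theta

theta = meta_train(np.array([-3.0, 0.0]))  # start from a tiny learning rate
final = unroll(theta, rng.normal(size=5))
```

The meta-objective here is only the final optimizee loss; anything regularizer-like the optimizer acquires must be internalized in its meta-parameters, which mirrors the paper's framing at toy scale.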

📝 Abstract
Learned Optimizers (LOs), a form of meta-learning, have gained traction due to their ability to be parameterized and trained for efficient optimization. Traditional gradient-based methods incorporate explicit regularization techniques such as Sharpness-Aware Minimization (SAM), Gradient-norm Aware Minimization (GAM), and Gap-guided Sharpness-Aware Minimization (GSAM) to enhance generalization and convergence. In this work, we explore a fundamental question: Can regularizers be learned? We empirically demonstrate that LOs can be trained to learn and internalize the effects of traditional regularization techniques without explicitly applying them to the objective function. We validate this through extensive experiments on standard benchmarks (including MNIST, FMNIST, and CIFAR, with neural networks such as MLP, MLP-ReLU, and CNN), comparing LOs trained with and without access to explicit regularizers. Regularized LOs consistently outperform their unregularized counterparts in test accuracy and generalization. Furthermore, we show that LOs retain and transfer these regularization effects to new optimization tasks by inherently seeking minima similar to those targeted by these regularizers. Our results suggest that LOs can inherently learn regularization properties, challenging the conventional necessity of explicit optimizee loss regularization.
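For context on the sharpness-aware regularizers the abstract compares against, the core SAM update first perturbs the weights toward the locally worst-case (ascent) direction, then descends using the gradient taken at that perturbed point. A minimal sketch on a toy quadratic; the function names, the toy objective, and the hyperparameter values are illustrative assumptions, not from the paper:

```python
import numpy as np

def loss(w):
    # Toy quadratic objective standing in for a training loss.
    return 0.5 * np.sum(w ** 2)

def grad(w):
    return w  # gradient of the toy quadratic

def sam_step(w, lr=0.1, rho=0.05):
    """One Sharpness-Aware Minimization step:
    1) perturb weights by rho along the normalized gradient (ascent),
    2) descend using the gradient evaluated at the perturbed point."""
    g = grad(w)
    eps = rho * g / (np.linalg.norm(g) + 1e-12)  # worst-case perturbation
    g_sharp = grad(w + eps)                      # gradient at w + eps
    return w - lr * g_sharp

w = np.array([1.0, -2.0])
for _ in range(100):
    w = sam_step(w)
```

The paper's point is that an LO trained without any such step can end up steering toward the same kind of flat minima that SAM-style updates target explicitly.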
Problem

Research questions and friction points this paper is trying to address.

Learning optimizers that internalize regularization without explicit application
Demonstrating that LOs trained with access to regularizers outperform unregularized baselines on benchmarks
Transferring learned regularization effects to new optimization tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Learned Optimizers internalize regularization without explicit application
Regularized Learned Optimizers outperform unregularized ones in generalization
Learned Optimizers transfer regularization effects to new tasks
Suraj Kumar Sahoo
PhD, Indian Institute of Science
Graph Theory · Combinatorics · Algorithm Design · Computational Complexity
Narayanan C. Krishnan
Mehta Family School of Data Science and Artificial Intelligence, Department of Data Science, Indian Institute of Technology Palakkad, Kerala, India