🤖 AI Summary
Traditional end-to-end deep learning delivers strong results but generalizes poorly and carries high computational and data-annotation costs, whereas classical second-order methods (e.g., SR1) are lightweight and data-efficient but converge slowly and lack data adaptivity. This paper introduces LSR1, a learned second-order optimizer featuring a trainable preconditioning unit that generates data-driven descent directions and constructs positive semi-definite rank-one correction matrices aligned with the secant condition via a learned projection. Trained in a self-supervised manner without labeled data, LSR1 preserves key properties of the classical SR1 update, supporting both convergence behavior and generalization. On analytic test problems and monocular human mesh recovery, LSR1 outperforms existing learned optimization-based approaches: it converges faster, uses a lightweight model, generalizes robustly to unseen scenarios, and operates effectively without task-specific fine-tuning.
📝 Abstract
End-to-end deep learning has achieved impressive results but remains limited by its reliance on large labeled datasets, poor generalization to unseen scenarios, and growing computational demands. In contrast, classical optimization methods are data-efficient and lightweight but often suffer from slow convergence. While learned optimizers offer a promising fusion of both worlds, most focus on first-order methods, leaving learned second-order approaches largely unexplored.
We propose a novel learned second-order optimizer that introduces a trainable preconditioning unit to enhance the classical Symmetric-Rank-One (SR1) algorithm. This unit generates data-driven vectors used to construct positive semi-definite rank-one matrices, aligned with the secant constraint via a learned projection. Our method is evaluated through analytic experiments and on the real-world task of Monocular Human Mesh Recovery (HMR), where it outperforms existing learned optimization-based approaches. Featuring a lightweight model and requiring no annotated data or fine-tuning, our approach offers strong generalization and is well-suited for integration into broader optimization-based frameworks.
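For context, the classical SR1 update that the method builds on replaces the current Hessian approximation with a symmetric rank-one correction chosen so that the secant condition B_{k+1} s_k = y_k holds. A minimal NumPy sketch of that baseline update follows; the function name, the skip-safeguard threshold, and the toy quadratic are illustrative choices, not from the paper. Note that the classical update is not guaranteed to stay positive semi-definite, which is one motivation for the learned PSD rank-one construction described in the abstract.

```python
import numpy as np

def sr1_update(B, s, y, tol=1e-8):
    """Classical SR1 update: B_new = B + (r r^T) / (r^T s), with r = y - B s.

    The update is skipped when the denominator is near zero (the standard
    SR1 safeguard). An accepted update satisfies the secant condition
    B_new @ s == y, and the correction is rank-one and symmetric.
    """
    r = y - B @ s
    denom = r @ s
    if abs(denom) < tol * np.linalg.norm(r) * np.linalg.norm(s):
        return B  # skip the update rather than divide by a tiny number
    return B + np.outer(r, r) / denom

# Toy quadratic f(x) = 0.5 x^T A x, so curvature pairs satisfy y = A s.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
B = np.eye(2)                 # initial Hessian approximation
s = np.array([1.0, 0.0])      # step
y = A @ s                     # gradient difference along the step
B = sr1_update(B, s, y)       # B now satisfies B @ s == y
```

Unlike BFGS, the SR1 denominator can be negative or vanish, so the correction need not preserve positive definiteness; the learned projection in the proposed method addresses this by construction.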