Dynamic Regret via Discounted-to-Dynamic Reduction with Applications to Curved Losses and Adam Optimizer

📅 2026-02-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses dynamic regret minimization in non-stationary online learning, where existing Follow-the-Regularized-Leader (FTRL) analyses lack dynamic regret guarantees for curved losses, such as logistic regression, and for Adam-type optimizers. The authors propose a general "discounted-to-dynamic" reduction framework that enables modular derivation of dynamic regret bounds for FTRL-style algorithms. This approach unifies and simplifies analyses in dynamic environments, yielding the first dynamic regret guarantee for online logistic regression, recovering the optimal dynamic regret bound for online linear regression, and extending to an Adam variant with two discount parameters. Notably, in stochastic, non-convex, and non-smooth settings, the proposed method is shown to attain the optimal convergence rate.
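To make the "discounting" idea concrete, here is a minimal sketch of discounted FTRL with linear losses and an L2 regularizer. This is not the paper's algorithm; the discount factor `gamma`, step size `eta`, and closed-form update are illustrative assumptions, chosen because they show how geometrically down-weighting old gradients lets the iterate track a drifting environment.

```python
import numpy as np

def discounted_ftrl(grads, gamma=0.9, eta=0.5):
    """Illustrative discounted FTRL with an L2 regularizer.

    The iterate minimizes the gamma-discounted sum of past linear
    losses plus (1 / (2 * eta)) * ||x||^2, which has the closed form
        x_{t+1} = -eta * sum_s gamma^(t - s) * g_s.
    Older gradients are geometrically down-weighted, which is what
    allows tracking of a non-stationary comparator.
    """
    x = np.zeros_like(grads[0], dtype=float)
    G = np.zeros_like(x)          # discounted gradient sum
    iterates = []
    for g in grads:
        G = gamma * G + g         # discount old gradients, add the new one
        x = -eta * G              # closed-form FTRL minimizer
        iterates.append(x.copy())
    return iterates
```

With a gradient stream that flips sign halfway through, the discounted iterate follows the change within a few rounds, whereas undiscounted FTRL (`gamma=1`) would average the two phases away.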

📝 Abstract
We study dynamic regret minimization in non-stationary online learning, with a primary focus on follow-the-regularized-leader (FTRL) methods. FTRL is important for curved losses and for understanding adaptive optimizers such as Adam, yet dynamic regret analyses for FTRL remain less explored. To address this, we build on the discounted-to-dynamic reduction and present a modular way to obtain dynamic regret bounds for FTRL-related problems. Specifically, we focus on two representative curved losses: linear regression and logistic regression. Our method not only simplifies existing proofs for the optimal dynamic regret of online linear regression, but also yields new dynamic regret guarantees for online logistic regression. Beyond online convex optimization, we apply the reduction to analyze the Adam optimizer, obtaining optimal convergence rates in stochastic, non-convex, and non-smooth settings. The reduction also enables a more detailed treatment of Adam with two discount parameters $(\beta_1,\beta_2)$, leading to new results for both clipped and clip-free variants of the Adam optimizer.
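The two discount parameters $(\beta_1,\beta_2)$ the abstract refers to are the standard Adam exponential-decay rates: $\beta_1$ discounts the first-moment (momentum) estimate and $\beta_2$ the second-moment estimate. A minimal sketch of one standard Adam step, following Kingma and Ba's update with bias correction (this shows the role of the two discounts, not the paper's analyzed variant; step size and test problem below are illustrative):

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
    """One standard Adam update (t is the 1-indexed step count).

    beta1 geometrically discounts the first-moment (momentum) estimate m;
    beta2 geometrically discounts the second-moment estimate v. These are
    the two discount parameters (beta1, beta2) treated separately by the
    paper's reduction.
    """
    m = beta1 * m + (1 - beta1) * grad          # discounted first moment
    v = beta2 * v + (1 - beta2) * grad ** 2     # discounted second moment
    m_hat = m / (1 - beta1 ** t)                # bias correction
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Usage: minimize the toy objective f(x) = x^2 starting from x = 1.
theta, m, v = 1.0, 0.0, 0.0
for t in range(1, 201):
    grad = 2 * theta                            # gradient of f(x) = x^2
    theta, m, v = adam_step(theta, grad, m, v, t, lr=0.1)
```

Because the effective step is roughly `lr * sign(grad)` early on, the iterate approaches the minimizer and then oscillates within a band on the order of the step size.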
Problem

Research questions and friction points this paper is trying to address.

dynamic regret
non-stationary online learning
FTRL
curved losses
Adam optimizer
Innovation

Methods, ideas, or system contributions that make the work stand out.

dynamic regret
FTRL
discounted-to-dynamic reduction
Adam optimizer
curved losses