Online Learning with Gradient-Variation Interval Regret

📅 2026-06-02

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work studies non-stationary online learning, where the environment changes over time, under the metric of interval regret. The authors propose a two-layer online ensemble method whose interval regret bound scales with gradient variation, so it adapts to several problem-dependent quantities over every time interval while retaining the minimax-optimal rate in the worst case. The main contributions are: the first interval regret bound that depends explicitly on gradient variation; a Lipschitz-adaptive meta-algorithm that yields a variant requiring no prior knowledge of the Lipschitz or smoothness constants; and extensions to interval dynamic regret and to stochastic extended adversarial optimization, for which the paper gives the first piecewise characterization. Experiments support the theoretical findings.
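For orientation, the two quantities named in the summary are usually written as follows. This is a sketch in conventional notation, not taken from the paper: the symbols $f_t$ (loss at round $t$), $x_t$ (the learner's decision), $\mathcal{X}$ (the feasible set), and the interval $[r,s]$ are assumptions, and the paper's exact definitions may differ.

```latex
% Interval regret on [r, s]: loss of the learner minus the loss of the
% best fixed decision for that interval.
\mathrm{Reg}_{[r,s]}
  = \sum_{t=r}^{s} f_t(x_t) - \min_{x \in \mathcal{X}} \sum_{t=r}^{s} f_t(x)

% Gradient variation on [r, s]: cumulative squared change between
% consecutive gradients, taken in the worst case over the feasible set.
V_{[r,s]}
  = \sum_{t=r+1}^{s} \sup_{x \in \mathcal{X}}
    \bigl\| \nabla f_t(x) - \nabla f_{t-1}(x) \bigr\|_2^2
```

A bound that scales with $V_{[r,s]}$ is small on any interval where consecutive loss functions have similar gradients, which is the sense in which such a bound adapts to the problem.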

📝 Abstract

This paper investigates non-stationary online learning using the metric of interval regret, which requires an online algorithm to perform well over every time interval. We propose the first online learning algorithm that achieves an interval regret bound scaling with gradient variation, a fundamental measure of the cumulative change in online function gradients, which relates to various problem-dependent quantities and is closely connected to stochastic optimization and other problems. Our method employs a simple and efficient two-layer online ensemble structure that achieves strong theoretical guarantees. Specifically, it enjoys a regret bound that simultaneously adapts to various problem-dependent quantities while also preserving the minimax-optimal rate in the worst case. Moreover, recognizing the challenge of hyperparameter tuning, we introduce a Lipschitz- and smoothness-agnostic variant that automatically adapts to these potentially unknown constants. This is primarily enabled by a novel Lipschitz-adaptive meta algorithm, which may be of independent interest. Beyond interval regret, our method also yields broader implications: it provides versatile bounds for interval dynamic regret, a stronger measure that competes with changing comparators over any interval, and yields the first piecewise characterization for stochastic extended adversarial optimization. Theoretical findings are validated by experiments.
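To make the terms in the abstract concrete, here is a minimal Python sketch of a generic two-layer ensemble on a toy problem. It is not the paper's algorithm: it assumes quadratic losses f_t(x) = 0.5 * ||x - c_t||^2 on the unit ball, uses plain projected online gradient descent as base learners and a Hedge (exponential weights) meta-learner, and every name in it (`run_ensemble`, `interval_regret`, `gradient_variation`, the step sizes) is hypothetical.

```python
import numpy as np

def project(x, radius=1.0):
    """Project x onto the Euclidean ball of the given radius."""
    norm = np.linalg.norm(x)
    return x if norm <= radius else x * (radius / norm)

def run_ensemble(targets, step_sizes, meta_lr=1.0):
    """Hedge meta-learner over OGD base learners (toy setup, assumed losses)."""
    T, d = targets.shape
    bases = np.zeros((len(step_sizes), d))  # one iterate per base learner
    weights = np.full(len(step_sizes), 1.0 / len(step_sizes))
    plays = np.zeros((T, d))
    for t, c in enumerate(targets):
        # Layer 2: play the weighted combination of the base decisions.
        plays[t] = weights @ bases
        # Meta update: exponential weights on each base learner's loss.
        base_losses = 0.5 * np.sum((bases - c) ** 2, axis=1)
        weights = weights * np.exp(-meta_lr * base_losses)
        weights /= weights.sum()
        # Layer 1: each base learner takes a projected gradient step.
        for i, eta in enumerate(step_sizes):
            bases[i] = project(bases[i] - eta * (bases[i] - c))
    return plays

def interval_regret(plays, targets, start, end):
    """Regret on rounds start..end-1 against the best fixed point in the ball."""
    c = targets[start:end]
    # For these quadratics the best fixed point is the projected mean target.
    best = project(c.mean(axis=0))
    learner_loss = 0.5 * np.sum((plays[start:end] - c) ** 2)
    best_loss = 0.5 * np.sum((best - c) ** 2)
    return float(learner_loss - best_loss)

def gradient_variation(targets):
    """Gradient variation; here grad f_t(x) = x - c_t, so it is sum ||c_t - c_{t-1}||^2."""
    return float(np.sum(np.diff(targets, axis=0) ** 2))

# Toy non-stationary stream: the target switches once, at round 100.
targets = np.vstack([np.tile([0.5, 0.0], (100, 1)),
                     np.tile([-0.5, 0.0], (100, 1))])
plays = run_ensemble(targets, step_sizes=[0.01, 0.1, 1.0])
regret_after_switch = interval_regret(plays, targets, 100, 200)
variation = gradient_variation(targets)
print("interval regret on rounds 100-199:", regret_after_switch)
print("gradient variation:", variation)
```

Running base learners with several step sizes and letting a meta-learner weight them is the general idea behind online ensembles; the paper's guarantees depend on its specific construction, which this sketch does not reproduce.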

Problem

Research questions and friction points this paper is trying to address.

online learning

non-stationary

interval regret

gradient variation

dynamic regret

Innovation

Methods, ideas, or system contributions that make the work stand out.

interval regret

gradient variation

online learning