🤖 AI Summary
This paper studies Online Convex Optimization (OCO) under delayed feedback, focusing on strongly convex and exp-concave loss functions. Existing regret bounds suffer a suboptimal $d_{\max} \ln T$ dependence on the maximum delay; to overcome this limitation, we propose the first delay-adaptive algorithms for this setting. For exp-concave losses, our method achieves the regret bound $\min\{d_{\max} n \ln T,\, \sqrt{d_{\mathrm{tot}}}\}$, where $n$ is the dimension and $d_{\mathrm{tot}}$ is the total delay; for strongly convex losses, it attains $\min\{\sigma_{\max} \ln T,\, \sqrt{d_{\mathrm{tot}}}\}$, where $\sigma_{\max}$ denotes the maximum number of missing observations. Methodologically, we extend the Follow-the-Regularized-Leader and Online Newton Step frameworks, and design a truncated Vovk-Azoury-Warmuth predictor to address unconstrained online linear regression under delays. Experiments demonstrate consistent and significant improvements over baselines across diverse delay patterns and loss structures.
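To make the delayed-feedback protocol concrete, below is a minimal Python sketch of the classical delayed online gradient descent baseline, which attains the $\sqrt{d_{\mathrm{tot}}}$ rate referenced above; the function name `delayed_ogd`, the step-size schedule, and the projection radius are illustrative assumptions, not the paper's delay-adaptive method.

```python
import numpy as np

def delayed_ogd(T, delays, grad_fn, dim, radius=1.0, scale=1.0):
    """Delayed-feedback OCO loop with an OGD baseline (sketch).

    The gradient of round s is revealed only at round s + delays[s];
    all newly arrived gradients are applied before the next prediction.
    A step size shrinking like 1/sqrt(t + total delay seen so far) is
    one standard tuning behind the O(sqrt(d_tot)) guarantee (constants
    and problem-dependent factors omitted).
    """
    x = np.zeros(dim)
    pending = {}   # arrival round -> gradients revealed at that round
    d_seen = 0     # running sum of delays encountered so far
    for t in range(T):
        eta = scale / np.sqrt(t + 1 + d_seen)
        for g in pending.pop(t, []):
            x = x - eta * g
            nrm = np.linalg.norm(x)
            if nrm > radius:          # project back onto the feasible ball
                x *= radius / nrm
        # play x_t = x; its gradient arrives after delays[t] more rounds
        g_t = grad_fn(t, x)
        pending.setdefault(t + int(delays[t]), []).append(g_t)
        d_seen += int(delays[t])
    return x
```

The same buffer-and-apply structure underlies the delay-adaptive FTRL and Online Newton Step variants summarized above; they differ in how the arrived feedback is aggregated and how the regularization is tuned.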
📝 Abstract
In this work, we study the online convex optimization problem with curved losses and delayed feedback. When losses are strongly convex, existing approaches obtain regret bounds of order $d_{\max} \ln T$, where $d_{\max}$ is the maximum delay and $T$ is the time horizon. However, in many cases, this guarantee can be much worse than the $\sqrt{d_{\mathrm{tot}}}$ bound obtained by a delayed version of online gradient descent, where $d_{\mathrm{tot}}$ is the total delay. We bridge this gap by proposing a variant of follow-the-regularized-leader that obtains regret of order $\min\{\sigma_{\max}\ln T, \sqrt{d_{\mathrm{tot}}}\}$, where $\sigma_{\max}$ is the maximum number of missing observations. We then consider exp-concave losses and extend the Online Newton Step algorithm to handle delays with adaptive learning-rate tuning, achieving regret $\min\{d_{\max} n\ln T, \sqrt{d_{\mathrm{tot}}}\}$, where $n$ is the dimension. To our knowledge, this is the first algorithm to achieve such a regret bound for exp-concave losses. We further consider the problem of unconstrained online linear regression and achieve a similar guarantee by designing a variant of the Vovk-Azoury-Warmuth forecaster with a clipping trick. Finally, we implement our algorithms and conduct experiments under various types of delay and losses, showing improved performance over existing methods.
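As an illustration of the clipping trick for unconstrained online linear regression with delayed labels, here is a minimal sketch of a Vovk-Azoury-Warmuth-style forecaster; the conventions (features visible at prediction time, labels delayed) and the names `vaw_clipped` and `y_max` are assumptions for exposition, not the paper's exact construction.

```python
import numpy as np

def vaw_clipped(features, labels, delays, lam=1.0, y_max=1.0):
    """Vovk-Azoury-Warmuth forecaster with clipped predictions (sketch).

    The feature x_t is seen before predicting, so the covariance A stays
    current, while the label y_s arrives only at round s + delays[s], so
    the vector b lags behind. Predictions are clipped to [-y_max, y_max],
    a truncation that keeps them bounded despite the unconstrained domain.
    """
    T, dim = features.shape
    A = lam * np.eye(dim)   # lam*I + sum of x_s x_s^T over seen features
    b = np.zeros(dim)       # sum of y_s x_s over labels that have arrived
    pending = {}            # arrival round -> list of (x_s, y_s) pairs
    preds = np.empty(T)
    for t in range(T):
        for x_s, y_s in pending.pop(t, []):
            b += y_s * x_s                      # fold in delayed labels
        x_t = features[t]
        A += np.outer(x_t, x_t)                 # VAW adds x_t before seeing y_t
        preds[t] = np.clip(x_t @ np.linalg.solve(A, b), -y_max, y_max)
        pending.setdefault(t + int(delays[t]), []).append((x_t, labels[t]))
    return preds
```

For instance, `vaw_clipped(X, y, rng.integers(0, 5, size=len(y)))` with `rng = np.random.default_rng(0)` runs the forecaster under uniformly random delays.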