🤖 AI Summary
This paper studies decentralized smoothed online convex optimization (SOCO) in multi-agent systems, aiming to minimize the global cumulative cost, which comprises individual agent hitting costs, action switching penalties, and inter-agent behavioral discrepancy regularization. The setting features a dynamic, time-varying communication graph in which agents access only local neighbor information. Under strong convexity and smoothness assumptions, the authors propose ACORD, the first truly decentralized algorithm for this setting. Its theoretical contributions include: (i) asymptotic optimality guarantees; (ii) a competitive-ratio gap that diminishes with the time horizon; (iii) computational complexity that depends only logarithmically on the number of agents; and (iv) provable guarantees under arbitrary time-varying topologies. The analysis shows that ACORD strictly improves upon the prior centralized LPC algorithm while incurring significantly lower computational overhead. Numerical experiments across diverse graph structures validate its convergence, robustness to topology changes, and scalability.
📝 Abstract
We study the multi-agent Smoothed Online Convex Optimization (SOCO) problem, where $N$ agents interact through a communication graph. In each round, each agent $i$ receives a strongly convex hitting cost function $f^i_t$ in an online fashion and selects an action $x^i_t \in \mathbb{R}^d$. The objective is to minimize the global cumulative cost, which includes the sum of individual hitting costs $f^i_t(x^i_t)$, a temporal "switching cost" for changing decisions, and a spatial "dissimilarity cost" that penalizes deviations in decisions among neighboring agents. We propose ACORD, the first truly decentralized algorithm for multi-agent SOCO that provably exhibits asymptotic optimality. Our approach allows each agent to operate using only local information from its immediate neighbors in the graph. For finite-time performance, we establish that the optimality gap in the competitive ratio decreases with the time horizon $T$ and can be conveniently tuned based on the per-round computation available to each agent. Our algorithm benefits from a provably scalable computational complexity that depends only logarithmically on the number of agents and almost linearly on their degree within the graph. Moreover, our results hold even when the communication graph changes arbitrarily and adaptively over time. Finally, by virtue of its asymptotic optimality, ACORD is shown to be provably superior to the state-of-the-art LPC algorithm, while exhibiting much lower computational complexity. Extensive numerical experiments across various network topologies further corroborate our theoretical claims.
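To make the objective concrete, the global cumulative cost described above can be sketched as follows. This is a hedged reconstruction from the abstract's verbal description: the squared Euclidean penalties and the edge set $E_t$ of the time-varying graph are illustrative assumptions, not the paper's exact formulation.

$$
\min_{\{x^i_t\}} \; \sum_{t=1}^{T} \sum_{i=1}^{N} f^i_t(x^i_t)
\;+\; \sum_{t=1}^{T} \sum_{i=1}^{N} \frac{1}{2}\bigl\|x^i_t - x^i_{t-1}\bigr\|^2
\;+\; \sum_{t=1}^{T} \sum_{(i,j) \in E_t} \frac{1}{2}\bigl\|x^i_t - x^j_t\bigr\|^2
$$

Here the first term is the sum of hitting costs, the second is the temporal switching cost (penalizing each agent's change of action between rounds), and the third is the spatial dissimilarity cost over the edges $E_t$ of the communication graph at round $t$.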