🤖 AI Summary
This work addresses the high computational overhead in decentralized bilevel optimization caused by repeated evaluations of gradients, Jacobians, and Hessians. To reduce this burden, the authors propose Snapshot-SLDBO (S$^3$LDBO), a single-loop decentralized bilevel optimization algorithm with a snapshot mechanism that lets agents intermittently skip expensive derivative computations, adapting their local computational load while preserving collaborative learning. As the first framework to achieve efficient single-loop updates in decentralized bilevel settings, the method comes with iteration-complexity guarantees in the deterministic setting: an ergodic bound and a high-probability nonergodic bound. Experiments on hyperparameter optimization, data hyper-cleaning, and meta-learning show that the algorithm improves computational efficiency while maintaining learning performance competitive with existing approaches.
📝 Abstract
Networked AI systems increasingly rely on multiple agents that collaboratively learn and adapt models over communication networks. In such systems, bilevel formulations naturally arise in hyperparameter optimization, data cleaning, and meta-learning, but the repeated evaluation of gradients, Jacobians, and Hessians can impose a substantial computational burden on individual agents. To address this challenge, we propose Snapshot-SLDBO (S$^3$LDBO), an efficient single-loop decentralized bilevel optimization algorithm that enables agents to intermittently skip expensive derivative evaluations through a snapshot mechanism. This mechanism can be interpreted as an autonomous computation-adaptation strategy for networked AI, in which agents selectively perform costly local updates while maintaining global collaborative learning. We establish the ergodic iteration complexity and the high-probability nonergodic iteration complexity of the proposed algorithm in the deterministic setting. Experimental results on hyperparameter optimization with synthetic and MNIST datasets, data hyper-cleaning on Fashion-MNIST, and decentralized meta-learning on miniImageNet demonstrate that the proposed algorithm improves computational efficiency while maintaining competitive learning performance.
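The skip-and-reuse idea behind a snapshot mechanism can be illustrated with a toy sketch. This is not the S$^3$LDBO algorithm: it is a single-agent, single-level example in which a derivative is recomputed only with some probability and the cached value (the "snapshot") is reused otherwise. The class name `SnapshotOracle`, the Bernoulli refresh rule, the parameter `refresh_prob`, and the quadratic objective are all assumptions made for illustration; the abstract does not specify how snapshots are triggered.

```python
import random


class SnapshotOracle:
    """Hypothetical wrapper: caches the last derivative and refreshes it only occasionally."""

    def __init__(self, grad_fn, refresh_prob, seed=0):
        self.grad_fn = grad_fn            # the "expensive" derivative evaluation
        self.refresh_prob = refresh_prob  # assumed Bernoulli refresh probability
        self.rng = random.Random(seed)
        self.snapshot = None              # last computed derivative
        self.evaluations = 0              # number of true derivative evaluations

    def __call__(self, x):
        # Recompute on the first call, or when the coin flip says so;
        # otherwise skip the evaluation and reuse the stale snapshot.
        if self.snapshot is None or self.rng.random() < self.refresh_prob:
            self.snapshot = self.grad_fn(x)
            self.evaluations += 1
        return self.snapshot


def run(steps=200, step_size=0.1, refresh_prob=0.3):
    # Toy objective f(x) = (x - 3)^2 with gradient 2 (x - 3).
    oracle = SnapshotOracle(lambda x: 2.0 * (x - 3.0), refresh_prob)
    x = 0.0
    for _ in range(steps):
        x -= step_size * oracle(x)  # single-loop update using a possibly stale gradient
    return x, oracle.evaluations


if __name__ == "__main__":
    x_final, n_evals = run()
    print(f"x = {x_final:.6f}, gradient evaluations = {n_evals} of 200 steps")
```

In this toy setting the iterate still approaches the minimizer at 3 even though most steps reuse a stale gradient, which is the trade-off the abstract describes: fewer derivative evaluations per agent at the cost of working with outdated information. A decentralized bilevel version would additionally involve Jacobian and Hessian information and communication between agents, which this sketch omits.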