🤖 AI Summary
This paper addresses the slow convergence and poor generalization in federated learning (FL) caused by statistical and system heterogeneity. The authors propose the first implicit zeroth-order bilevel optimization framework for heterogeneous FL that is gradient-free and does not require a bounded gradient-dissimilarity assumption. The method formulates heterogeneous FL as a stochastic zeroth-order bilevel optimization problem: the upper level optimizes the global model, supporting server-side pretraining and non-standard aggregation, while the lower level models personalized local training, accommodating heterogeneous numbers of local steps and constraint-aware updates. Theoretically, the paper establishes the first non-asymptotic convergence rate and an almost-sure asymptotic convergence guarantee for such a framework. Empirically, the method significantly outperforms state-of-the-art heterogeneous FL approaches on image classification tasks, demonstrating strong robustness to both data distribution shifts and system-level delays.
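The bilevel structure described above can be sketched in a generic form (the notation here is illustrative and is not taken from the paper itself):

```latex
\min_{x \in \mathbb{R}^d} \; F(x) := \frac{1}{m} \sum_{i=1}^{m} f_i\!\left(x,\, y_i^{*}(x)\right)
\quad \text{s.t.} \quad
y_i^{*}(x) \in \arg\min_{y_i \in \mathcal{Y}_i} \; g_i(x, y_i),
\qquad i = 1, \dots, m,
```

where $x$ is the server's global model, $y_i^{*}(x)$ is client $i$'s personalized model obtained from local training over its constraint set $\mathcal{Y}_i$, and $f_i$, $g_i$ are upper- and lower-level objectives. In this reading, non-standard aggregation and pretraining live in the upper-level map $F$, while heterogeneous local steps and client constraints are absorbed into the lower-level problems.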
📝 Abstract
Heterogeneity in federated learning (FL) is a critical and challenging aspect that significantly impacts model performance and convergence. In this paper, we propose a novel framework by formulating heterogeneous FL as a hierarchical optimization problem. This new framework captures both the local and global training processes through a bilevel formulation and is capable of the following: (i) addressing client heterogeneity through a personalized learning framework; (ii) capturing the pre-training process on the server side; (iii) updating the global model through nonstandard aggregation; (iv) allowing for nonidentical local steps; and (v) capturing clients' local constraints. We design and analyze an implicit zeroth-order FL method (ZO-HFL), with nonasymptotic convergence guarantees for both the server-agent and the individual client-agents, as well as asymptotic guarantees for both in an almost-sure sense. Notably, our method does not rely on standard assumptions in heterogeneous FL, such as the bounded gradient dissimilarity condition. We implement our method on image classification tasks and compare it with other methods under different heterogeneous settings.
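The core primitive behind any zeroth-order method like ZO-HFL is estimating a gradient from function evaluations alone. The sketch below shows a standard two-point random-direction estimator; it illustrates the general zeroth-order idea only, and the function name, sampling scheme, and parameters are this sketch's own choices, not the paper's algorithm:

```python
import numpy as np

def zo_gradient(f, x, mu=1e-4, num_samples=20, rng=None):
    """Two-point zeroth-order estimate of the gradient of f at x.

    Averages directional finite differences along random Gaussian
    directions u: (f(x + mu*u) - f(x - mu*u)) / (2*mu) * u.
    Only function values are needed; no analytic gradient of f.
    """
    rng = np.random.default_rng(rng)
    d = x.shape[0]
    grad = np.zeros(d)
    for _ in range(num_samples):
        u = rng.standard_normal(d)          # random search direction
        diff = (f(x + mu * u) - f(x - mu * u)) / (2 * mu)
        grad += diff * u
    return grad / num_samples
```

On a smooth objective the estimate concentrates around the true gradient as `num_samples` grows, which is what lets a zeroth-order scheme drive both the lower-level local updates and the upper-level server updates without exchanging or bounding gradients.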