🤖 AI Summary
In heterogeneous-data nonconvex composite federated learning, existing methods suffer from high communication overhead, severe client drift, and strong coupling between proximal operator computation and communication. Method: We propose a decoupled sparse-communication framework that separates local proximal operator computation from client-server communication, transmitting only a single $d$-dimensional vector per round. Building on distributed stochastic proximal gradient descent, we design an algorithm that, to our knowledge, is the first to achieve such decoupling in the nonconvex, nonsmooth composite setting. Contribution/Results: We establish sublinear convergence under general nonconvexity and linear convergence to a bounded residual under the proximal Polyak-Łojasiewicz (PL) condition. Experiments on synthetic and real-world datasets demonstrate that our method significantly outperforms state-of-the-art baselines, reducing communication cost, improving convergence stability, and effectively mitigating client drift.
📝 Abstract
We propose an innovative algorithm for non-convex composite federated learning that decouples the proximal operator evaluation from the communication between server and clients. Moreover, each client uses local updates to communicate less frequently with the server, sends only a single $d$-dimensional vector per communication round, and overcomes issues with client drift. In the analysis, challenges arise from the decoupling strategy and the local updates in the algorithm, as well as from the non-convex and non-smooth nature of the problem. We establish sublinear and linear convergence to a bounded residual error under general non-convexity and the proximal Polyak-Łojasiewicz inequality, respectively. In numerical experiments, we demonstrate the superiority of our algorithm over state-of-the-art methods on both synthetic and real datasets.
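To make the composite setting concrete, the sketch below illustrates the basic building blocks the abstract refers to: a proximal operator (here soft-thresholding for an $\ell_1$ regularizer, chosen purely as an example) and a client running several local proximal stochastic-gradient steps before sending a single $d$-dimensional update vector to the server. This is a hedged, minimal illustration of these generic ingredients, not the paper's actual algorithm; the function names, the $\ell_1$ choice, and the step-size values are all assumptions for demonstration.

```python
import numpy as np

def prox_l1(v, step, lam):
    # Proximal operator of lam * ||x||_1 with step size `step`
    # (soft-thresholding), a standard example of a nonsmooth prox.
    return np.sign(v) * np.maximum(np.abs(v) - step * lam, 0.0)

def local_prox_sgd_round(x, grad, step, lam, tau):
    # Run `tau` local proximal-SGD steps starting from the server model x,
    # then return the single d-dimensional difference vector that a client
    # would transmit in one communication round (illustrative only).
    x_local = x.copy()
    for t in range(tau):
        g = grad(x_local, t)  # stochastic gradient of the smooth part
        x_local = prox_l1(x_local - step * g, step, lam)
    return x_local - x

# Toy usage: smooth part 0.5*||x - b||^2 plus 0.2*||x||_1 regularizer.
b = np.array([1.0, -2.0, 0.1])
grad = lambda x, t: x - b  # exact gradient stands in for a stochastic one
delta = local_prox_sgd_round(np.zeros(3), grad, step=0.5, lam=0.2, tau=10)
```

In this toy problem the local iterates contract toward the regularized solution `[0.8, -1.8, 0.0]` (the soft-thresholded version of `b`), so `delta` is the compressed summary of ten local steps sent in one round, matching the "single $d$-dimensional vector per round" idea.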