🤖 AI Summary
This paper studies nonconvex–PL minimax optimization in federated learning under heavy-tailed gradient noise, i.e., gradients whose distribution is non-Gaussian with unbounded variance. To overcome the limitation of conventional algorithms that rely on bounded-variance assumptions, the authors propose two robust federated minimax algorithms with provable convergence guarantees under heavy tails: Fed-NSGDA-M, which employs normalized stochastic gradients, and FedMuon-DA, which integrates the Muon optimizer. Both methods incorporate robust local updates and global aggregation mechanisms to mitigate the impact of heavy-tailed noise. The analysis establishes a convergence rate of $O\big((TNp)^{-\frac{s-1}{2s}}\big)$, where $s>1$ is the tail index characterizing the gradient distribution, significantly generalizing existing convergence analyses for federated minimax optimization. Extensive experiments demonstrate that both algorithms achieve superior robustness and stability over baseline methods under heavy-tailed noise.
📝 Abstract
Heavy-tailed noise has attracted growing attention in nonconvex stochastic optimization, as numerous empirical studies suggest it is a more realistic model than the standard bounded-variance assumption. In this work, we investigate nonconvex-PL minimax optimization under heavy-tailed gradient noise in federated learning. We propose two novel algorithms: Fed-NSGDA-M, which integrates normalized gradients, and FedMuon-DA, which leverages the Muon optimizer for local updates. Both algorithms are designed to effectively address heavy-tailed noise in federated minimax optimization under a milder noise condition. We theoretically establish that both algorithms achieve a convergence rate of $O\big(1/(TNp)^{\frac{s-1}{2s}}\big)$. To the best of our knowledge, these are the first federated minimax optimization algorithms with rigorous theoretical guarantees under heavy-tailed noise. Extensive experiments further validate their effectiveness.
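The normalized-gradient mechanism behind Fed-NSGDA-M can be illustrated with a minimal single-machine sketch. Everything here is an assumption for illustration only: the toy quadratic minimax objective, the Student-t noise (a stand-in for heavy-tailed gradient noise with infinite variance), and the scalar normalization are not the paper's actual algorithm, which runs federated local updates with global aggregation.

```python
import numpy as np

rng = np.random.default_rng(0)

def normalized_step(w, g, eta):
    """One normalized gradient step (scalar case): the step direction has
    unit norm, so a single heavy-tailed gradient sample can move the
    iterate by at most eta."""
    return w - eta * g / (abs(g) + 1e-12)

# Hypothetical toy minimax objective (not from the paper):
#   f(x, y) = 0.5*x**2 + x*y - 0.5*y**2, minimized over x, maximized over y.
def grads(x, y):
    return x + y, x - y  # (df/dx, df/dy)

x, y = 3.0, -2.0
eta = 0.05
for _ in range(2000):
    gx, gy = grads(x, y)
    # Student-t noise with 1.5 degrees of freedom has infinite variance,
    # mimicking heavy-tailed gradient noise with tail index s < 1.5.
    gx += rng.standard_t(1.5)
    gy += rng.standard_t(1.5)
    x = normalized_step(x, gx, eta)   # descent on the min variable
    y = normalized_step(y, -gy, eta)  # ascent on the max variable
```

The key contrast: under the same noise, a plain (unnormalized) stochastic gradient step can jump arbitrarily far on a single outlier sample, whereas each normalized step is bounded by the learning rate, which is what makes convergence analysis possible without a bounded-variance assumption.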