🤖 AI Summary
In federated learning, resource-constrained edge devices and non-IID data exacerbate two critical challenges in Mixture-of-Experts (MoE) models: excessive expert storage overhead and severe load imbalance across clients. To address these, we propose a client-expert fitness scoring mechanism that jointly optimizes expert assignment and global load balancing. Our approach comprises three core components: (i) a lightweight fitness modeling framework driven by training feedback; (ii) a constraint-optimized expert allocation algorithm; and (iii) a local expert subset deployment strategy. This is the first work to simultaneously ensure personalized model capability and system-level load fairness in federated MoE, overcoming the limitations of conventional greedy assignment heuristics. Evaluated on three benchmark datasets, our method achieves an average accuracy improvement of 2.7%, reduces expert utilization variance by 63%, and significantly enhances convergence stability and efficiency under heterogeneous device conditions.
📝 Abstract
Mixture-of-Experts (MoE) models enable scalable neural networks through conditional computation. However, their deployment in federated learning (FL) faces two critical challenges: 1) resource-constrained edge devices cannot store full expert sets, and 2) non-IID data distributions cause severe expert load imbalance that degrades model performance. To this end, we propose FLEX-MoE, a novel federated MoE framework that jointly optimizes expert assignment and load balancing under limited client capacity. Specifically, our approach introduces client-expert fitness scores that quantify each expert's suitability for a client's local dataset based on training feedback, and employs an optimization-based algorithm that maximizes client-expert specialization while enforcing balanced expert utilization system-wide. Unlike existing greedy methods, which focus solely on personalization while ignoring load imbalance, FLEX-MoE directly addresses the expert-utilization skew that is particularly severe in FL settings with heterogeneous data. Comprehensive experiments on three datasets demonstrate the superior performance of FLEX-MoE, together with its ability to maintain balanced expert utilization across diverse resource-constrained scenarios.
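To make the assignment idea concrete, here is a minimal sketch of capacity-constrained, load-balanced expert allocation. It assumes a precomputed `fitness[c][e]` score matrix and uses a simple global-ranking heuristic with a per-expert load cap; the paper's actual optimization procedure is not specified here, so the function name, the cap formula, and the heuristic itself are illustrative assumptions, not FLEX-MoE's algorithm.

```python
import math
from collections import defaultdict

def assign_experts(fitness, k, num_experts):
    """Assign each client up to k experts (its storage budget), capping each
    expert's load so system-wide utilization stays balanced.

    fitness: list of per-client score lists; fitness[c][e] is a hypothetical
             client-expert fitness score derived from training feedback.
    """
    num_clients = len(fitness)
    # Balanced-load cap: spread the total number of slots evenly over experts.
    cap = math.ceil(num_clients * k / num_experts)
    # Consider all (client, expert) pairs in order of decreasing fitness.
    pairs = sorted(
        ((fitness[c][e], c, e)
         for c in range(num_clients) for e in range(num_experts)),
        reverse=True,
    )
    load = defaultdict(int)      # how many clients each expert serves
    assigned = defaultdict(list)  # expert subset deployed on each client
    for score, c, e in pairs:
        if len(assigned[c]) < k and load[e] < cap:
            assigned[c].append(e)
            load[e] += 1
    return dict(assigned), dict(load)
```

With 4 clients, 2 experts, and k=1, a purely greedy per-client choice could route every client to the same popular expert; the load cap instead forces the two least-attached clients onto the second expert, trading a little per-client fitness for balanced utilization.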