Boosting Multimodal Federated Learning via Chained Modality Optimization

📅 2026-06-01

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the suboptimal performance of global models in multimodal federated learning caused by modality competition, in which dominant modalities suppress weaker ones. To mitigate this, the authors propose FedMChain, a framework that reformulates training as a chained modality optimization process. Local training on multimodal clients is split into modality-wise phases to alleviate inter-modality conflicts, and an error-compensated regularizer is introduced to promote cross-modal complementarity. On the server side, a sparse sign-guided aggregation strategy uses directional sign agreement to make intra-modality aggregation more robust and supports less frequent synchronization, which reduces communication overhead. Experiments on multiple multimodal benchmarks show that FedMChain consistently improves predictive performance over baselines while communicating less frequently.
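The phase-wise idea can be illustrated with a toy sketch. This is not the authors' implementation: the linear model, the squared-error loss, the function names, and the hyperparameters are all assumptions made for illustration, and the paper's error-compensated regularizer is not reproduced. The sketch only shows the "chain" structure, where each modality gets its own optimization window while the others stay frozen, so a later modality ends up fitting what the earlier ones left unexplained.

```python
from typing import Dict, List, Sequence

# Hypothetical toy setup: each sample maps a modality name to one scalar
# feature, plus a target stored under the key "y".
Sample = Dict[str, float]


def predict(weights: Dict[str, float], sample: Sample,
            modalities: Sequence[str]) -> float:
    """Fused prediction: the sum of the per-modality linear outputs."""
    return sum(weights[m] * sample[m] for m in modalities)


def chained_local_training(data: List[Sample], modalities: Sequence[str],
                           weights: Dict[str, float], lr: float = 0.2,
                           steps_per_phase: int = 50) -> Dict[str, float]:
    """Run one phase per modality, in the given order.

    Within a phase only the current modality's weight is updated; every
    other weight is frozen. The loss is the mean squared error of the
    fused prediction.
    """
    weights = dict(weights)  # work on a copy, leave the caller's dict intact
    for current in modalities:
        for _ in range(steps_per_phase):
            grad = 0.0
            for sample in data:
                err = predict(weights, sample, modalities) - sample["y"]
                grad += 2.0 * err * sample[current]
            weights[current] -= lr * grad / len(data)
    return weights


# Toy data generated from y = 2 * audio + 3 * image.
toy_data = [
    {"audio": 1.0, "image": 0.0, "y": 2.0},
    {"audio": 0.0, "image": 1.0, "y": 3.0},
    {"audio": 1.0, "image": 1.0, "y": 5.0},
    {"audio": -1.0, "image": 1.0, "y": 1.0},
]
initial = {"audio": 0.0, "image": 0.0}
trained = chained_local_training(toy_data, ["audio", "image"], initial)
```

In this toy the two features are uncorrelated, so training them one after the other recovers both coefficients. With correlated modalities the order of the chain would matter, which is presumably where a regularizer such as the one the paper describes comes in.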

📝 Abstract

Multimodal Federated Learning (MMFL) enables privacy-preserving collaborative learning across decentralized clients with heterogeneous data and modality availability. However, most existing MMFL methods cast multimodal training as a joint optimization problem, overlooking a key bottleneck: modality competition, where dominant modalities suppress weaker ones and lead to suboptimal global models. To address this, we propose FedMChain, a balanced MMFL framework that structures federated multimodal training as a chain of modality-wise phases. This phase-wise design gives each modality a dedicated local optimization window on multimodal clients to mitigate modality competition, and further promotes cross-modal complementarity via an error-compensated regularizer. On the server side, we employ a sparse sign-guided aggregation strategy that leverages directional sign agreement for robust intra-modality aggregation, avoids destructive averaging, and supports less frequent synchronization to reduce communication overhead. Extensive experiments on multimodal benchmarks demonstrate that FedMChain consistently improves predictive performance while requiring less frequent communication than baselines.
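As a rough illustration of what aggregation guided by sign agreement could look like, here is a minimal sketch. The abstract does not give the actual rule, so everything here is an assumption: the function name `sign_guided_aggregate`, the `agreement` threshold, and the choice to average only the clients on the majority side are hypothetical. The sketch keeps a coordinate only when enough clients agree on its direction and zeroes the rest, which avoids averaging opposing updates into a near-zero value and yields a sparse result.

```python
from typing import List


def sign_guided_aggregate(updates: List[List[float]],
                          agreement: float = 0.6) -> List[float]:
    """Aggregate client updates coordinate by coordinate.

    A coordinate is kept only if the fraction of clients sharing the
    majority sign is at least `agreement`. The kept value is the mean over
    the agreeing clients; all other coordinates are set to zero.
    """
    if not updates:
        raise ValueError("need at least one client update")
    n_clients = len(updates)
    dim = len(updates[0])
    aggregated = [0.0] * dim
    for j in range(dim):
        column = [update[j] for update in updates]
        positive = [v for v in column if v > 0]
        negative = [v for v in column if v < 0]
        majority = positive if len(positive) >= len(negative) else negative
        if majority and len(majority) / n_clients >= agreement:
            aggregated[j] = sum(majority) / len(majority)
    return aggregated


# Three hypothetical clients, three coordinates of one modality's update.
client_updates = [
    [1.0, -1.0, 0.5],
    [3.0, 1.0, 0.5],
    [2.0, -0.5, -0.5],
]
loose = sign_guided_aggregate(client_updates, agreement=0.6)
strict = sign_guided_aggregate(client_updates, agreement=1.0)
```

With the looser threshold every coordinate has a two-thirds majority and is kept; with `agreement=1.0` only the first coordinate, where all clients point the same way, survives. A plain mean would instead shrink the second coordinate toward zero by mixing opposing signs.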

Problem

Research questions and friction points this paper is trying to address.

Multimodal Federated Learning

Modality Competition

Heterogeneous Data

Privacy-Preserving Learning

Global Model Optimization

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multimodal Federated Learning

Modality Competition

Chained Optimization