🤖 AI Summary
To address inefficient training and excessive communication overhead when deploying large language models (LLMs) on resource-heterogeneous edge clients, this paper proposes HeteroTune, a lightweight fine-tuning framework for model-heterogeneity federated learning. At its core is FedAdapter, a novel parameter-efficient fine-tuning (PEFT) structure whose multi-branch architecture enables scalable, structure-aware local fine-tuning across diverse model sizes and architectures, and whose cross-model aggregator enables effective knowledge fusion among heterogeneous models. Evaluated on multiple CV and NLP benchmark tasks, the method achieves state-of-the-art performance while substantially reducing communication costs (by 42% on average) and local computational overhead (38% fewer FLOPs) without compromising accuracy. The framework thus establishes a scalable, highly compatible paradigm for federated LLM training on resource-constrained edge devices.
📝 Abstract
As demand grows for complex tasks and high-performance applications in edge computing, deploying large models in federated learning has become increasingly urgent, given their superior representational power and generalization capabilities. However, client resource constraints and heterogeneity pose significant challenges to this deployment. To tackle these challenges, we introduce HeteroTune, an innovative fine-tuning framework tailored for model-heterogeneity federated learning (MHFL). In particular, we propose a novel parameter-efficient fine-tuning (PEFT) structure, called FedAdapter, which employs a multi-branch cross-model aggregator to enable efficient knowledge aggregation across diverse models. Benefiting from the lightweight FedAdapter, our approach significantly reduces both computational and communication overhead. Moreover, our approach is simple yet effective, making it applicable to a wide range of large-model fine-tuning tasks. Extensive experiments on computer vision (CV) and natural language processing (NLP) tasks demonstrate that our method achieves state-of-the-art results, seamlessly integrating efficiency and performance.
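The abstract does not spell out the adapter's internals, but the general shape of a multi-branch PEFT adapter can be sketched as follows. This is a minimal NumPy illustration under our own assumptions; the branch count, bottleneck dimensions, and the simple mean used to stand in for the cross-model aggregator are hypothetical, not the authors' actual design.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

class MultiBranchAdapter:
    """Hypothetical sketch: several low-rank down/up-projection branches
    (e.g., one per client model scale) whose outputs are fused and added
    back to the frozen backbone's hidden states via a residual connection."""

    def __init__(self, d_model, bottlenecks, seed=0):
        rng = np.random.default_rng(seed)
        # One (down, up) projection pair per branch; only these small
        # matrices would be trained and communicated, not the backbone.
        self.branches = [
            (rng.standard_normal((d_model, r)) * 0.02,
             rng.standard_normal((r, d_model)) * 0.02)
            for r in bottlenecks
        ]

    def __call__(self, h):
        # Fuse branch outputs (a plain mean stands in for the paper's
        # cross-model aggregator) and add the residual connection.
        out = sum(relu(h @ down) @ up for down, up in self.branches)
        return h + out / len(self.branches)

adapter = MultiBranchAdapter(d_model=16, bottlenecks=[2, 4, 8])
h = np.ones((3, 16))          # a batch of 3 hidden states
y = adapter(h)
print(y.shape)                # (3, 16): output keeps the backbone's shape
```

Because only the small branch matrices are updated and exchanged, a structure like this would account for the reduced communication and compute overhead the summary reports.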