GradualDiff-Fed: A Federated Learning Specialized Framework for Large Language Model

📅 2025-06-23
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the triple challenges of high communication overhead, large parameter scale, and stringent data privacy requirements in domain-specific fine-tuning of large language models (LLMs) under federated learning (FL), this paper proposes GradualDiff-Fed, a lightweight federated fine-tuning framework based on differential weight transmission. GradualDiff-Fed transmits only the weight deltas (ΔW) between local and global models, augmented with sparsification and quantization for compression, and employs a hybrid asynchronous-synchronous update strategy to mitigate system heterogeneity. Experiments across multiple domain-specific benchmarks (legal, medical, financial) show that GradualDiff-Fed reduces communication volume by 68%–89% while matching centralized fine-tuning performance (average accuracy gap <0.8%), and that it enables stable training of LLMs with up to 100 layers. This work provides a scalable, low-overhead, and privacy-preserving solution for efficient collaborative LLM fine-tuning in privacy-sensitive settings.
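The core delta-transmission idea can be sketched in a few lines: each client sends only a sparsified difference between its locally fine-tuned weights and the current global weights, and the server averages those deltas into the global model. This is an illustrative sketch only; the function names, the top-k magnitude sparsification rule, and plain averaging are assumptions for clarity, not the paper's actual implementation (which also includes quantization and a hybrid update schedule):

```python
import numpy as np

def client_delta(local_w, global_w, sparsity=0.9):
    """Compute the weight delta and keep only the largest-magnitude entries.

    `sparsity` is the fraction of entries zeroed out before transmission
    (an assumed parameterization, not the paper's API).
    """
    delta = local_w - global_w                        # ΔW = W_local - W_global
    k = max(1, int(delta.size * (1 - sparsity)))      # number of entries to keep
    thresh = np.sort(np.abs(delta).ravel())[-k]       # k-th largest magnitude
    mask = np.abs(delta) >= thresh
    return delta * mask                               # sparse delta sent upstream

def server_aggregate(global_w, deltas):
    """Average the clients' sparse deltas and apply them to the global model."""
    return global_w + np.mean(deltas, axis=0)

# Toy round with 3 clients on a 4x4 weight matrix.
rng = np.random.default_rng(0)
gw = rng.standard_normal((4, 4))
local_ws = [gw + 0.1 * rng.standard_normal((4, 4)) for _ in range(3)]
sparse = [client_delta(lw, gw, sparsity=0.5) for lw in local_ws]
new_gw = server_aggregate(gw, sparse)
```

Only the nonzero entries of each sparse delta need to be serialized, which is where the communication savings come from relative to shipping the full weight matrix every round.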

📝 Abstract
The rapid proliferation of large language models (LLMs) has created an unprecedented demand for fine-tuning models for specialized domains, such as medical science. While federated learning (FL) offers a decentralized and privacy-preserving approach to collaboratively fine-tune LLMs without sharing raw data, it presents significant challenges, particularly in performance and in managing large model sizes efficiently. In this paper, we introduce GradualDiff-Fed, an FL framework designed explicitly for LLMs and the challenge of handling their large parameter size. GradualDiff-Fed reduces communication costs by transmitting only the difference of model weights rather than the entire model during training rounds. This approach significantly improves scalability and communication efficiency, making it more feasible to fine-tune LLMs across distributed clients without compromising performance. Our evaluation demonstrates that GradualDiff-Fed achieves performance on par with centralized training while drastically reducing communication overhead. These results highlight the potential of GradualDiff-Fed as an efficient solution for fine-tuning large models from distributed data in privacy-preserving settings without compromising performance.
Problem

Research questions and friction points this paper is trying to address.

Efficient federated learning for large language models
Reducing communication costs in decentralized model training
Privacy-preserving fine-tuning without performance compromise
Innovation

Methods, ideas, or system contributions that make the work stand out.

Federated Learning for large language models
Transmits only weight differences rather than the full model each round
Maintains performance while reducing communication costs