Efficient Zero-Order Federated Finetuning of Language Models for Resource-Constrained Devices

📅 2025-02-14

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address the memory, communication, and computational bottlenecks of federated fine-tuning of large language models (LLMs) on resource-constrained edge devices, this paper proposes Federated Split-Perturbation Zero-order Optimization (FedSPZO). FedSPZO is the first method to adaptively allocate zero-order gradient-estimation perturbations per network module based on architectural characteristics, combining task alignment with a block-wise perturbation strategy. It accelerates convergence while keeping memory overhead at inference level. Compared to state-of-the-art zero-order federated methods, FedSPZO reduces computational cost by 2.5–7×, enabling efficient and privacy-preserving LLM fine-tuning on low-power edge devices. This work narrows the gap between stringent resource constraints and high training performance in edge-based federated learning.

📝 Abstract

Federated fine-tuning offers a promising approach for tuning Large Language Models (LLMs) on edge devices while preserving data privacy. However, fine-tuning these models on edge devices remains challenging due to high memory, communication, and computational demands. Zero-order optimization with task alignment provides a potential solution: it enables fine-tuning with inference-level memory requirements, but it takes longer to converge. In this paper, we propose Federated Split-Perturbation Zero-order Optimization (FedSPZO), which divides the network into two blocks and applies a different number of perturbations per block in a computationally efficient way, achieving faster convergence. Our evaluation shows a $2.5$–$7\times$ reduction in computation overhead compared to state-of-the-art zero-order techniques in federated learning.
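The idea of giving each block its own number of perturbations can be illustrated with a minimal sketch. This is not the paper's FedSPZO algorithm: it is a generic two-point zero-order estimator applied to a toy two-block model on a single client, and every name in it (`zo_grad`, `block1`, `block2_loss`, `split_zo_step`, `n1`, `n2`) is hypothetical. It assumes that caching the first block's output lets second-block perturbations skip the first block's forward pass, which is one plausible way such a split could save computation.

```python
import numpy as np

def zo_grad(loss_fn, theta, n_perturb, eps=1e-3, rng=None):
    """Two-point zero-order gradient estimate averaged over n_perturb random directions."""
    rng = np.random.default_rng() if rng is None else rng
    grad = np.zeros_like(theta)
    for _ in range(n_perturb):
        z = rng.standard_normal(theta.shape)
        # Directional derivative from two forward passes; no backpropagation.
        delta = (loss_fn(theta + eps * z) - loss_fn(theta - eps * z)) / (2 * eps)
        grad += delta * z
    return grad / n_perturb

def block1(w1, x):
    # Toy first block (assumption): linear layer followed by tanh.
    return np.tanh(x @ w1)

def block2_loss(w2, h, y):
    # Toy second block (assumption): linear head with mean squared error.
    return float(np.mean((h @ w2 - y) ** 2))

def split_zo_step(w1, w2, x, y, n1, n2, lr=0.05, eps=1e-3, rng=None):
    """One zero-order SGD step with n1 perturbations for block 1 and n2 for block 2."""
    rng = np.random.default_rng() if rng is None else rng
    # Block 1: each perturbation needs a forward pass through both blocks.
    g1 = zo_grad(lambda w: block2_loss(w2, block1(w, x), y), w1, n1, eps, rng)
    # Block 2: compute block 1's output once, then each perturbation reruns only block 2.
    h = block1(w1, x)
    g2 = zo_grad(lambda w: block2_loss(w, h, y), w2, n2, eps, rng)
    return w1 - lr * g1, w2 - lr * g2

rng = np.random.default_rng(0)
x, y = rng.standard_normal((8, 4)), rng.standard_normal((8, 1))
w1, w2 = 0.1 * rng.standard_normal((4, 5)), 0.1 * rng.standard_normal((5, 1))
new_w1, new_w2 = split_zo_step(w1, w2, x, y, n1=2, n2=8, rng=rng)
```

In this sketch, raising `n2` is cheap relative to raising `n1`, because the second-block estimates reuse the cached activations `h`. How FedSPZO actually chooses and schedules the per-block perturbation counts is described in the paper itself, not here.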

Problem

Research questions and friction points this paper is trying to address.

Optimize LLMs on edge devices

Reduce memory and computational demands

Enhance convergence speed in federated learning

Innovation

Methods, ideas, or system contributions that make the work stand out.

Zero-order optimization

Federated Split-Perturbation

Reduced computation overhead
