AI Summary
To address the high resource overhead, slow convergence, and susceptibility to local optima of federated fine-tuning (FFT) on edge devices, this paper proposes Developmental Federated Tuning (DevFT), the first progressive fine-tuning framework for federated learning inspired by human cognitive development. DevFT employs staged knowledge transfer and dynamic architectural expansion, combining deconfliction-guided layer grouping with differential-based layer fusion to distill essential information and construct representative layers. Experiments across multiple benchmarks show that DevFT achieves up to 4.59× faster convergence, up to 10.67× lower communication overhead, and a 9.07% average performance improvement. These gains make edge deployment more feasible while maintaining compatibility with existing federated learning methods.
Abstract
Federated fine-tuning enables Large Language Models (LLMs) to adapt to downstream tasks while preserving data privacy, but its resource-intensive nature limits deployment on edge devices. In this paper, we introduce Developmental Federated Tuning (DevFT), a resource-efficient approach inspired by cognitive development that progressively builds a powerful LLM from a compact foundation. DevFT decomposes the fine-tuning process into developmental stages, each optimizing submodels with increasing parameter capacity. Knowledge from earlier stages transfers to subsequent submodels, providing optimized initialization parameters that prevent convergence to local minima and accelerate training. This paradigm mirrors human learning, gradually constructing a comprehensive knowledge structure while refining existing skills. To efficiently build stage-specific submodels, DevFT introduces deconfliction-guided layer grouping and differential-based layer fusion to distill essential information and construct representative layers. Evaluations across multiple benchmarks demonstrate that DevFT significantly outperforms state-of-the-art methods, achieving up to 4.59$\times$ faster convergence, a 10.67$\times$ reduction in communication overhead, and a 9.07% average performance improvement, while maintaining compatibility with existing approaches.
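The staged idea described in the abstract (train a small submodel first, then use it to initialize a larger one) can be illustrated with a toy sketch. This is not the paper's implementation: the doubling depth schedule, the block-copy `warm_start` rule, and the `train_stage` callback are all hypothetical stand-ins chosen for illustration, and the abstract does not specify how layer grouping or layer fusion are actually computed. Each "layer" is reduced to a single float so that the control flow is easy to follow.

```python
from typing import Callable, List


def stage_depths(total_layers: int, num_stages: int) -> List[int]:
    """Hypothetical schedule: submodel depth doubles per stage, ending at total_layers."""
    if total_layers < 1 or num_stages < 1:
        raise ValueError("total_layers and num_stages must be positive")
    return [max(1, total_layers // 2 ** (num_stages - 1 - s)) for s in range(num_stages)]


def warm_start(prev_layers: List[float], new_depth: int) -> List[float]:
    """Assumed transfer rule: each new layer copies the earlier-stage layer that
    covers the same relative position in the network (not the paper's rule)."""
    prev_depth = len(prev_layers)
    return [prev_layers[i * prev_depth // new_depth] for i in range(new_depth)]


def progressive_tuning(
    total_layers: int,
    num_stages: int,
    train_stage: Callable[[List[float]], List[float]],
) -> List[float]:
    """Run the stages in order, initializing each submodel from the previous one."""
    layers: List[float] = []
    for depth in stage_depths(total_layers, num_stages):
        # The first stage starts from scratch; later stages inherit earlier knowledge.
        layers = warm_start(layers, depth) if layers else [0.0] * depth
        # train_stage stands in for the federated fine-tuning rounds of this stage.
        layers = train_stage(layers)
    return layers
```

For example, `progressive_tuning(32, 4, train_stage)` would call `train_stage` on submodels of depth 4, 8, 16, and 32 in turn. In a real system, `train_stage` would wrap client-side updates and server-side aggregation, and the warm-start step would be replaced by whatever transfer mechanism the method defines.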