Task Arithmetic Through The Lens Of One-Shot Federated Learning

📅 2024-11-27
🏛️ arXiv.org
📈 Citations: 7
Influential: 0
📄 PDF
🤖 AI Summary
This work investigates the causes of performance disparities in Task Arithmetic (TA) for multi-task model fusion, with a focus on the impact mechanisms of data and training heterogeneity. Method: We establish, for the first time, a theoretical equivalence between TA and Federated Averaging (FedAvg), framing multi-task fusion as a one-shot federated learning problem. Leveraging FedAvg’s convergence analysis, we develop a heterogeneity attribution framework that quantifies the contribution of each source of heterogeneity to fusion bias. We further adapt and enhance multiple federated optimization algorithms to enable robust, weighted aggregation in weight space. Results: Experiments demonstrate that our approach significantly outperforms standard TA across multiple multi-task benchmarks, while maintaining strong generalization and stability under high heterogeneity. The method provides a novel, interpretable, and controllable paradigm for model fusion.

Technology Category

Application Category

📝 Abstract
Task Arithmetic is a model merging technique that enables the combination of multiple models' capabilities into a single model through simple arithmetic in the weight space, without the need for additional fine-tuning or access to the original training data. However, the factors that determine the success of Task Arithmetic remain unclear. In this paper, we examine Task Arithmetic for multi-task learning by framing it as a one-shot Federated Learning problem. We demonstrate that Task Arithmetic is mathematically equivalent to the commonly used algorithm in Federated Learning, called Federated Averaging (FedAvg). By leveraging well-established theoretical results from FedAvg, we identify two key factors that impact the performance of Task Arithmetic: data heterogeneity and training heterogeneity. To mitigate these challenges, we adapt several algorithms from Federated Learning to improve the effectiveness of Task Arithmetic. Our experiments demonstrate that applying these algorithms can often significantly boost performance of the merged model compared to the original Task Arithmetic approach. This work bridges Task Arithmetic and Federated Learning, offering new theoretical perspectives on Task Arithmetic and improved practical methodologies for model merging.
Problem

Research questions and friction points this paper is trying to address.

Understanding factors affecting Task Arithmetic success
Linking Task Arithmetic to Federated Learning equivalence
Improving model merging via Federated Learning algorithms
Innovation

Methods, ideas, or system contributions that make the work stand out.

Task Arithmetic as one-shot Federated Learning
Leveraging FedAvg for model merging
Adapting FL algorithms to improve performance
🔎 Similar Papers
No similar papers found.