Task Arithmetic in Trust Region: A Training-Free Model Merging Approach to Navigate Knowledge Conflicts

📅 2025-01-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address performance degradation in multi-task model fusion caused by knowledge conflicts, this paper proposes a training-free, conflict-aware fusion method. First, knowledge conflict is formally defined, and task-specific trust regions are constructed in parameter space based on gradient orthogonality. Subsequently, task vectors are fused via projection-based addition within these trust regions to mitigate conflicts. The method integrates four key components: task vector modeling, trust region-constrained optimization, loss-gradient orthogonal decomposition, and parameter-space projection fusion. Evaluated on eight benchmark datasets, the approach consistently improves the multi-task average performance of diverse arithmetic fusion strategies—such as averaging, interpolation, and stacking—while preserving both generalization capability and task-specific fidelity. Crucially, it achieves these gains without requiring additional training or fine-tuning.
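The task-vector modeling and arithmetic fusion described above can be illustrated with a minimal NumPy sketch. All names (`theta_pre`, `theta_a`, `lam`) and the toy 3-dimensional parameters are illustrative assumptions; the paper operates on full fine-tuned networks, not toy vectors.

```python
import numpy as np

def task_vector(theta_pre, theta_finetuned):
    """Task vector: fine-tuned parameters minus pre-trained parameters."""
    return theta_finetuned - theta_pre

def merge_task_arithmetic(theta_pre, task_vectors, lam=0.3):
    """TA merging: add the scaled sum of all task vectors to theta_pre."""
    return theta_pre + lam * np.sum(task_vectors, axis=0)

theta_pre = np.array([1.0, 2.0, 3.0])   # shared pre-trained weights
theta_a = np.array([1.5, 2.0, 3.5])     # fine-tuned on task A
theta_b = np.array([1.0, 2.5, 3.5])     # fine-tuned on task B

tvs = np.stack([task_vector(theta_pre, theta_a),
                task_vector(theta_pre, theta_b)])
theta_merged = merge_task_arithmetic(theta_pre, tvs, lam=0.5)
```

The scaling weight `lam` is the knob TA uses to trade off task-generalized knowledge (small `lam`, close to the pre-trained model) against task-specific knowledge (large `lam`).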

📝 Abstract
Multi-task model merging offers an efficient solution for integrating knowledge from multiple fine-tuned models, mitigating the significant computational and storage demands associated with multi-task training. As a key technique in this field, Task Arithmetic (TA) defines task vectors by subtracting the pre-trained model ($\theta_{\text{pre}}$) from the fine-tuned task models in parameter space, then adjusting the weight between these task vectors and $\theta_{\text{pre}}$ to balance task-generalized and task-specific knowledge. Despite the promising performance of TA, conflicts can arise among the task vectors, particularly when different tasks require distinct model adaptations. In this paper, we formally define this issue as knowledge conflicts, characterized by the performance degradation of one task after merging with a model fine-tuned for another task. Through in-depth analysis, we show that these conflicts stem primarily from the components of task vectors that align with the gradient of task-specific losses at $\theta_{\text{pre}}$. To address this, we propose Task Arithmetic in Trust Region (TATR), which defines the trust region as the dimensions of the model parameter space that cause only small changes in the task-specific losses (corresponding to the task vector components orthogonal to the loss gradients). By restricting parameter merging to this trust region, TATR can effectively alleviate knowledge conflicts. Moreover, TATR serves as both an independent approach and a plug-and-play module compatible with a wide range of TA-based methods. Extensive empirical evaluations on eight distinct datasets robustly demonstrate that TATR improves the multi-task performance of several TA-based model merging methods by an observable margin.
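The trust-region idea in the abstract can be sketched as follows. Assumption: "dimensions that cause only small changes in the task-specific losses" is approximated here by masking out coordinates where a task vector's component aligns strongly with another task's loss gradient at $\theta_{\text{pre}}$, via the first-order contribution $|\tau_i \cdot g_i|$. The element-wise criterion, the threshold `tau`, and all variable names are illustrative simplifications, not the paper's exact formulation.

```python
import numpy as np

def trust_region_mask(task_vec, other_grad, tau=0.1):
    """Keep coordinates whose first-order effect on the other task's
    loss, |tv_i * g_i|, stays below the threshold tau."""
    return (np.abs(task_vec * other_grad) < tau).astype(task_vec.dtype)

def merge_tatr(theta_pre, task_vecs, grads, lam=0.5, tau=0.1):
    """Merge each task vector only within its trust region with
    respect to the loss gradients of all *other* tasks."""
    merged = theta_pre.copy()
    for i, tv in enumerate(task_vecs):
        mask = np.ones_like(tv)
        for j, g in enumerate(grads):
            if j != i:
                mask *= trust_region_mask(tv, g, tau)
        merged += lam * mask * tv   # conflicting coordinates are zeroed out
    return merged

theta_pre = np.zeros(3)
# Each task vector has one coordinate that conflicts with the other task.
task_vecs = [np.array([1.0, 0.3, 0.0]), np.array([0.3, 1.0, 0.0])]
grads = [np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])]
theta_merged = merge_tatr(theta_pre, task_vecs, grads, lam=0.5, tau=0.1)
```

Because conflicting coordinates are excluded before the addition, each task's update lands only on dimensions that leave the other task's loss approximately unchanged, which is the mechanism TATR uses to alleviate knowledge conflicts.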
Problem

Research questions and friction points this paper is trying to address.

Multi-task Learning
Knowledge Conflict
Model Fusion
Innovation

Methods, ideas, or system contributions that make the work stand out.

Trust Region
Task Arithmetic
Multi-task Learning
Wenju Sun
Key Laboratory of Big Data & Artificial Intelligence in Transportation, Beijing Jiaotong University, 100044, Beijing, China
Qingyong Li
Key Laboratory of Big Data & Artificial Intelligence in Transportation, Beijing Jiaotong University, 100044, Beijing, China
Wen Wang
Key Laboratory of Big Data & Artificial Intelligence in Transportation, Beijing Jiaotong University, 100044, Beijing, China
Yangli-ao Geng
Beijing Jiaotong University
Machine Learning · Data Mining · Unsupervised Learning
Boyang Li
College of Computing and Data Science, Nanyang Technological University, 639798, Singapore