CAT Merging: A Training-Free Approach for Resolving Conflicts in Model Merging

📅 2025-05-11
🤖 AI Summary
To address performance degradation in multi-task model fusion caused by knowledge conflicts, this paper proposes a training-free, conflict-aware task merging framework. The method explicitly models task-vector conflicts at the parameter level: orthogonal projection is applied to linear-layer weights to prune conflict-prone components, while adaptive masking is designed for the scale and shift parameters of normalization layers, combining task-vector decomposition with layer-specific processing. The authors claim this is the first fusion approach that explicitly suppresses conflicts without any training. Extensive experiments demonstrate consistent improvements over state-of-the-art methods such as Task Arithmetic across vision, language, and vision-language benchmarks. Specifically, on ViT-B/32 and ViT-L/14, the proposed method achieves average accuracy gains of 2.5% and 2.0%, respectively.
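To make the projection step concrete, the sketch below illustrates one simple way a conflict-prone component can be trimmed from a task vector before accumulation: each task vector has its component along the other tasks' directions removed via orthogonal projection. This is an illustrative assumption, not the paper's exact conflict criterion; the function names `trim_conflict` and `merge` are hypothetical.

```python
import numpy as np

def trim_conflict(tau_a, tau_b):
    """Remove from task vector tau_a its component along tau_b
    (a simple orthogonal-projection illustration of conflict trimming)."""
    direction = tau_b / np.linalg.norm(tau_b)
    return tau_a - np.dot(tau_a, direction) * direction

def merge(pretrained, task_vectors, lam=0.5):
    """Task-Arithmetic-style merge that accumulates trimmed task vectors:
    merged = pretrained + lam * sum(trimmed tau_i)."""
    merged = pretrained.astype(float).copy()
    for i, tau in enumerate(task_vectors):
        # project out the directions of every other task vector
        for j, other in enumerate(task_vectors):
            if i != j:
                tau = trim_conflict(tau, other)
        merged += lam * tau
    return merged
```

After trimming, each accumulated component is orthogonal to the other tasks' directions, so accumulation no longer pulls the merged weights along directions where tasks disagree.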

📝 Abstract
Multi-task model merging offers a promising paradigm for integrating multiple expert models into a unified model without additional training. Existing state-of-the-art techniques, such as Task Arithmetic and its variants, merge models by accumulating task vectors -- the parameter differences between pretrained and finetuned models. However, task vector accumulation is often hindered by knowledge conflicts, leading to performance degradation. To address this challenge, we propose Conflict-Aware Task Merging (CAT Merging), a novel training-free framework that selectively trims conflict-prone components from the task vectors. CAT Merging introduces several parameter-specific strategies, including projection for linear weights and masking for scaling and shifting parameters in normalization layers. Extensive experiments on vision, language, and vision-language tasks demonstrate that CAT Merging effectively suppresses knowledge conflicts, achieving average accuracy improvements of up to 2.5% (ViT-B/32) and 2.0% (ViT-L/14) over state-of-the-art methods.
Problem

Research questions and friction points this paper is trying to address.

Resolving knowledge conflicts in multi-task model merging
Improving model merging without additional training
Enhancing accuracy by trimming conflict-prone task vector components
Innovation

Methods, ideas, or system contributions that make the work stand out.

Training-free framework for model merging
Selective trimming of conflict-prone components
Parameter-specific strategies for merging
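For the normalization-layer strategy listed above, the sketch below shows one plausible form a parameter-wise mask could take: keeping only the coordinates where two tasks' scale/shift deltas agree in sign, and zeroing the rest before accumulation. The sign-agreement proxy and the names `sign_agreement_mask` and `masked_merge` are assumptions for illustration, not the paper's adaptive masking rule.

```python
import numpy as np

def sign_agreement_mask(tau_a, tau_b):
    """Binary mask keeping coordinates where two task vectors agree in sign
    (one simple proxy for 'non-conflicting' normalization parameters)."""
    return (np.sign(tau_a) == np.sign(tau_b)).astype(float)

def masked_merge(gamma_pre, tau_a, tau_b, lam=0.5):
    """Accumulate scale/shift deltas only where the mask is 1."""
    m = sign_agreement_mask(tau_a, tau_b)
    return gamma_pre + lam * m * (tau_a + tau_b)
```

Masking rather than projecting suits normalization parameters, which are per-channel scalars: conflicting channels are simply left at their pretrained values.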
Wenju Sun
Key Laboratory of Big Data & Artificial Intelligence in Transportation (Ministry of Education), Beijing Jiaotong University, Beijing, China
Qingyong Li
Key Laboratory of Big Data & Artificial Intelligence in Transportation (Ministry of Education), Beijing Jiaotong University, Beijing, China
Yangli-ao Geng
Beijing Jiaotong University
Machine Learning · Data Mining · Unsupervised Learning
Boyang Li
College of Computing and Data Science, Nanyang Technological University, Singapore