CAT Merging: A Training-Free Approach for Resolving Conflicts in Model Merging

📅 2025-05-11
🤖 AI Summary
To address performance degradation in multi-task model fusion caused by knowledge conflicts, this paper proposes a training-free, conflict-aware task merging framework. The method explicitly models task-vector conflicts at the parameter level: orthogonal projection is applied to linear-layer weights to prune conflict-prone components, while adaptive masking is designed for the scale and shift parameters of normalization layers, combining task-vector decomposition with layer-specific processing. The authors claim this is the first fusion approach that explicitly suppresses conflicts without any training. Extensive experiments demonstrate consistent improvements over state-of-the-art methods such as Task Arithmetic across vision, language, and vision-language benchmarks. Specifically, on ViT-B/32 and ViT-L/14, the proposed method achieves average accuracy gains of 2.5% and 2.0%, respectively.
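To make the projection step concrete, the sketch below illustrates one simple way a conflict-prone component can be trimmed from a task vector before accumulation: each task vector has its component along the other tasks' directions removed via orthogonal projection. This is an illustrative assumption, not the paper's exact conflict criterion; the function names `trim_conflict` and `merge` are hypothetical.

```python
import numpy as np

def trim_conflict(tau_a, tau_b):
    """Remove from task vector tau_a its component along tau_b
    (a simple orthogonal-projection illustration of conflict trimming)."""
    direction = tau_b / np.linalg.norm(tau_b)
    return tau_a - np.dot(tau_a, direction) * direction

def merge(pretrained, task_vectors, lam=0.5):
    """Task-Arithmetic-style merge that accumulates trimmed task vectors:
    merged = pretrained + lam * sum(trimmed tau_i)."""
    merged = pretrained.astype(float).copy()
    for i, tau in enumerate(task_vectors):
        # project out the directions of every other task vector
        for j, other in enumerate(task_vectors):
            if i != j:
                tau = trim_conflict(tau, other)
        merged += lam * tau
    return merged
```

After trimming, each accumulated component is orthogonal to the other tasks' directions, so accumulation no longer pulls the merged weights along directions where tasks disagree.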

📝 Abstract
Multi-task model merging offers a promising paradigm for integrating multiple expert models into a unified model without additional training. Existing state-of-the-art techniques, such as Task Arithmetic and its variants, merge models by accumulating task vectors -- the parameter differences between pretrained and finetuned models. However, task vector accumulation is often hindered by knowledge conflicts, leading to performance degradation. To address this challenge, we propose Conflict-Aware Task Merging (CAT Merging), a novel training-free framework that selectively trims conflict-prone components from the task vectors. CAT Merging introduces several parameter-specific strategies, including projection for linear weights and masking for scaling and shifting parameters in normalization layers. Extensive experiments on vision, language, and vision-language tasks demonstrate that CAT Merging effectively suppresses knowledge conflicts, achieving average accuracy improvements of up to 2.5% (ViT-B/32) and 2.0% (ViT-L/14) over state-of-the-art methods.
Problem

Research questions and friction points this paper is trying to address.

Resolving knowledge conflicts in multi-task model merging
Improving model merging without additional training
Enhancing accuracy by trimming conflict-prone task vector components
Innovation

Methods, ideas, or system contributions that make the work stand out.

Training-free framework for model merging
Selective trimming of conflict-prone components
Parameter-specific strategies for merging
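For the normalization-layer strategy listed above, the sketch below shows one plausible form a parameter-wise mask could take: keeping only the coordinates where two tasks' scale/shift deltas agree in sign, and zeroing the rest before accumulation. The sign-agreement proxy and the names `sign_agreement_mask` and `masked_merge` are assumptions for illustration, not the paper's adaptive masking rule.

```python
import numpy as np

def sign_agreement_mask(tau_a, tau_b):
    """Binary mask keeping coordinates where two task vectors agree in sign
    (one simple proxy for 'non-conflicting' normalization parameters)."""
    return (np.sign(tau_a) == np.sign(tau_b)).astype(float)

def masked_merge(gamma_pre, tau_a, tau_b, lam=0.5):
    """Accumulate scale/shift deltas only where the mask is 1."""
    m = sign_agreement_mask(tau_a, tau_b)
    return gamma_pre + lam * m * (tau_a + tau_b)
```

Masking rather than projecting suits normalization parameters, which are per-channel scalars: conflicting channels are simply left at their pretrained values.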
Wenju Sun
Key Laboratory of Big Data & Artificial Intelligence in Transportation (Ministry of Education), Beijing Jiaotong University, Beijing, China
Qingyong Li
Key Laboratory of Big Data & Artificial Intelligence in Transportation (Ministry of Education), Beijing Jiaotong University, Beijing, China
Yangli-ao Geng
Beijing Jiaotong University
Machine Learning · Data Mining · Unsupervised Learning
Boyang Li
College of Computing and Data Science, Nanyang Technological University, Singapore