🤖 AI Summary
Existing parameter-efficient fine-tuning (PEFT) methods for merging multimodal large language models (MLLMs) suffer from two critical limitations: degraded multi-task performance and the risk of leaking original training data. To address these, the authors propose CoPA-Merging, a training-free, cross-task generalizable, and data-leakage-free MLLM merging framework. Its core insight is that, in low-rank decomposition, preserving parameter *direction* and *compensating for the gap between singular values* are essential for robust model merging. Guided by this, they design a complementary parameter adaptation mechanism that jointly mitigates task interference and enhances generalization to unseen tasks. The method integrates parameter pruning, relation-driven construction of scaling coefficients, and cross-task normalization. Evaluated on a newly established multimodal multi-task benchmark, CoPA-Merging achieves an average 12.3% performance gain over state-of-the-art merging approaches and significantly improves generalization, all without accessing the original training data or requiring any additional optimization.
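The low-rank insight above can be made concrete with a generic LoRA-style update; the decomposition below is standard notation, not the paper's exact formulation:

```latex
\Delta W = B A, \qquad B \in \mathbb{R}^{d \times r},\; A \in \mathbb{R}^{r \times k},\; r \ll \min(d, k)
```

Writing the singular value decomposition $\Delta W = U \Sigma V^{\top}$, "direction preservation" refers to keeping the singular vectors $U, V$ of each task's update intact during merging, while "singular value compensation" refers to rescaling the entries of $\Sigma$ to offset the shrinkage that naive averaging of low-rank updates induces.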
📝 Abstract
Fine-tuning pre-trained models on custom data yields numerous expert models specialized for specific tasks. Merging these experts into a single universal model with multi-task ability, while avoiding data leakage, has gained popularity. As data and model sizes grow, parameter-efficient tuning has become the common practice for obtaining task-specific models efficiently. However, we observe that existing methods designed for merging fully fine-tuned models fail under efficient tuning. To address this, we analyze the problem through low-rank decomposition and reveal that maintaining direction and compensating for the gap between singular values are crucial for efficient model merging. Consequently, we propose CoPA-Merging, a training-free parameter-efficient merging method with complementary parameter adaptation. Specifically, we (1) prune parameters and construct scaling coefficients from inter-parameter relations to compensate for the performance drop caused by task interference, and (2) perform cross-task normalization to enhance generalization to unseen tasks. We establish a benchmark consisting of diverse multimodal tasks, on which experiments certify the outstanding performance and generalizability of our method. Additional studies and extensive analyses further demonstrate its effectiveness.
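The prune-scale-normalize pipeline in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation: the magnitude-based pruning rule, the `keep_ratio` value, and the norm-based scaling heuristic are all placeholders for the relation-driven coefficients the authors actually derive.

```python
import numpy as np

def prune(delta, keep_ratio=0.2):
    """Keep only the largest-magnitude entries of a task update (assumed rule)."""
    k = max(1, int(delta.size * keep_ratio))
    thresh = np.sort(np.abs(delta), axis=None)[-k]
    return np.where(np.abs(delta) >= thresh, delta, 0.0)

def merge(base, task_deltas, keep_ratio=0.2):
    """Merge per-task parameter updates into one set of weights."""
    pruned = [prune(d, keep_ratio) for d in task_deltas]
    # Hypothetical scaling: weight each task by its retained update norm,
    # then normalize the coefficients across tasks so they sum to one.
    norms = np.array([np.linalg.norm(p) for p in pruned])
    coeffs = norms / norms.sum()
    merged_delta = sum(c * p for c, p in zip(coeffs, pruned))
    return base + merged_delta

# Toy usage with random "expert" updates in place of real fine-tuned deltas.
rng = np.random.default_rng(0)
base = rng.normal(size=(4, 4))
deltas = [0.1 * rng.normal(size=(4, 4)) for _ in range(3)]
merged = merge(base, deltas)
```

The cross-task normalization step (dividing by the coefficient sum) is what keeps the merged update on a comparable scale regardless of how many experts are merged.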