🤖 AI Summary
In multi-task model merging, parameter interference prevents users from flexibly trading off performance across tasks according to their preferences. Existing "compile-then-query" approaches rely on costly offline multi-objective optimization, whose computational complexity grows exponentially with the number of tasks.
Method: We propose a representation correction paradigm that bypasses parameter-space optimization entirely and instead directly rectifies the final-layer representations of merged models. We design a user-preference-aware optimal linear transformation, enabling an architecture-agnostic, single-step, closed-form solution.
Contribution/Results: Our method reduces computational complexity from exponential to linear in the number of tasks. Experiments demonstrate that it enables instantaneous generation of Pareto-optimal models, achieving superior Pareto frontier quality, more precise preference alignment, and significantly lower computational cost compared to prior methods.
📝 Abstract
Model merging combines expert models for multitask performance but faces challenges from parameter interference. This has sparked recent interest in controllable model merging, giving users the ability to explicitly balance performance trade-offs. Existing approaches employ a compile-then-query paradigm, performing a costly offline multi-objective optimization to enable fast, preference-aware model generation. This offline stage typically involves iterative search or dedicated training, with complexity that grows exponentially with the number of tasks. To overcome these limitations, we shift the perspective from parameter-space optimization to a direct correction of the model's final representation. Our approach models this correction as an optimal linear transformation, yielding a closed-form solution that replaces the entire offline optimization process with a single-step, architecture-agnostic computation. This solution directly incorporates user preferences, allowing a Pareto-optimal model to be generated on-the-fly with complexity that scales linearly with the number of tasks. Experimental results show our method generates a superior Pareto front with more precise preference alignment and drastically reduced computational cost.
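The abstract describes correcting a merged model's final-layer representations with a preference-aware linear map that admits a closed-form solution. The paper's exact formulation is not reproduced here; the sketch below is a minimal illustration under the assumption that the correction solves a preference-weighted least-squares problem: find a matrix `W` minimizing the weighted Frobenius distance between the merged model's corrected representations `Z_merged @ W` and each task's target representations. The function name `preference_corrected_map`, the ridge term, and the weighted-average target are all illustrative choices, not the authors' API.

```python
import numpy as np

def preference_corrected_map(Z_merged, task_reps, prefs, ridge=1e-6):
    """Hypothetical sketch of a closed-form, preference-weighted correction.

    Z_merged : (n, d) final-layer representations of the merged model.
    task_reps: list of (n, d) target representations, one per task.
    prefs    : task preference weights (assumed to sum to 1).
    ridge    : small regularizer for numerical stability (an assumption).

    Minimizing sum_t prefs[t] * ||Z_merged @ W - task_reps[t]||_F^2
    over W has the closed-form solution
        W = (Z_merged^T Z_merged)^{-1} Z_merged^T Z_bar,
    where Z_bar = sum_t prefs[t] * task_reps[t]. Note the cost is a
    single linear solve plus one weighted sum over tasks, i.e. linear
    in the number of tasks, matching the complexity claim above.
    """
    Z_bar = sum(w * Z for w, Z in zip(prefs, task_reps))
    d = Z_merged.shape[1]
    A = Z_merged.T @ Z_merged + ridge * np.eye(d)
    return np.linalg.solve(A, Z_merged.T @ Z_bar)

# Toy usage: two tasks whose targets are linear transforms of the
# merged representations; a new preference vector yields a new W
# instantly, with no iterative re-optimization.
rng = np.random.default_rng(0)
Z_merged = rng.standard_normal((200, 16))
task_reps = [Z_merged @ rng.standard_normal((16, 16)) for _ in range(2)]
W = preference_corrected_map(Z_merged, task_reps, prefs=[0.7, 0.3])
```

Because the solve is closed-form, sweeping over preference vectors only repeats the cheap weighted sum and linear solve, which is what makes on-the-fly generation of preference-aligned models plausible.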