Escaping Optimization Stagnation: Taking Steps Beyond Task Arithmetic via Difference Vectors

📅 2025-11-22
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
Existing task arithmetic methods suffer from optimization stagnation and high computational overhead, which hinder efficient and scalable model editing. To address this, we propose Difference Vectors (DVs) as a generalized form of task vectors and introduce DV-BASI, an anisotropic-scaling iterative algorithm that overcomes optimization bottlenecks via directional guidance and escape mechanisms, without introducing auxiliary parameters or modules. DV-BASI combines differential analysis of task vectors, anisotropic scaling, and iterative perturbation-based optimization, enabling both multi-task merging and single-task fine-tuning under stringent parameter constraints. Under both supervised and unsupervised evaluation protocols, DV-BASI significantly enhances the expressivity and optimization efficiency of task arithmetic, outperforming standalone fine-tuning baselines and achieving state-of-the-art performance.

πŸ“ Abstract
Current methods for editing pre-trained models face significant challenges, primarily high computational costs and limited scalability. Task arithmetic has recently emerged as a promising solution: it uses simple arithmetic operations, addition and negation, on task vectors (the differences between fine-tuned and pre-trained model weights) to efficiently modify model behavior. However, the full potential of task arithmetic remains underexplored, primarily due to limited mechanisms for overcoming optimization stagnation. To address this challenge, we introduce the notion of the difference vector, a generalized form of the task vector derived from historical movements during optimization. Using difference vectors as directed perturbations, we propose the Difference Vector-based Anisotropic Scaling Iterative algorithm (DV-BASI), which enables a continuous optimization process for task arithmetic methods without relying on any additional modules or components. Notably, by leveraging the escapability and directional advantages of difference vectors, the average performance across tasks of a multi-task model merged by DV-BASI can even exceed that of individually fine-tuned models. Based on this observation, we extend difference vectors to a feasible fine-tuning method for single-task models. On the practical side, DV-BASI allows expressive search directions with few learnable parameters and forms a scalable framework. We also integrate DV-BASI with task arithmetic methods and advanced optimization techniques to achieve state-of-the-art performance under both supervised and unsupervised evaluation protocols.
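To make the task-vector operations concrete, here is a minimal sketch (illustrative only, not the paper's code) of standard task arithmetic on toy weight dictionaries; the names `task_vector`, `merge`, and the scaling coefficients `lams` are assumptions for illustration:

```python
import numpy as np

def task_vector(theta_ft, theta_pre):
    """Task vector: fine-tuned weights minus pre-trained weights."""
    return {k: theta_ft[k] - theta_pre[k] for k in theta_pre}

def merge(theta_pre, task_vectors, lams):
    """Task arithmetic: add scaled task vectors to the pre-trained weights."""
    merged = {k: v.astype(float).copy() for k, v in theta_pre.items()}
    for tv, lam in zip(task_vectors, lams):
        for k in merged:
            merged[k] += lam * tv[k]
    return merged

# Toy example with a single two-dimensional weight
theta_pre = {"w": np.array([1.0, 2.0])}
theta_a = {"w": np.array([2.0, 2.0])}   # fine-tuned on task A
theta_b = {"w": np.array([1.0, 4.0])}   # fine-tuned on task B
tv_a = task_vector(theta_a, theta_pre)
tv_b = task_vector(theta_b, theta_pre)
merged = merge(theta_pre, [tv_a, tv_b], [0.5, 0.5])
```

Negation (forgetting a task) is the same operation with a negative coefficient; the paper's contribution is in how the scaling of such vectors is optimized, not in this basic arithmetic.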
Problem

Research questions and friction points this paper is trying to address.

Overcoming optimization stagnation in task arithmetic methods for model editing
Reducing computational costs and improving scalability of pre-trained model editing
Enhancing multi-task model performance beyond individually fine-tuned models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Difference vectors enable continuous optimization for task arithmetic
DV-BASI algorithm leverages directional perturbations to escape stagnation
Method achieves state-of-the-art performance without additional components
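The paper's full DV-BASI procedure is not reproduced here, but its core idea (iteratively perturbing the merged weights along a difference vector, i.e., the direction of recent movement in weight space, and accepting moves that improve a score) can be sketched as follows. The `evaluate` callback, the fixed step size, and the single scalar coefficient are simplifying assumptions; the actual algorithm uses anisotropic scaling and explicit escape mechanisms:

```python
def dv_basi_like_step(theta, theta_prev, score, evaluate, step=0.1):
    """One illustrative iteration: move along the difference vector
    (direction of recent movement) and keep the move if it improves
    the score. A hypothetical single-coefficient simplification,
    not the paper's full anisotropic per-direction scaling."""
    dv = {k: theta[k] - theta_prev[k] for k in theta}    # difference vector
    cand = {k: theta[k] + step * dv[k] for k in theta}   # directed perturbation
    cand_score = evaluate(cand)
    if cand_score > score:
        return cand, theta, cand_score   # accept; candidate becomes new state
    return theta, theta_prev, score      # reject; stagnation would trigger escape

# Toy run: maximize -(w - 2)^2 starting from w = 1 with history w_prev = 0
evaluate = lambda t: -(t["w"] - 2.0) ** 2
theta_prev, theta = {"w": 0.0}, {"w": 1.0}
score = evaluate(theta)
theta, theta_prev, score = dv_basi_like_step(theta, theta_prev, score, evaluate)
```

In this toy run the difference vector points toward the optimum, so the perturbed candidate scores higher and is accepted; in the rejected case, the paper's escape mechanism would inject a new perturbation rather than terminating.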