🤖 AI Summary
This work addresses an overlooked asymmetry in how task-oriented and domain-oriented LoRA adapters dominate at different depths during fusion: domain dominance grows with layer depth, while shallower layers retain stronger task-relevant signals. It proposes TaDA, a training-free fusion algorithm that combines calibration-probe-guided, per-layer gating with subspace-aware LoRA merging that discards conflicting singular directions, yielding a standard rank-r adapter. Evaluated on Llama-2-7B, the method achieves an average accuracy of 45.2% across six scientific question-answering benchmarks, surpassing DARE-TIES by 3.6 percentage points. On ViT-L/16, it attains an average accuracy of 85.9% across six image classification benchmarks, ranking first on three of them.
📝 Abstract
Combining a task LoRA adapter with a domain LoRA adapter into a single unified model is a practical yet largely unexplored challenge. Existing methods treat the two adapters as symmetric peers and apply uniform weights across all layers. We argue that task and domain adapters instead exhibit a consistent depth-dependent asymmetry across transformer architectures: domain dominance increases with layer depth, while shallower layers retain stronger task-relevant signals. Motivated by this observation, we propose $\textbf{TaDA}$ ($\textbf{Ta}$sk-$\textbf{D}$omain LoR$\textbf{A}$ Merging), a training-free algorithm that exploits this structure through calibrated probe-guided per-layer gating and per-component subspace-aware merging. The gating assigns an individual weight to each layer and projection type using a probe signal that is proven invariant to adapter weight magnitude. The merging discards conflicting singular directions before combining the remaining components. $\textbf{TaDA}$ produces a standard rank-$r$ LoRA adapter with zero inference overhead. On six scientific QA benchmarks with Llama-2-7B, TaDA achieves an average accuracy of 45.2\%, outperforming DARE-TIES by 3.6 percentage points and obtaining the best result on all six benchmarks. On six image classification benchmarks with ViT-L/16, TaDA reaches 85.9\% average accuracy, improving over the strongest merging baseline while leading in three of the six individual benchmarks.
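To make the idea of per-layer gating followed by re-factoring into a rank-$r$ adapter more concrete, the following is a minimal NumPy sketch. It is not the TaDA algorithm: the gate values here come from a hypothetical linear depth schedule (`depth_gates`) rather than a calibration probe, and the step that discards conflicting singular directions is omitted because the abstract does not specify it. The function names, the gate range, and the matrix shapes are all assumptions made for illustration.

```python
import numpy as np


def depth_gates(num_layers, low=0.3, high=0.7):
    """Hypothetical schedule: weight on the domain adapter grows with depth.

    TaDA derives its gates from a calibration probe; this linear ramp is
    only a stand-in that mirrors the reported depth-dependent trend.
    """
    return np.linspace(low, high, num_layers)


def merge_lora_layer(B_task, A_task, B_dom, A_dom, gate, rank):
    """Blend two LoRA updates for one layer and re-factor to `rank`.

    `gate` in [0, 1] is the weight given to the domain adapter.
    Returns (B, A) such that B @ A is the best rank-`rank` approximation
    of the gated sum of the two updates.
    """
    delta_task = B_task @ A_task  # shape (d_out, d_in)
    delta_dom = B_dom @ A_dom     # shape (d_out, d_in)
    merged = (1.0 - gate) * delta_task + gate * delta_dom

    # Truncated SVD gives the closest rank-`rank` matrix in Frobenius norm.
    U, S, Vt = np.linalg.svd(merged, full_matrices=False)
    U, S, Vt = U[:, :rank], S[:rank], Vt[:rank, :]

    # Split singular values evenly between the two LoRA factors.
    sqrt_s = np.sqrt(S)
    B = U * sqrt_s            # shape (d_out, rank)
    A = sqrt_s[:, None] * Vt  # shape (rank, d_in)
    return B, A


# Toy usage with random adapters (illustrative shapes only).
rng = np.random.default_rng(0)
d_out, d_in, r, num_layers = 16, 12, 4, 3
merged_adapter = []
for gate in depth_gates(num_layers):
    B_t, A_t = rng.normal(size=(d_out, r)), rng.normal(size=(r, d_in))
    B_d, A_d = rng.normal(size=(d_out, r)), rng.normal(size=(r, d_in))
    merged_adapter.append(merge_lora_layer(B_t, A_t, B_d, A_d, gate, r))
```

Because the output of each layer is again a pair of rank-$r$ factors, the merged adapter can be loaded like any ordinary LoRA adapter, which is consistent with the zero-inference-overhead property described above.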