Dynamic Context-oriented Decomposition for Task-aware Low-rank Adaptation with Less Forgetting and Faster Convergence

📅 2025-06-16
📈 Citations: 0
Influential: 0
🤖 AI Summary
Traditional low-rank adaptation (LoRA) builds adapters without regard to input context, leading to severe forgetting of world knowledge and slow convergence during fine-tuning. To address this, we propose CorDA, a task-aware, context-oriented low-rank adaptation framework, together with its extension CorDA++. Our method introduces three key innovations: (1) context-oriented SVD initialization with dynamic covariance selection; (2) a dual-mode design comprising a knowledge-preserved mode (KPM) and an instruction-previewed mode (IPM); and (3) a dynamic rank allocation mechanism driven by the compactness of task-specific principal components. KPM outperforms LoRA while mitigating knowledge forgetting; IPM converges 4.5× faster than QLoRA. Extensive experiments demonstrate consistent gains over state-of-the-art PEFT baselines on both LLMs and VLMs. Our implementation is integrated into the official Hugging Face PEFT library.

📝 Abstract
Conventional low-rank adaptation methods build adapters without considering data context, leading to sub-optimal fine-tuning performance and severe forgetting of inherent world knowledge. In this paper, we propose context-oriented decomposition adaptation (CorDA), a novel method that initializes adapters in a task-aware manner. Concretely, we develop context-oriented singular value decomposition, where we collect covariance matrices of input activations for each linear layer using sampled data from the target task, and apply SVD to the product of weight matrix and its corresponding covariance matrix. By doing so, the task-specific capability is compacted into the principal components. Thanks to the task awareness, our method enables two optional adaptation modes, knowledge-preserved mode (KPM) and instruction-previewed mode (IPM), providing flexibility to choose between freezing the principal components to preserve their associated knowledge or adapting them to better learn a new task. We further develop CorDA++ by deriving a metric that reflects the compactness of task-specific principal components, and then introducing dynamic covariance selection and dynamic rank allocation strategies based on the same metric. The two strategies provide each layer with the most representative covariance matrix and a proper rank allocation. Experimental results show that CorDA++ outperforms CorDA by a significant margin. CorDA++ in KPM not only achieves better fine-tuning performance than LoRA, but also mitigates the forgetting of pre-trained knowledge in both large language models and vision language models. For IPM, our method exhibits faster convergence, e.g., 4.5× speedup over QLoRA, and improves adaptation performance in various scenarios, outperforming strong baseline methods. Our method has been integrated into the PEFT library developed by Hugging Face.
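The core initialization step described in the abstract can be sketched as follows. This is a minimal, simplified illustration assuming NumPy arrays: the covariance of sampled input activations is formed per linear layer, SVD is applied to the weight–covariance product, and the spectrum is split into principal and residual parts (KPM freezes the principal components; IPM adapts them). Function and variable names are illustrative, not the paper's exact API, and the covariance-inversion step used to recover the original weight is omitted.

```python
import numpy as np

def context_oriented_svd_init(W, X, r):
    """Simplified sketch of context-oriented SVD initialization.

    W : (out_dim, in_dim) weight matrix of a linear layer
    X : (n_samples, in_dim) input activations sampled from the target task
    r : adapter rank

    Returns the top-r (principal) and remaining (residual) SVD factors
    of the context-weighted product W @ C.
    """
    # Covariance matrix of the sampled input activations for this layer.
    C = X.T @ X / X.shape[0]                      # (in_dim, in_dim)
    # SVD of the product of the weight and its covariance matrix:
    # task-specific capability is compacted into the leading components.
    U, S, Vt = np.linalg.svd(W @ C, full_matrices=False)
    principal = (U[:, :r], S[:r], Vt[:r, :])      # freeze (KPM) or adapt (IPM)
    residual = (U[:, r:], S[r:], Vt[r:, :])
    return principal, residual
```

The split is lossless with respect to the product `W @ C`: summing the principal and residual reconstructions recovers it, which is what lets the two modes trade off preservation against adaptation.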
Problem

Research questions and friction points this paper is trying to address.

Improves low-rank adaptation with task-aware context
Reduces forgetting of pre-trained knowledge in models
Enhances convergence speed and adaptation performance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Context-oriented SVD for task-aware adaptation
Dynamic covariance selection and rank allocation
Optional knowledge-preserved or instruction-previewed modes
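The dynamic rank allocation idea above can be illustrated with a toy heuristic. The metric and allocation rule below are assumptions for illustration only (the paper derives its own compactness metric): here, compactness is taken as the fraction of spectral energy in the top-r singular values, and layers whose spectra are less compact receive a larger share of the total rank budget.

```python
import numpy as np

def compactness(S, r):
    """Hypothetical compactness metric: fraction of spectral energy
    captured by the top-r singular values (illustrative only)."""
    S = np.asarray(S, dtype=float)
    return float((S[:r] ** 2).sum() / (S ** 2).sum())

def allocate_ranks(spectra, total_rank, base_rank=4):
    """Toy dynamic rank allocation: layers whose spectrum is less
    compact at base_rank (more energy left over) get more rank."""
    need = np.array([1.0 - compactness(S, base_rank) for S in spectra])
    weights = need / need.sum()
    # At least rank 1 per layer; round the proportional shares.
    return np.maximum(1, np.round(weights * total_rank)).astype(int)
```

A layer with a sharply decaying spectrum is already well captured by a small adapter, so under this heuristic the budget flows to layers with flat spectra.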
Yibo Yang
King Abdullah University of Science and Technology, Saudi Arabia
Sihao Liu
UCLA
Computer Architecture, VLSI, CPU, FPGA
Chuan Rao
King Abdullah University of Science and Technology, Saudi Arabia
Bang An
University of Maryland, College Park
Machine Learning, Natural Language Processing
Tiancheng Shen
University of California, Merced, USA
Philip H.S. Torr
University of Oxford, U.K.
Ming-Hsuan Yang
University of California at Merced; Google DeepMind
Computer Vision, Machine Learning, Artificial Intelligence
Bernard Ghanem
Professor, King Abdullah University of Science and Technology
computer vision, machine learning