GoRA: Gradient-driven Adaptive Low Rank Adaptation

📅 2025-02-13
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address performance bottlenecks in LoRA fine-tuning caused by fixed rank assignment and static initialization, this paper proposes a gradient-driven paradigm for dynamic rank allocation and joint adapter initialization. It is the first to jointly optimize rank estimation and weight initialization: layer-wise ranks are allocated dynamically based on gradient sensitivity, while adapter weights are initialized via an adaptive SVD and refined through a lightweight parameter-reweighting mechanism, preserving LoRA's efficiency while substantially enhancing representational capacity. The method is architecture-agnostic, supporting mainstream models including T5 and Llama3.1. Experiments show significant gains: a 5.88-point improvement over standard LoRA on GLUE, slightly surpassing full fine-tuning; a 5.13-point improvement on GSM8K; and, under high-rank configurations, a 2.05-point advantage over full fine-tuning. These results ease the traditional efficiency-effectiveness trade-off in parameter-efficient adaptation.
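The two ideas in the summary can be sketched in a few lines. This is a minimal illustration, not the authors' exact algorithm: the proportional-allocation heuristic, the function names, and the use of a single gradient snapshot (rather than GoRA's accumulated statistics and reweighting step) are all assumptions made for the example.

```python
import numpy as np

def allocate_ranks(grad_norms, total_rank_budget, min_rank=1):
    """Split a total rank budget across layers in proportion to each
    layer's gradient norm (a simple sensitivity proxy)."""
    scores = np.asarray(grad_norms, dtype=float)
    shares = scores / scores.sum()
    return np.maximum(min_rank, np.round(shares * total_rank_budget).astype(int))

def svd_init(grad, rank):
    """Initialize LoRA factors A, B from the top-`rank` singular
    directions of the gradient of the frozen weight, so that
    B @ A is the best rank-`rank` approximation of that gradient."""
    U, S, Vt = np.linalg.svd(grad, full_matrices=False)
    B = U[:, :rank] * np.sqrt(S[:rank])            # (out_dim, rank)
    A = np.sqrt(S[:rank])[:, None] * Vt[:rank]     # (rank, in_dim)
    return A, B

# Toy example: three layers whose gradients differ in magnitude,
# so the more sensitive layers receive larger ranks.
rng = np.random.default_rng(0)
grads = [rng.normal(scale=s, size=(16, 16)) for s in (0.1, 1.0, 0.5)]
ranks = allocate_ranks([np.linalg.norm(g) for g in grads], total_rank_budget=24)
A, B = svd_init(grads[1], ranks[1])
print(ranks, A.shape, B.shape)
```

The gradient-proportional split gives high-sensitivity layers more capacity under a fixed parameter budget, and the SVD start means the adapter's first update direction already aligns with the steepest descent directions of the frozen weight, rather than starting from LoRA's zero product.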

📝 Abstract
Low-Rank Adaptation (LoRA) is a crucial method for efficiently fine-tuning pretrained large language models (LLMs), with its performance largely influenced by two key factors: rank and initialization strategy. Numerous LoRA variants have been proposed to enhance its performance by addressing these factors. However, these variants often compromise LoRA's usability or efficiency. In this paper, we analyze the fundamental limitations of existing methods and introduce a novel approach, GoRA (Gradient-driven Adaptive Low Rank Adaptation), which adaptively assigns ranks and initializes weights for low-rank adapters simultaneously based on gradient information. Extensive experimental results demonstrate that GoRA significantly improves performance while preserving the high usability and efficiency of LoRA. On the T5 model fine-tuned for the GLUE benchmark, GoRA achieves a 5.88-point improvement over LoRA and slightly surpasses full fine-tuning. Similarly, on the Llama3.1-8B-Base model fine-tuned for GSM8K tasks, GoRA outperforms LoRA with a 5.13-point improvement and exceeds full fine-tuning in high-rank settings by a margin of 2.05 points.
Problem

Research questions and friction points this paper is trying to address.

Enhance Low-Rank Adaptation performance
Adaptively assign ranks using gradients
Optimize initialization for low-rank adapters
Innovation

Methods, ideas, or system contributions that make the work stand out.

Gradient-driven adaptation
Dynamic rank assignment
Efficient weight initialization
👥 Authors
Haonan He
University of Science and Technology of China; Institute of Intelligent Machines, HFIPS, Chinese Academy of Sciences
Peng Ye
The Chinese University of Hong Kong; Shanghai Artificial Intelligence Laboratory; Fudan University
Yuchen Ren
Renmin University of China
Yuan Yuan
Institute of Intelligent Machines, HFIPS, Chinese Academy of Sciences
Lei Chen
Institute of Intelligent Machines, HFIPS, Chinese Academy of Sciences