🤖 AI Summary
To address performance bottlenecks in LoRA fine-tuning caused by fixed rank assignment and static initialization, this paper proposes a gradient-driven paradigm that couples dynamic rank allocation with adapter initialization. It is the first to jointly optimize rank estimation and weight initialization: layer-wise ranks are allocated according to gradient sensitivity, while adapter weights are initialized via adaptive SVD and refined through a lightweight parameter reweighting mechanism, preserving LoRA's efficiency while substantially enhancing representational capacity. The method is architecture-agnostic, supporting mainstream models including T5 and Llama3.1. Experiments demonstrate significant gains: +5.88 points over standard LoRA on GLUE, slightly surpassing full fine-tuning; +5.13 points on GSM8K; and, under high-rank configurations, a 2.05-point advantage over full fine-tuning. These results challenge the traditional efficiency-effectiveness trade-off in parameter-efficient adaptation.
📝 Abstract
Low-Rank Adaptation (LoRA) is a crucial method for efficiently fine-tuning pretrained large language models (LLMs), and its performance is largely determined by two key factors: rank and initialization strategy. Numerous LoRA variants have been proposed to improve on these factors, but they often compromise LoRA's usability or efficiency. In this paper, we analyze the fundamental limitations of existing methods and introduce a novel approach, GoRA (Gradient-driven Adaptive Low Rank Adaptation), which simultaneously assigns ranks and initializes weights for low-rank adapters based on gradient information. Extensive experimental results demonstrate that GoRA significantly improves performance while preserving the high usability and efficiency of LoRA. On the T5 model fine-tuned on the GLUE benchmark, GoRA achieves a 5.88-point improvement over LoRA and slightly surpasses full fine-tuning. Similarly, on the Llama3.1-8B-Base model fine-tuned on GSM8K, GoRA outperforms LoRA by 5.13 points and exceeds full fine-tuning in high-rank settings by a margin of 2.05 points.
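The abstract describes two gradient-driven steps: allocating layer-wise ranks from gradient sensitivity and initializing adapter weights from an SVD of gradient information. The exact rules are not given here, so the following is a minimal NumPy sketch under stated assumptions: sensitivity is taken as the nuclear norm of each layer's accumulated gradient, ranks are distributed proportionally to those scores within a total budget, and the adapter factors `B`, `A` come from a truncated SVD of the gradient, scaled as a small negative gradient step. The function names and the `scale`, `r_min` parameters are illustrative, not from the paper.

```python
import numpy as np

def allocate_ranks(grads, total_rank_budget, r_min=2):
    """Assumed scheme: split a total rank budget across layers in
    proportion to the nuclear norm of each accumulated gradient."""
    scores = np.array([np.linalg.norm(g, "nuc") for g in grads])
    shares = scores / scores.sum()
    # each layer keeps at least r_min ranks
    return np.maximum(r_min, np.round(shares * total_rank_budget).astype(int))

def init_adapter(grad, rank, scale=1e-3):
    """Assumed scheme: truncated SVD of the gradient; B @ A then
    approximates a small negative gradient step at initialization."""
    U, S, Vt = np.linalg.svd(grad, full_matrices=False)
    sqrt_s = np.sqrt(S[:rank])
    B = -scale * (U[:, :rank] * sqrt_s)      # shape (out_dim, rank)
    A = sqrt_s[:, None] * Vt[:rank]          # shape (rank, in_dim)
    return B, A
```

A usage pass would accumulate gradients of the frozen weights on a few batches, call `allocate_ranks` once, then `init_adapter` per layer before training the adapters as in standard LoRA.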