🤖 AI Summary
LoRA suffers from structural bottlenecks at high ranks, causing gradient entanglement across input channels, which leads to overfitting and performance saturation and hinders its ability to approximate full fine-tuning (FFT). To address this, we propose Granular Low-Rank Adaptation (GraLoRA), the first sub-block-level low-rank adaptation framework: it partitions the weight matrix into fine-grained blocks and assigns each block an independent low-rank adapter, explicitly decoupling gradient propagation paths. This design incurs virtually zero additional parameters or computational overhead while substantially enhancing representational capacity and FFT approximation fidelity. On HumanEval+, GraLoRA achieves up to a +8.5% absolute gain in Pass@1, consistently outperforming LoRA and other PEFT baselines across diverse model scales and rank configurations, demonstrating strong robustness and scalability.
📝 Abstract
Low-Rank Adaptation (LoRA) is a popular method for parameter-efficient fine-tuning (PEFT) of generative models, valued for its simplicity and effectiveness. Despite recent enhancements, LoRA still suffers from a fundamental limitation: overfitting when the bottleneck is widened. It performs best at ranks 32-64, yet its accuracy stagnates or declines at higher ranks, still falling short of full fine-tuning (FFT) performance. We identify the root cause as LoRA's structural bottleneck, which introduces gradient entanglement across unrelated input channels and distorts gradient propagation. To address this, we introduce a novel structure, Granular Low-Rank Adaptation (GraLoRA), which partitions weight matrices into sub-blocks, each with its own low-rank adapter. With negligible computational or storage cost, GraLoRA overcomes LoRA's limitations, effectively increases the representational capacity, and more closely approximates FFT behavior. Experiments on code generation and commonsense reasoning benchmarks show that GraLoRA consistently outperforms LoRA and other baselines, achieving up to a +8.5% absolute gain in Pass@1 on HumanEval+. These improvements hold across model sizes and rank settings, making GraLoRA a scalable and robust solution for PEFT. Code, data, and scripts are available at https://github.com/SqueezeBits/GraLoRA.git
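The core idea above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: it assumes an even k x k partition of the weight matrix with per-block rank R/k (the function name `gralora_delta` and all variable names are our own). Under that parameter split, the adapter parameter count matches plain LoRA at global rank R, while the composed weight update can reach rank up to k * R instead of R:

```python
import numpy as np

def gralora_delta(in_dim, out_dim, rank, k, rng):
    """Weight update for one linear layer under a GraLoRA-style scheme:
    the (out_dim x in_dim) matrix is split into a k x k grid of
    sub-blocks, and each sub-block gets its own low-rank adapter pair.
    Per-block rank = rank // k keeps the total adapter parameter count
    identical to plain LoRA at the same global rank."""
    assert in_dim % k == 0 and out_dim % k == 0 and rank % k == 0
    r = rank // k                        # per-block rank
    bo, bi = out_dim // k, in_dim // k   # sub-block shape
    delta = np.zeros((out_dim, in_dim))
    n_params = 0
    for i in range(k):                   # row of output sub-blocks
        for j in range(k):               # column of input sub-blocks
            B = rng.standard_normal((bo, r))  # per-block up-projection
            A = rng.standard_normal((r, bi))  # per-block down-projection
            delta[i * bo:(i + 1) * bo, j * bi:(j + 1) * bi] = B @ A
            n_params += B.size + A.size
    return delta, n_params

rng = np.random.default_rng(0)
in_dim, out_dim, rank, k = 32, 32, 4, 2
delta, gralora_params = gralora_delta(in_dim, out_dim, rank, k, rng)
lora_params = rank * (in_dim + out_dim)  # plain LoRA: (out x R) and (R x in)

print(gralora_params == lora_params)     # same parameter budget
print(np.linalg.matrix_rank(delta))      # generically k * rank, i.e. up to 8 here
```

Because each sub-block owns an independent adapter, the gradient of one input-channel group no longer flows through a projection shared with every other group, which is the decoupling the abstract refers to.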