🤖 AI Summary
This work addresses a limitation of fixed-rank constraints in parameter-efficient fine-tuning: a single uniform rank cannot accommodate the heterogeneous rank requirements of different layers in a neural network. The authors propose LR-LoRA, which adds a learnable rank mechanism to the LoRA framework, enabling differentiable, per-layer optimization of the adapter rank during training. The learned ranks reveal a systematic difference in rank demands between the attention and MLP layers of Transformers, suggesting that a learnable rank is a more flexible and effective inductive bias. Across multiple language understanding and commonsense reasoning benchmarks, LR-LoRA consistently outperforms strong parameter-efficient fine-tuning baselines and achieves state-of-the-art performance in most settings.
📝 Abstract
Low-Rank Adaptation (LoRA) is a popular parameter-efficient fine-tuning (PEFT) method that restricts weight updates to low-rank adapters, introducing a fixed low-rank inductive bias by optimizing in a low-dimensional subspace. In this work, we question whether a fixed-rank constraint is the most effective inductive bias for parameter-efficient fine-tuning. We introduce *Learnable Rank LoRA (LR-LoRA)*, a PEFT method in which the adapter rank is learned during training. Instead of prescribing a uniform rank for all adapter layers, LR-LoRA allows the optimizer to determine the appropriate rank for each layer. Using this approach, we find substantial layer-wise variation in the learned ranks, with attention and MLP layers in transformer models exhibiting systematically different rank preferences. Across a range of language understanding and commonsense reasoning benchmarks, LR-LoRA achieves state-of-the-art performance in most settings and consistently outperforms strong PEFT baselines, demonstrating that a learnable rank provides a more flexible and effective inductive bias than fixed-rank adaptation.
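
To make the contrast between a fixed and a learnable rank concrete, the following NumPy sketch compares a standard LoRA update, ΔW = (α/r)·BA, with a gated variant in which a continuous parameter controls how many rank components are active. The abstract does not specify how LR-LoRA parameterizes the rank, so the sigmoid gate, the names `rho` and `temperature`, and the helper functions below are illustrative assumptions, not the authors' method.

```python
import numpy as np

def lora_delta(B, A, alpha=1.0):
    """Standard fixed-rank LoRA update: delta_W = (alpha / r) * B @ A."""
    r = A.shape[0]
    return (alpha / r) * (B @ A)

def soft_rank_gates(rho, r_max, temperature=1.0):
    """Hypothetical differentiable rank mask (an assumption, not the paper's formula).

    Component i (0-indexed) gets weight sigmoid((rho - i - 0.5) / temperature),
    so components with index below rho are kept and those above are suppressed.
    """
    idx = np.arange(r_max)
    return 1.0 / (1.0 + np.exp(-(rho - idx - 0.5) / temperature))

def gated_lora_delta(B, A, rho, temperature=1.0):
    """Low-rank update with per-component gates: delta_W = B @ diag(g) @ A."""
    g = soft_rank_gates(rho, A.shape[0], temperature)
    return (B * g) @ A  # broadcasting scales column i of B by g[i]

rng = np.random.default_rng(0)
d_out, d_in, r_max = 16, 12, 8
B = rng.normal(size=(d_out, r_max))
A = rng.normal(size=(r_max, d_in))

# Fixed rank: every adapter uses all r_max components (alpha = r_max gives B @ A).
fixed = lora_delta(B, A, alpha=r_max)

# Gated: with rho = 3 and a sharp temperature, only the first three components
# contribute appreciably; in training, rho would be a learned per-layer scalar.
gated = gated_lora_delta(B, A, rho=3.0, temperature=0.05)
```

Because the gate is a smooth function of `rho`, a gradient-based optimizer could in principle adjust the effective rank of each layer independently, which is the kind of per-layer flexibility the abstract describes.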