BoostLoRA: Growing Effective Rank by Boosting Adapters

📅 2026-04-29

🤖 AI Summary

Parameter-efficient fine-tuning methods trade adapter size against expressive capacity: ultra-low-parameter adapters are confined to fixed low-rank subspaces, which caps their performance. BoostLoRA addresses this by iteratively training and merging extremely compact low-rank adapters within a gradient-boosting framework, with each round trained on the examples the current model still gets wrong. A ROTATE SVD strategy assigns each round an orthogonal subspace, so the cumulative effective rank grows linearly with the number of rounds. According to the authors, this is the first parameter-efficient fine-tuning method whose effective rank grows during training, which decouples per-round parameter cost from total representational capacity while adding no inference overhead, since adapters are discarded after merging. On Qwen2.5-3B, it reaches 89.1% on GSM8K and 68.8% on MATH-500, surpassing both full fine-tuning and the best single-shot ultra-low-parameter adapter (TinyLoRA), and reaches 57.2% on MBPP and 80.4% on HumanEval, where full fine-tuning drops below the zero-shot baseline. It also transfers across architectures, demonstrated on protein binding classification with ESM2-650M.

📝 Abstract

Parameter-efficient fine-tuning (PEFT) methods face a tradeoff between adapter size and expressivity: ultra-low-parameter adapters are confined to fixed low-rank subspaces, capping performance even with extended training. We propose BoostLoRA, a gradient-boosting framework that overcomes this limit by iteratively training and merging minimal adapters on the examples the current model gets wrong. A ROTATE SVD basis strategy assigns each round to an orthogonal subspace, so cumulative effective rank grows linearly with the number of rounds while each adapter remains ultra-low-rank. After merging, adapters are discarded, leaving zero inference overhead. On Qwen2.5-3B, BoostLoRA reaches 89.1% on GSM8K and 68.8% on MATH-500, surpassing both the best single-shot ultra-low-parameter adapter (TinyLoRA) and full fine-tuning; on code generation it reaches 57.2% on MBPP and 80.4% on HumanEval, while full fine-tuning drops below the zero-shot baseline. We also demonstrate cross-architecture transfer on protein binding classification with ESM2-650M and cross-entropy training. BoostLoRA is, to our knowledge, the first PEFT method whose effective rank grows with training, separating per-round parameter cost from total representational capacity.
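
The rank-growth claim in the abstract can be illustrated with a toy NumPy sketch. This is not the paper's ROTATE SVD procedure or its training loop, whose details are not given here. It assumes that each round gets a disjoint block of `r` singular directions of the frozen weight, and it replaces gradient-based training on misclassified examples with a closed-form projection of the remaining residual onto that block. The function name `boosted_rank_demo`, the random `target` matrix, and all sizes are hypothetical.

```python
import numpy as np


def boosted_rank_demo(d=32, r=2, rounds=5, seed=0):
    """Toy illustration: merge rank-r updates drawn from disjoint singular subspaces."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((d, d))            # stand-in for a frozen pretrained weight
    target = W + rng.standard_normal((d, d))   # hypothetical "ideal" fine-tuned weight

    # Fixed orthonormal bases taken from the SVD of the original weight.
    U, _, Vt = np.linalg.svd(W)

    delta = np.zeros_like(W)                   # sum of all merged adapter updates
    ranks, errors = [], []
    for t in range(rounds):
        # Assumption: round t uses the next disjoint block of r singular directions.
        block = slice(t * r, (t + 1) * r)
        U_t, V_t = U[:, block], Vt[block, :].T

        # Boosting step: fit only what the merged model still gets wrong.
        residual = target - (W + delta)
        C_t = U_t.T @ residual @ V_t           # r x r trainable coefficients
        delta += U_t @ C_t @ V_t.T             # merge, then the adapter is discarded

        ranks.append(int(np.linalg.matrix_rank(delta)))
        errors.append(float(np.linalg.norm(target - (W + delta))))
    return ranks, errors


if __name__ == "__main__":
    ranks, errors = boosted_rank_demo()
    print("cumulative rank per round:", ranks)
    print("residual norm per round:", [round(e, 3) for e in errors])
```

Because the blocks are mutually orthogonal, each round contributes up to `r` new directions to the merged update, so the cumulative rank is bounded by `r` times the number of rounds even though every individual adapter has only `r * r` trainable coefficients. At inference time only `W + delta` is needed, which matches the zero-overhead property described above.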

Problem

Research questions and friction points this paper is trying to address.

parameter-efficient fine-tuning

adapter expressivity

low-rank subspace

effective rank

PEFT

Innovation

Methods, ideas, or system contributions that make the work stand out.

BoostLoRA

parameter-efficient fine-tuning

gradient boosting