Finer Parameter Steps for Low-Rank PEFT: A Controlled Study with CP Tensor Adapters

📅 2026-05-29

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the coarse parameter granularity of low-rank adapters such as LoRA: each rank step adds a large block of parameters, which prevents fine-grained exploration of the trade-off between accuracy and parameter count under tight budgets. The study applies canonical polyadic (CP) tensor decomposition to parameter-efficient fine-tuning (PEFT) in the form of fixed-component CP adapters. Under the tensorization used in the paper, one CP component is about 21× smaller than one LoRA rank step, so CP adapters can occupy the budget gaps between LoRA ranks. Experiments with OPT-1.3B on SST-2, RTE, and BoolQ, run under matched target modules, training protocol, data caps, and seed schedules, show that CP adapters train stably and reach parameter budgets that LoRA cannot express. The benefit is task-dependent: SST-2 plateaus early at low budgets, BoolQ improves with additional CP components before saturating slightly below LoRA, and RTE remains LoRA-favored.
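As a rough illustration of how CP components fill the gaps between LoRA ranks, the sketch below enumerates per-projection budgets using the two step sizes reported in the abstract (4096 scalars per LoRA rank, 193 per CP component). The function names are hypothetical and the script only does the arithmetic; it is not the paper's code.

```python
LORA_STEP = 4096  # trainable scalars added per LoRA rank (figure from the abstract)
CP_STEP = 193     # trainable scalars added per CP component (figure from the abstract)


def lora_budgets(max_rank):
    """Per-projection parameter counts reachable by LoRA at ranks 1..max_rank."""
    return [LORA_STEP * r for r in range(1, max_rank + 1)]


def cp_budgets_between(lo, hi):
    """CP parameter counts strictly between two budgets lo < hi."""
    first = lo // CP_STEP + 1          # smallest component count above lo
    last = (hi - 1) // CP_STEP         # largest component count below hi
    return [CP_STEP * k for k in range(first, last + 1)]


if __name__ == "__main__":
    r1, r2 = lora_budgets(2)
    below = cp_budgets_between(0, r1)
    between = cp_budgets_between(r1, r2)
    print(f"LoRA step / CP step: {LORA_STEP / CP_STEP:.1f}")
    print(f"CP sizes below LoRA rank 1: {len(below)}")
    print(f"CP sizes between LoRA ranks 1 and 2: {len(between)}")
```

Each interval between consecutive LoRA ranks contains roughly twenty CP adapter sizes, which is the sense in which the CP grid is finer. Whether any of those intermediate sizes is actually more accurate is an empirical question that the paper answers per task.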

📝 Abstract

Low-rank adapters are usually compared by sweeping a small set of ranks, but the rank also fixes the resolution of the parameter budget. For a $2048{\times}2048$ OPT attention projection, each additional LoRA rank adds $4096$ trainable scalars, leaving large gaps between feasible low-budget adapter sizes. This paper asks whether a tensorized adapter with finer capacity increments changes the observed accuracy--budget trade-off. We instantiate this question with fixed-component canonical polyadic (CP) tensor adapters. Under a $32{\times}64{\times}32{\times}64$ tensorization, one normalized CP component stores $193$ trainable scalars per projection, about $21$ times fewer than one LoRA rank step. We compare CP adapters and LoRA on OPT-1.3B across SST-2, RTE, and BoolQ under matched target modules, training protocol, data caps, and seed schedules. CP trains stably and fills the gaps between LoRA ranks, but the effect is task-dependent: SST-2 reaches an early low-budget plateau, BoolQ benefits from additional CP components before saturating slightly below LoRA, and RTE remains LoRA-favored. Finer parameter steps are therefore useful for diagnosing PEFT budget sensitivity, but they do not by themselves guarantee a better accuracy--budget curve.
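The two step sizes quoted above can be checked by hand. The worked count below assumes the standard LoRA update $\Delta W = BA$ and a weighted sum of rank-one terms for the CP adapter. The symbols $\lambda_k$, $a_k, b_k, c_k, d_k$, $r$, and $K$ are illustrative notation rather than the paper's, and reading the extra scalar as a per-component weight is an assumption, chosen because it reproduces the stated count of 193.

```latex
\begin{align*}
\text{LoRA, rank } r:\quad
  & \Delta W = BA,\quad B \in \mathbb{R}^{2048 \times r},\; A \in \mathbb{R}^{r \times 2048} \\
  & \Rightarrow\; 2048\,r + 2048\,r = 4096\,r \text{ trainable scalars} \\[4pt]
\text{CP, } K \text{ components}:\quad
  & \Delta\mathcal{W} = \sum_{k=1}^{K} \lambda_k\, a_k \circ b_k \circ c_k \circ d_k,\quad
    a_k, c_k \in \mathbb{R}^{32},\; b_k, d_k \in \mathbb{R}^{64} \\
  & \Rightarrow\; K\,(32 + 64 + 32 + 64 + 1) = 193\,K \text{ trainable scalars} \\[4pt]
\text{Step ratio}:\quad
  & 4096 / 193 \approx 21.2
\end{align*}
```

The tensorization is consistent with the matrix shape because $32 \cdot 64 \cdot 32 \cdot 64 = 2048 \cdot 2048$, so the four-way tensor $\Delta\mathcal{W}$ has exactly as many entries as the $2048{\times}2048$ projection it updates.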

Problem

Research questions and friction points this paper is trying to address.

low-rank PEFT

parameter budget

fine-grained capacity

accuracy-budget trade-off

tensorized adapters

Innovation

Methods, ideas, or system contributions that make the work stand out.

CP tensor adapters

low-rank PEFT

fine-grained parameter budget