🤖 AI Summary
LoRA’s low-rank assumption in parameter-efficient fine-tuning (PEFT) limits its capacity to approximate weight update matrices with high effective rank—e.g., spectrally flat or high-frequency–rich structures—thereby hindering performance on multimodal and large language models (LLMs). To address this, we propose KRAdapter, the first PEFT method leveraging the Khatri-Rao product to induce tensor-structured adapter weights. This enables high effective-rank approximations with minimal parameter overhead, overcoming LoRA’s expressivity bottleneck. KRAdapter preserves linear computational and memory complexity, and is compatible with both vision-language models and LLMs (validated up to 8B parameters). On synthetic spectral analysis benchmarks and unseen commonsense reasoning tasks, it significantly outperforms state-of-the-art PEFT methods. Crucially, KRAdapter achieves a superior trade-off among accuracy, parameter efficiency, and inference latency.
📝 Abstract
Parameter-efficient fine-tuning (PEFT) has become a standard approach for adapting large pre-trained models. Among PEFT methods, low-rank adaptation (LoRA) has achieved notable success. However, recent studies have highlighted its limitations compared with full-rank alternatives, particularly when applied to multimodal and large language models. In this work, we present a quantitative comparison between full-rank and low-rank PEFT methods using a synthetic matrix approximation benchmark with controlled spectral properties. Our results confirm that LoRA struggles to approximate matrices with relatively flat spectra or high-frequency components -- hallmarks of a high effective rank. To address this limitation, we introduce KRAdapter, a novel PEFT algorithm that leverages the Khatri-Rao product to produce weight updates, which, by construction, tend to be matrix products with a high effective rank. We demonstrate performance gains with KRAdapter on vision-language models up to 1B parameters and on large language models up to 8B parameters, particularly on unseen common-sense reasoning tasks. In addition, KRAdapter maintains the memory and compute efficiency of LoRA, making it a practical and robust alternative for fine-tuning billion-scale models.
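The rank argument above can be illustrated numerically. The sketch below is not the paper's exact parameterization; it uses the row-wise Khatri-Rao (face-splitting) product, where a product of two Khatri-Rao factors can reach rank up to r² while a LoRA-style product of the same factor shapes is capped at r. The function name `face_split` and the dimensions are illustrative assumptions.

```python
import numpy as np

def face_split(a, b):
    # Row-wise Khatri-Rao (face-splitting) product:
    # row i of the result is the Kronecker product of row i of a and row i of b.
    # Shapes: (m, p) x (m, q) -> (m, p*q)
    return np.einsum('ip,iq->ipq', a, b).reshape(a.shape[0], -1)

rng = np.random.default_rng(0)
d, r = 64, 4

# LoRA-style update: rank is bounded by r by construction.
lora_update = rng.standard_normal((d, r)) @ rng.standard_normal((r, d))

# Khatri-Rao-structured update (illustrative, not the paper's exact
# construction): the same order of trainable parameters, but the product of
# two row-wise Khatri-Rao factors generically has rank min(r*r, d).
a1, a2 = rng.standard_normal((d, r)), rng.standard_normal((d, r))
b1, b2 = rng.standard_normal((d, r)), rng.standard_normal((d, r))
kr_update = face_split(a1, a2) @ face_split(b1, b2).T

print("LoRA rank:      ", np.linalg.matrix_rank(lora_update))  # capped at r
print("Khatri-Rao rank:", np.linalg.matrix_rank(kr_update))    # up to r*r
```

Running this shows the Khatri-Rao-structured update attaining a strictly higher rank than the LoRA update at a comparable parameter budget, which is the mechanism the abstract credits for the expressivity gain.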