AI Summary
Orthogonal fine-tuning methods exhibit strong generalization but suffer from high computational and memory overhead. To address this, we propose HOFT, a Householder-transform-based orthogonal fine-tuning framework, and its scaled variant SHOFT. HOFT introduces a novel paradigm that parameterizes orthogonal updates via Householder matrices; we theoretically establish its convergence properties and scaling behavior, and further integrate low-rank approximation with gradient-constrained optimization. Empirically, HOFT and SHOFT match or surpass state-of-the-art methods, including LoRA and QLoRA, on diverse tasks: commonsense reasoning, machine translation, subject-driven generation, and mathematical reasoning. They reduce training memory consumption by 37% and inference latency by 29%. Our core contribution is the first efficient orthogonal fine-tuning paradigm that simultaneously achieves strong generalization and substantial efficiency gains, advancing both the theoretical understanding and the practical deployment of orthogonal adaptation in large language models.
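To make the core idea concrete, here is a minimal NumPy sketch of parameterizing an orthogonal update as a product of Householder reflections and applying it to a frozen weight matrix. This is an illustrative toy, not the paper's exact HOFT construction; the dimensions, the number of reflections `r`, and the helper `householder` are assumptions for demonstration.

```python
import numpy as np

def householder(v):
    """Householder reflection H = I - 2 v v^T / (v^T v); H is orthogonal."""
    v = v / np.linalg.norm(v)
    return np.eye(len(v)) - 2.0 * np.outer(v, v)

rng = np.random.default_rng(0)
d, r = 8, 4  # toy dimension and number of reflections (illustrative choices)

# Orthogonal update Q built as a product of r Householder reflections.
# In a HOFT-style scheme, the reflection vectors would be the trainable parameters.
Q = np.eye(d)
for _ in range(r):
    Q = Q @ householder(rng.standard_normal(d))

W = rng.standard_normal((d, d))  # stand-in for a frozen pretrained weight
W_adapted = Q @ W                # orthogonal fine-tuning: rotate/reflect W

# Orthogonality of Q implies the update preserves norms and pairwise angles
# of W's columns, the property behind orthogonal fine-tuning's generalization.
assert np.allclose(Q.T @ Q, np.eye(d), atol=1e-10)
```

Because each Householder reflection is rank-one in its parameterization (a single vector per reflection), a product of a few reflections gives an expressive orthogonal transform with far fewer parameters than a dense orthogonal matrix, which is the efficiency angle the summary describes.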
Abstract
Adaptation of foundation models using low-rank methods is a widespread approach. Another way to adapt these models is to employ orthogonal fine-tuning methods, which are less time- and memory-efficient despite their good generalization properties. In this work, we propose Householder Orthogonal Fine-tuning (HOFT), a novel orthogonal fine-tuning method that aims to alleviate this time and space complexity. Moreover, we explore some theoretical properties of the orthogonal fine-tuning paradigm. From this exploration, Scaled Householder Orthogonal Fine-tuning (SHOFT) is proposed. Both HOFT and SHOFT are evaluated on downstream tasks, namely commonsense reasoning, machine translation, subject-driven generation, and mathematical reasoning. Compared with state-of-the-art adaptation methods, HOFT and SHOFT show comparable or better results.