LoRi: Low-Rank Distillation for Implicit Reasoning

📅 2026-06-03
📈 Citations: 0
Influential: 0

🤖 AI Summary
Implicit Chain-of-Thought (iCoT) methods try to internalize reasoning in large language models but often underperform explicit Chain-of-Thought (CoT) prompting. Starting from the empirical observation that hidden-state reasoning trajectories are low-rank, this work proposes a low-rank distillation framework that aligns the first- and second-order statistics of teacher and student trajectories in a shared low-rank tensor subspace. This captures the global structure of reasoning while keeping the latent reasoning process compact. Evaluated on mathematical reasoning benchmarks across model families including LLaMA and Qwen at several scales, the method consistently improves performance, especially on challenging multi-step tasks, where it approaches explicit CoT accuracy and outperforms prior iCoT distillation methods.
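One way to check whether a hidden-state trajectory is low-rank is to look at how quickly its singular values decay. The sketch below is illustrative only and is not taken from the paper: the function name `effective_rank`, the 95% energy threshold, and the synthetic trajectories are all assumptions, and a real experiment would use hidden states collected from a model's reasoning steps.

```python
import numpy as np


def effective_rank(trajectory, energy=0.95):
    """Number of singular directions needed to capture `energy` of the
    squared singular values of the mean-centered trajectory.

    trajectory: array of shape (steps, hidden_dim).
    The name and the 0.95 threshold are hypothetical choices.
    """
    centered = trajectory - trajectory.mean(axis=0, keepdims=True)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    cumulative = np.cumsum(singular_values**2) / np.sum(singular_values**2)
    # First index where the cumulative energy reaches the threshold, as a count.
    return int(np.searchsorted(cumulative, energy) + 1)


rng = np.random.default_rng(0)
steps, hidden_dim, true_rank = 64, 256, 4

# Synthetic stand-in for a reasoning trajectory: rank-4 signal plus small noise.
low_rank = rng.normal(size=(steps, true_rank)) @ rng.normal(size=(true_rank, hidden_dim))
noisy_low_rank = low_rank + 0.01 * rng.normal(size=(steps, hidden_dim))

# Unstructured baseline with no low-rank structure.
unstructured = rng.normal(size=(steps, hidden_dim))

print(effective_rank(noisy_low_rank), effective_rank(unstructured))
```

A trajectory with low-rank structure needs only a handful of directions to reach the energy threshold, while the unstructured baseline needs many more.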
📝 Abstract
Implicit chain-of-thought (iCoT) methods aim to internalize reasoning in large language models, but often underperform explicit CoT prompting. We empirically find that hidden-state reasoning trajectories exhibit low-rank structure. Motivated by this observation, we propose a low-rank distillation framework that transfers reasoning by aligning teacher and student trajectories in a shared low-rank tensor subspace using first- and second-order statistics. The resulting formulation captures the global structure of reasoning while supporting a compact latent reasoning process. We evaluate the method across multiple model families, including LLaMA and Qwen, at different scales on mathematical reasoning benchmarks. Our approach consistently improves performance, especially on challenging multi-step tasks, approaching explicit CoT accuracy and outperforming prior iCoT distillation methods.
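The alignment of first- and second-order statistics can be sketched as a moment-matching loss. This is a minimal illustration under stated assumptions, not the paper's formulation: the projection matrices `p_teacher` and `p_student`, the weight `lam`, and the use of a plain matrix subspace (rather than the tensor subspace the abstract mentions) are all hypothetical simplifications. Teacher and student hidden states are projected into a shared `r`-dimensional space, and the loss compares their means (first order) and covariances (second order).

```python
import numpy as np


def moments(z):
    """Mean and covariance of a trajectory z with shape (steps, r)."""
    mean = z.mean(axis=0)
    centered = z - mean
    cov = centered.T @ centered / z.shape[0]
    return mean, cov


def alignment_loss(h_teacher, h_student, p_teacher, p_student, lam=1.0):
    """Squared distance between teacher and student moments in a shared subspace.

    h_teacher: (steps_t, d_teacher) teacher hidden states.
    h_student: (steps_s, d_student) student hidden states.
    p_teacher: (d_teacher, r) and p_student: (d_student, r) are hypothetical
    projections into the shared low-rank space; `lam` weights the
    second-order term. All names here are assumptions for illustration.
    """
    z_teacher = h_teacher @ p_teacher
    z_student = h_student @ p_student
    mean_t, cov_t = moments(z_teacher)
    mean_s, cov_s = moments(z_student)
    first_order = np.sum((mean_t - mean_s) ** 2)
    second_order = np.sum((cov_t - cov_s) ** 2)  # squared Frobenius norm
    return float(first_order + lam * second_order)


rng = np.random.default_rng(0)
r = 3
h_teacher = rng.normal(size=(20, 16))  # longer trajectory, wider model
h_student = rng.normal(size=(5, 8))    # shorter trajectory, narrower model
p_teacher = rng.normal(size=(16, r))
p_student = rng.normal(size=(8, r))

print(alignment_loss(h_teacher, h_student, p_teacher, p_student))
```

Because means and covariances are computed over steps, the two trajectories do not need the same length or hidden size, which is one reason moment matching is a natural fit when the student's latent reasoning is more compact than the teacher's.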
Problem

Research questions and friction points this paper is trying to address.

implicit chain-of-thought
reasoning
large language models
low-rank structure
distillation
Innovation

Methods, ideas, or system contributions that make the work stand out.

low-rank distillation
implicit chain-of-thought
reasoning trajectory
tensor subspace alignment
knowledge distillation