Geometric Latent Reasoning Induces Shorter Generations in LLMs

📅 2026-06-01

🤖 AI Summary
This work addresses the high computational cost and output-length sensitivity of large language models that rely on explicit chain-of-thought reasoning. It formulates latent reasoning as a geometric path-approximation problem in the model's pretrained token-embedding space. A lightweight transition head predicts iterative direction updates to a continuous latent state, so that a compressed continuous trajectory approximates the discrete reasoning process, with textual chains of thought serving as anchor points. The framework requires no explicit length objective. Evaluated with Qwen3 models on mathematical reasoning benchmarks, it substantially reduces the number of generation steps while often still reaching correct answers, exposing a new trade-off among latent computation budget, output length, and accuracy.
📝 Abstract
Large language models solve complex problems by generating lengthy chains of explicit reasoning tokens. Although effective, this makes reasoning expensive, length-sensitive, and constrained to discrete natural language. Latent reasoning offers a continuous alternative, but determining useful structures for intermediate latent states remains an open challenge. In this paper, we formulate latent reasoning as a geometric path-approximation problem within the model's pretrained token-embedding space. We introduce Geometric Latent Reasoning (GLR), which uses a lightweight transition head to predict iterative direction updates in embedding space. Using textual chain-of-thought traces as anchors, GLR learns to approximate discrete reasoning trajectories while permitting continuous deviations from exact token embeddings. Evaluations on mathematical reasoning benchmarks using Qwen3 models reveal an emergent phenomenon: geometric latent reasoning induces substantially shorter generations without an explicit length objective. By replacing early explicit reasoning with continuous latent steps, models often reach correct answers in far fewer total generation steps. These findings suggest that continuous trajectories act as compact intermediate reasoning states, exposing a new tradeoff between latent computation budget, output length, and accuracy.
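To make the idea in the abstract concrete, the following is a minimal illustrative sketch, not the paper's implementation. It assumes a two-layer MLP as the transition head, a fixed step size, unit-normalized directions, and an evenly spaced alignment between latent steps and chain-of-thought anchor embeddings. The names `transition_head`, `latent_rollout`, and `anchor_loss`, and all hyperparameters, are hypothetical.

```python
import numpy as np

def transition_head(h, W1, W2):
    """Hypothetical two-layer MLP mapping a latent state to a unit direction."""
    d = np.tanh(h @ W1) @ W2
    return d / (np.linalg.norm(d) + 1e-8)

def latent_rollout(h0, W1, W2, num_steps, step_size=0.1):
    """Iteratively move the latent state along the predicted directions."""
    states = [h0]
    h = h0
    for _ in range(num_steps):
        h = h + step_size * transition_head(h, W1, W2)
        states.append(h)
    return np.stack(states)  # shape: (num_steps + 1, dim)

def anchor_loss(states, anchors):
    """Mean squared distance between latent states and evenly spaced anchors.

    `anchors` stands in for the token embeddings of a textual chain of thought
    (assumed alignment: latent step k is matched to an evenly spaced token).
    """
    idx = np.linspace(0, anchors.shape[0] - 1, states.shape[0]).round().astype(int)
    return float(np.mean(np.sum((states - anchors[idx]) ** 2, axis=-1)))

# Toy usage with random weights and random "token embeddings".
rng = np.random.default_rng(0)
dim, hidden = 8, 16
W1 = rng.normal(size=(dim, hidden))
W2 = rng.normal(size=(hidden, dim))
cot_embeddings = rng.normal(size=(20, dim))   # 20 explicit reasoning tokens
states = latent_rollout(cot_embeddings[0], W1, W2, num_steps=4)
loss = anchor_loss(states, cot_embeddings)    # 5 latent states vs. 20 anchors
```

In this toy setup, five latent states stand in for a 20-token trace, which illustrates the compression the abstract describes; the loss only pulls the trajectory toward the anchors and does not force states to coincide with exact token embeddings.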
Problem

Research questions and friction points this paper is trying to address.

latent reasoning
geometric path approximation
chain-of-thought
continuous reasoning
generation length
Innovation

Methods, ideas, or system contributions that make the work stand out.

Geometric Latent Reasoning
embedding space
continuous reasoning
chain-of-thought
generation length reduction