Latent Planning via Embedding Arithmetic: A Contrastive Approach to Strategic Reasoning

2025-11-12
AI Summary
This work addresses strategic reasoning in high-dimensional decision spaces by proposing a planning paradigm that avoids policy/value networks and explicit dynamics models, and instead performs lightweight vector-arithmetic planning directly within an evaluation-aligned embedding space. The method constructs this space via supervised contrastive learning, so that states with similar outcomes lie close together; introduces a single global advantage vector that points from losing to winning regions; and ranks candidate actions by their alignment with that vector, which reduces planning to vector operations in latent space. Experiments in chess show that a shallow search guided by the learned embedding reaches competitive strength under constrained conditions while reducing the computational overhead of planning. The core contribution is recasting strategic planning as a geometric alignment problem in embedding space, offering a lightweight, interpretable alternative to traditional dynamics models or policy learning.

Abstract
Planning in high-dimensional decision spaces is increasingly being studied through the lens of learned representations. Rather than training policies or value heads, we investigate whether planning can be carried out directly in an evaluation-aligned embedding space. We introduce SOLIS, which learns such a space using supervised contrastive learning. In this representation, outcome similarity is captured by proximity, and a single global advantage vector orients the space from losing to winning regions. Candidate actions are then ranked according to their alignment with this direction, reducing planning to vector operations in latent space. We demonstrate this approach in chess, where SOLIS uses only a shallow search guided by the learned embedding to reach competitive strength under constrained conditions. More broadly, our results suggest that evaluation-aligned latent planning offers a lightweight alternative to traditional dynamics models or policy learning.
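The ranking step described in the abstract can be sketched in a few lines. This is only an illustration under stated assumptions, not the SOLIS implementation: the advantage vector is assumed here to be the normalized difference between the mean winning and mean losing embeddings, alignment is assumed to be cosine similarity, and the function names, move labels, and 2-D toy embeddings are all hypothetical.

```python
import math

def mean_vector(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def unit(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]

def advantage_vector(winning, losing):
    # ASSUMPTION: the global direction runs from the mean losing embedding
    # to the mean winning embedding. The source does not state how the
    # advantage vector is actually computed.
    w, l = mean_vector(winning), mean_vector(losing)
    return unit([a - b for a, b in zip(w, l)])

def rank_actions(candidates, advantage):
    """Sort actions best-first by cosine similarity between the embedding
    of each resulting state and the (unit-length) advantage direction."""
    def score(embedding):
        return sum(a * b for a, b in zip(unit(embedding), advantage))
    return sorted(candidates, key=lambda a: score(candidates[a]), reverse=True)

# Hypothetical 2-D embeddings; a real encoder would produce many more dimensions.
winning = [[1.0, 0.2], [0.8, 0.0]]
losing = [[-1.0, 0.1], [-0.9, -0.1]]
adv = advantage_vector(winning, losing)

# Embedding of the state reached by each candidate move (made-up values).
candidates = {"e2e4": [0.7, 0.1], "a2a3": [0.0, 0.3], "f2f3": [-0.6, 0.2]}
ranking = rank_actions(candidates, adv)
print(ranking)
```

In a shallow search, a scoring function of this kind would presumably be applied at the leaves of the search tree; how SOLIS combines the scores with search is not described in this listing.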
Problem

Research questions and friction points this paper is trying to address.

Planning directly in learned embedding space instead of policy training
Using contrastive learning to create outcome-aligned strategic representations
Reducing complex planning to vector operations in latent space
Innovation

Methods, ideas, or system contributions that make the work stand out.

Planning via embedding arithmetic in latent space
Using supervised contrastive learning for representation
Ranking actions by alignment with global advantage vector
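To make the representation-learning step concrete, the following is a minimal sketch of a commonly used supervised contrastive loss, in which embeddings that share a label are pulled together and all others are pushed apart. Whether SOLIS uses this exact form is an assumption; the outcome labels (1 for winning, 0 for losing), the temperature value, and the toy embeddings are likewise hypothetical.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def supcon_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss averaged over anchors that have at
    least one positive (another sample with the same label)."""
    z = [l2_normalize(v) for v in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [p for p in range(n) if p != i and labels[p] == labels[i]]
        if not positives:
            continue  # an anchor without positives contributes nothing
        # Scaled similarities between the anchor and every other sample.
        logits = {a: dot(z[i], z[a]) / temperature for a in range(n) if a != i}
        # Log-sum-exp with max subtraction for numerical stability.
        m = max(logits.values())
        log_denom = m + math.log(sum(math.exp(s - m) for s in logits.values()))
        total += -sum(logits[p] - log_denom for p in positives) / len(positives)
        anchors += 1
    if anchors == 0:
        raise ValueError("no anchor has a positive sample")
    return total / anchors

# Toy embeddings forming two tight clusters (made-up values).
embeddings = [[1.0, 0.0], [0.9, 0.1], [-1.0, 0.0], [-0.9, -0.1]]

# ASSUMPTION: labels are coarse outcome classes, 1 = winning, 0 = losing.
aligned = supcon_loss(embeddings, [1, 1, 0, 0])   # labels match the clusters
shuffled = supcon_loss(embeddings, [1, 0, 1, 0])  # labels cut across clusters
print(aligned < shuffled)
```

The loss is low when same-outcome states already sit close together and high when they do not, which is the property an outcome-aligned embedding space relies on.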
Andrew Hamara
Department of Computer Science, Baylor University, Waco, TX 76798, USA
Greg Hamerly
Department of Computer Science, Baylor University, Waco, TX 76798, USA
Pablo Rivas
Computer Science, Baylor University
Deep Learning, Computer Vision, Machine Learning, Quantum ML, Remote Sensing
Andrew C. Freeman
Baylor University
Multimedia, video streaming, computer vision, neuromorphic systems