AI Summary
This work addresses strategic reasoning in high-dimensional decision spaces by proposing a planning paradigm that eschews policy/value networks and explicit dynamics models: lightweight vector-arithmetic planning carried out directly in a semantically aligned embedding space. The method constructs an evaluation embedding space via supervised contrastive learning, so that semantic similarity of outcomes corresponds to Euclidean proximity in the embedding space; introduces a global advantage vector to rank actions; and formulates planning as a vector alignment problem in the latent space. Experiments demonstrate that a shallow search suffices to achieve strong adversarial performance in chess, substantially reducing planning overhead. The core contribution is the reduction of strategic planning to a geometric alignment problem in embedding space, yielding a scalable, interpretable pathway for autonomous reasoning in large language models.
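The supervised contrastive objective mentioned above can be illustrated with a minimal sketch. This is not the authors' implementation: it is the standard supervised contrastive loss (treating embeddings of positions with the same outcome label as positives), written in plain NumPy over L2-normalized embeddings.

```python
import numpy as np

def supcon_loss(embs: np.ndarray, labels: list, temperature: float = 0.1) -> float:
    """Supervised contrastive loss (sketch): pull same-label embeddings
    together, push different-label embeddings apart."""
    # L2-normalize so similarity is a dot product on the unit sphere
    z = embs / np.linalg.norm(embs, axis=1, keepdims=True)
    sim = z @ z.T / temperature
    n = len(labels)
    total = 0.0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue
        denom = sum(np.exp(sim[i, j]) for j in range(n) if j != i)
        # average log-likelihood of each positive against all other samples
        total += -sum(np.log(np.exp(sim[i, j]) / denom)
                      for j in positives) / len(positives)
    return total / n

# toy check: clustered-by-label embeddings should score lower (better)
# than the same embeddings with scrambled labels
embs = np.array([[1.0, 0.0], [0.99, 0.1], [-1.0, 0.0], [-0.99, 0.1]])
loss_aligned = supcon_loss(embs, [0, 0, 1, 1])
loss_scrambled = supcon_loss(embs, [0, 1, 0, 1])
print(loss_aligned < loss_scrambled)  # True: coherent labels give lower loss
```

Training with such a loss encourages the geometry the summary describes: outcomes that are semantically similar end up close in the embedding space.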
Abstract
Planning in high-dimensional decision spaces is increasingly being studied through the lens of learned representations. Rather than training policies or value heads, we investigate whether planning can be carried out directly in an evaluation-aligned embedding space. We introduce SOLIS, which learns such a space using supervised contrastive learning. In this representation, outcome similarity is captured by proximity, and a single global advantage vector orients the space from losing to winning regions. Candidate actions are then ranked according to their alignment with this direction, reducing planning to vector operations in latent space. We demonstrate this approach in chess, where SOLIS uses only a shallow search guided by the learned embedding to reach competitive strength under constrained conditions. More broadly, our results suggest that evaluation-aligned latent planning offers a lightweight alternative to traditional dynamics models or policy learning.
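The ranking step described above ("candidate actions are ranked according to their alignment with this direction") reduces to a dot product. The sketch below is an illustrative assumption, not SOLIS itself: the advantage vector is taken as the normalized difference between the mean embeddings of winning and losing outcomes, and candidates are sorted by their projection onto it.

```python
import numpy as np

def advantage_vector(win_embs: np.ndarray, loss_embs: np.ndarray) -> np.ndarray:
    """Unit vector pointing from losing toward winning regions
    (hypothetical construction via class means)."""
    v = win_embs.mean(axis=0) - loss_embs.mean(axis=0)
    return v / np.linalg.norm(v)

def rank_actions(cand_embs: np.ndarray, v: np.ndarray) -> np.ndarray:
    """Indices of candidate action embeddings, best-aligned first."""
    scores = cand_embs @ v  # projection onto the advantage direction
    return np.argsort(-scores)

# toy usage with 4-dimensional embeddings
rng = np.random.default_rng(0)
wins = rng.normal(1.0, 0.1, size=(8, 4))
losses = rng.normal(-1.0, 0.1, size=(8, 4))
v = advantage_vector(wins, losses)
cands = np.stack([np.full(4, 0.9), np.zeros(4), np.full(4, -0.9)])
order = rank_actions(cands, v)
print(order)  # candidate nearest the winning direction ranks first
```

Because scoring a candidate is a single dot product, a shallow search over successor embeddings stays cheap compared with rolling out a learned dynamics model or evaluating a policy/value network per node.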