Learning to Plan via Supervised Contrastive Learning and Strategic Interpolation: A Chess Case Study

2025-06-05
AI Summary
Background: Human chess players rely primarily on intuitive pattern recognition rather than exhaustive tree search for planning. Method: We propose an embedding-based planning paradigm that avoids deep tree search. Specifically, we construct a board-state embedding space via supervised contrastive learning (SCL), in which distance reflects evaluation similarity; strategic interpolation enables directed navigation toward higher-value regions. Board states are embedded by a Transformer encoder, planning is performed within this latent space, and candidate moves are validated with a 6-ply beam search. Contribution/Results: This work combines SCL and strategic interpolation to build an interpretable, navigable latent space for planning. Move selection takes place within the embedding space, without deep search trees. Experiments show the model achieves an estimated Elo rating of 2593; performance improves with both model size and embedding dimensionality. Trajectory visualizations show interpretable transitions between game states, consistent with the intuition-driven planning the approach aims to model.

๐Ÿ“ Abstract
Modern chess engines achieve superhuman performance through deep tree search and regressive evaluation, while human players rely on intuition to select candidate moves followed by a shallow search to validate them. To model this intuition-driven planning process, we train a transformer encoder using supervised contrastive learning to embed board states into a latent space structured by positional evaluation. In this space, distance reflects evaluative similarity, and visualized trajectories display interpretable transitions between game states. We demonstrate that move selection can occur entirely within this embedding space by advancing toward favorable regions, without relying on deep search. Despite using only a 6-ply beam search, our model achieves an estimated Elo rating of 2593. Performance improves with both model size and embedding dimensionality, suggesting that latent planning may offer a viable alternative to traditional search. Although we focus on chess, the proposed embedding-based planning method can be generalized to other perfect-information games where state evaluations are learnable. All source code is available at https://github.com/andrewhamara/SOLIS.
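
The contrastive objective described in the abstract can be illustrated with a minimal sketch. It uses the standard supervised contrastive (SupCon) loss form and assumes, purely for illustration, that two positions count as a positive pair when their evaluations differ by at most a margin. The function names, the margin, the temperature, and the pairing rule are assumptions, not details taken from the paper or its repository.

```python
import math

def normalize(v):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def supcon_loss(embeddings, evals, margin=0.05, temperature=0.1):
    """Supervised contrastive loss over one batch of board embeddings.

    Assumption: positions i and j are positives when |eval_i - eval_j| <= margin.
    Anchors with no positive in the batch are skipped.
    """
    z = [normalize(e) for e in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n)
                     if j != i and abs(evals[i] - evals[j]) <= margin]
        if not positives:
            continue
        # Temperature-scaled similarities between anchor i and every other sample.
        logits = {j: dot(z[i], z[j]) / temperature for j in range(n) if j != i}
        # Log-sum-exp with max subtraction for numerical stability.
        m = max(logits.values())
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits.values()))
        # Average negative log-probability assigned to the positives.
        total += -sum(logits[p] - log_denom for p in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0

# Toy batch: two positions near one axis, two near the other (hypothetical values).
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
aligned_evals = [0.90, 0.88, 0.10, 0.12]     # neighbours share evaluations
misaligned_evals = [0.90, 0.10, 0.88, 0.12]  # neighbours disagree
aligned = supcon_loss(toy_embeddings, aligned_evals)
misaligned = supcon_loss(toy_embeddings, misaligned_evals)
```

In this toy batch the loss is lower when nearby embeddings share similar evaluations, which is the structure the abstract describes for the learned space.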

Problem

Research questions and friction points this paper is trying to address.

Model intuition-driven chess move selection without deep search
Embed board states into evaluative-similarity-structured latent space
Generalize embedding-based planning to perfect-information games

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses supervised contrastive learning for embeddings
Employs strategic interpolation in latent space
Relies on a shallow 6-ply beam search instead of deep tree search
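
To make the idea of moving toward favorable regions of the latent space concrete, here is a simplified sketch. It assumes that a direction of advantage can be estimated as the difference between the mean embedding of favorable positions and the mean embedding of unfavorable ones, and that each candidate move is scored by how well the embedding of its resulting position aligns with that direction. This is an illustrative stand-in, not the authors' strategic interpolation procedure; all function names, move labels, and vectors are hypothetical, and the beam search is omitted.

```python
import math

def mean_vector(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def unit(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def advantage_direction(favorable, unfavorable):
    """Unit vector pointing from the unfavorable region toward the favorable one."""
    good, bad = mean_vector(favorable), mean_vector(unfavorable)
    return unit([g - b for g, b in zip(good, bad)])

def select_move(candidates, direction):
    """Pick the move whose successor embedding aligns best with the direction.

    candidates maps a move label to the embedding of the position it leads to.
    """
    def score(item):
        _, embedding = item
        return sum(x * d for x, d in zip(unit(embedding), direction))
    best_move, _ = max(candidates.items(), key=score)
    return best_move

# Hypothetical 2-D embeddings standing in for encoder outputs.
favorable = [[1.0, 0.0], [0.9, 0.2]]
unfavorable = [[0.0, 1.0], [0.2, 0.9]]
direction = advantage_direction(favorable, unfavorable)
candidates = {"Nf3": [0.8, 0.2], "h4": [0.3, 0.7]}
chosen = select_move(candidates, direction)
```

A shallow search, such as the 6-ply beam search mentioned above, could then be applied to the top-scoring candidates rather than to every legal move.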