Tracking vs. Deciding: The Dual-Capability Bottleneck in Searchless Chess Transformers

📅 2026-03-31
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses a key performance bottleneck in searchless chess Transformers: when training on move sequences alone, state tracking and decision quality impose conflicting data requirements. The authors formalize this as a dual-capability bottleneck, scale the model from 28M to 120M parameters to improve tracking, and apply Elo-weighted training to improve decisions while preserving data diversity; a factorial ablation shows the combined gain is superadditive. The resulting model plays human-like chess without search, taking the full move history as input. The study also introduces a coverage-decay formula as a reliability horizon for intra-game degeneration risk. The final 120M-parameter model reached a Lichess bullet rating of 2570 over 253 rated games and achieves 55.2% Top-1 accuracy on human move prediction, exceeding Maia-2 rapid and Maia-2 blitz.
📝 Abstract
A human-like chess engine should mimic the style, errors, and consistency of a strong human player rather than maximize playing strength. We show that training from move sequences alone forces a model to learn two capabilities: state tracking, which reconstructs the board from move history, and decision quality, which selects good moves from that reconstructed state. These impose contradictory data requirements: low-rated games provide the diversity needed for tracking, while high-rated games provide the quality signal for decision learning. Removing low-rated data degrades performance. We formalize this tension as a dual-capability bottleneck, P ≤ min(T, Q), where overall performance is limited by the weaker capability. Guided by this view, we scale the model from 28M to 120M parameters to improve tracking, then introduce Elo-weighted training to improve decisions while preserving diversity. A 2 × 2 factorial ablation shows that scaling improves tracking, weighting improves decisions, and their combination is superadditive. Linear weighting works best, while overly aggressive weighting harms tracking despite lower validation loss. We also introduce a coverage-decay formula, t* = log(N / k_crit) / log b, as a reliability horizon for intra-game degeneration risk. Our final 120M-parameter model, without search, reached Lichess bullet 2570 over 253 rated games. On human move prediction it achieves 55.2% Top-1 accuracy, exceeding Maia-2 rapid and Maia-2 blitz. Unlike position-based methods, sequence input naturally encodes full game history, enabling history-dependent decisions that single-position models cannot exhibit.
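The quantities named in the abstract can be illustrated with a small Python sketch. The abstract does not define N, k_crit, or b, so the reading used here is an assumption: N is the number of training games, b an effective per-ply branching factor, and k_crit the minimum number of games that must still cover a line for it to count as reliable. The function names, the Elo range, and the weight floor in `linear_elo_weight` are hypothetical; the abstract only says that linear weighting worked best, not what form it took.

```python
import math


def reliability_horizon(n_games: float, k_crit: float, branching: float) -> float:
    """Coverage-decay horizon t* = log(N / k_crit) / log(b).

    Assumed reading (not stated in the abstract): after t plies roughly
    N / b**t games still cover a given line, and t* is the ply at which
    that coverage falls to k_crit.
    """
    if n_games <= 0 or k_crit <= 0:
        raise ValueError("n_games and k_crit must be positive")
    if branching <= 1:
        raise ValueError("branching must be greater than 1")
    return math.log(n_games / k_crit) / math.log(branching)


def bottleneck_bound(tracking: float, quality: float) -> float:
    """Upper bound on performance under P <= min(T, Q)."""
    return min(tracking, quality)


def linear_elo_weight(
    elo: float,
    elo_min: float = 1000.0,  # hypothetical lower end of the rating range
    elo_max: float = 2500.0,  # hypothetical upper end of the rating range
    floor: float = 0.1,       # hypothetical floor so low-rated games still count
) -> float:
    """Illustrative linear loss weight that rises with player rating.

    The nonzero floor keeps low-rated games in the training signal,
    which is one possible way to preserve diversity for tracking.
    """
    frac = (elo - elo_min) / (elo_max - elo_min)
    frac = max(0.0, min(1.0, frac))  # clamp ratings outside the range
    return floor + (1.0 - floor) * frac


if __name__ == "__main__":
    # Made-up inputs, chosen only to exercise the functions.
    print(reliability_horizon(n_games=1e6, k_crit=1.0, branching=10.0))
    print(bottleneck_bound(tracking=0.9, quality=0.6))
    print(linear_elo_weight(1750.0))
```

Under this reading, the horizon grows only logarithmically with the number of games, so large increases in data buy few extra reliable plies, and raising the weaker of T and Q is the only way to lift the bound on P.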
Problem

Research questions and friction points this paper is trying to address.

dual-capability bottleneck
state tracking
decision quality
searchless chess
move sequence modeling
Innovation

Methods, ideas, or system contributions that make the work stand out.

dual-capability bottleneck
Elo-weighted training
searchless chess transformer
state tracking
coverage-decay formula
Quanhao Li
Fudan University
Computer Vision, Video Generation
Wei Jiang
Shanghai Soong Ching Ling School