🤖 AI Summary
This work investigates how a recurrent neural network (RNN) spontaneously develops internal planning when trained on a complex sequential decision-making task with irreversible actions, exemplified by Sokoban. Using behavioral analysis, causal interventions on neural activations, and evaluation on out-of-distribution levels, we demonstrate that the trained RNN learns a causal plan representation that predicts its future actions roughly 50 steps in advance, with the quality and length of the plan improving over the first few steps of a level. We also identify an emergent "pacing" behavior: the model walks in cycles at the start of a level to give itself extra computation, extending its effective planning horizon and improving solution quality, and we show this behavior is incentivized by training. The learned plan representations are robust and generalize to out-of-distribution puzzles substantially larger than those seen in training. To foster reproducibility and further research, we publicly release all models and code. This study establishes the network as a model organism for probing and characterizing learned planning in neural networks.
📝 Abstract
Planning is essential for solving complex tasks, yet the internal mechanisms underlying planning in neural networks remain poorly understood. Building on prior work, we analyze a recurrent neural network (RNN) trained on Sokoban, a challenging puzzle requiring sequential, irreversible decisions. We find that the RNN has a causal plan representation that predicts its future actions about 50 steps in advance. The quality and length of the represented plan increase over the first few steps. We uncover a surprising behavior: the RNN "paces" in cycles to give itself extra computation at the start of a level, and we show that this behavior is incentivized by training. Leveraging these insights, we extend the trained RNN to significantly larger, out-of-distribution Sokoban puzzles, demonstrating robust representations beyond the training regime. We open-source our model and code, and believe the neural network's interesting behavior makes it an excellent model organism for deepening our understanding of learned planning.
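The kind of plan-representation claim made above is typically tested with linear probes: a linear readout is fit from the RNN's hidden state at time t to the action the agent takes k steps later, and high held-out probe accuracy is evidence that the future action is encoded in the state. Below is a minimal, self-contained sketch of that idea on fabricated data; the hidden states, the horizon `k`, and the noise level are all synthetic stand-ins (in the actual study the states would come from the trained Sokoban RNN and the labels from its future actions), not the authors' pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)
T, H, A = 600, 32, 4   # timesteps, hidden size, number of actions (e.g. up/down/left/right)
k = 10                 # probe horizon: decode the action taken k steps in the future

# Synthetic data: by construction, the hidden state at time t noisily
# encodes the action the agent will take at time t + k.
actions = rng.integers(0, A, size=T)
W_true = rng.normal(size=(A, H))                    # hypothetical encoding directions
X = W_true[actions[k:]] + 0.1 * rng.normal(size=(T - k, H))
y = actions[k:]

# Fit a linear probe by least squares on one-hot targets (train/test split).
n_train = len(y) * 3 // 4
Y_train = np.eye(A)[y[:n_train]]
W_probe, *_ = np.linalg.lstsq(X[:n_train], Y_train, rcond=None)

# Held-out accuracy: high accuracy means the future action is linearly
# decodable from the state, the basic evidence for a plan representation.
pred = (X[n_train:] @ W_probe).argmax(axis=1)
acc = (pred == y[n_train:]).mean()
print(f"probe accuracy at horizon k={k}: {acc:.2f}")
```

In practice one sweeps the horizon k and compares probe accuracy against a shuffled-label baseline; a causal claim additionally requires intervening on the decoded representation and observing the predicted change in behavior, which a read-only probe like this cannot establish.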