🤖 AI Summary
This work proposes an efficient model-free reinforcement learning algorithm that addresses the high computational cost and poor reproducibility of traditional search-based board-game methods such as AlphaZero. By eliminating the reliance on explicit environment models and large-scale tree search, the approach combines policy-gradient learning with value estimation and incorporates key training mechanisms to enhance sample efficiency. Evaluated across five distinct board games—Animal Shogi, Gardner Chess, Go, Hex, and Othello—the method consistently achieves substantially higher learning efficiency than existing search-based approaches. This study presents the first demonstration that a model-free paradigm can outperform search-based methods across multiple board-game domains, challenging the long-standing assumption that explicit search is indispensable for high-performance game playing.
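The summary above does not specify the algorithm's exact form, so as a rough illustration only, here is a minimal sketch of the general idea it names: a model-free policy-gradient update with a learned value estimate as a baseline, with no tree search. The toy two-action "game" with ±1 win/loss rewards, the learning rates, and the update rules below are all my assumptions for illustration, not the paper's method.

```python
import numpy as np

# Hedged sketch: model-free policy gradient + value estimation, no search.
# A 2-armed toy "game" stands in for a board position; rewards are +1 (win)
# or -1 (loss), mimicking terminal board-game outcomes. NOT the paper's
# actual algorithm -- the paper does not publish these details here.

rng = np.random.default_rng(0)
n_actions = 2
logits = np.zeros(n_actions)   # policy parameters (softmax over actions)
value = 0.0                    # learned value estimate for the single state
lr_pi, lr_v = 0.1, 0.1         # assumed learning rates

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

for step in range(2000):
    probs = softmax(logits)
    a = rng.choice(n_actions, p=probs)
    # Assumed toy dynamics: action 0 wins 80% of the time, action 1 only 20%.
    reward = 1.0 if rng.random() < (0.8 if a == 0 else 0.2) else -1.0

    advantage = reward - value            # value estimate acts as a baseline
    grad = -probs
    grad[a] += 1.0                        # grad of log softmax-policy prob
    logits += lr_pi * advantage * grad    # policy-gradient step
    value += lr_v * (reward - value)      # running value-estimate update

probs = softmax(logits)
```

After training, the policy concentrates on the stronger action and the value estimate approaches the policy's expected outcome, which is the core loop that search-based methods wrap inside expensive tree search and this line of work performs directly.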
📝 Abstract
Board games have long served as complex decision-making benchmarks in artificial intelligence. In this field, search-based reinforcement learning methods such as AlphaZero have achieved remarkable success; however, their substantial computational demands have been identified as a barrier to reproducibility. In this study, we propose a model-free reinforcement learning algorithm for board games that achieves more efficient learning. To validate its efficiency, we conducted comprehensive experiments on five board games: Animal Shogi, Gardner Chess, Go, Hex, and Othello. The results demonstrate that the proposed method learns more efficiently than existing methods across all of these environments. In addition, an extensive ablation study confirms the importance of the core techniques used in the proposed method. We believe our efficient algorithm demonstrates the potential of model-free reinforcement learning in domains traditionally dominated by search-based methods.