SpinGPT: A Large-Language-Model Approach to Playing Poker Correctly

📅 2025-09-26
📈 Citations: 0
Influential: 0
🤖 AI Summary
Traditional CFR-based algorithms face exponential growth in computational complexity as the number of players increases, and in games with three or more players, playing a Nash equilibrium no longer guarantees a non-losing outcome. These limitations make them impractical for popular tournament formats such as Spin & Go. To address this, the authors propose SpinGPT, the first large language model (LLM) tailored to three-player imperfect-information poker. SpinGPT is trained in two stages: (i) supervised fine-tuning on 320,000 high-stakes human expert decisions, followed by (ii) reinforcement learning on 270,000 solver-generated hands. Experiments show that SpinGPT matches the solver's action in 78% of decisions (tolerant accuracy) and, with a simple deep-stack heuristic, achieves 13.4 ± 12.9 BB/100 (95% CI) against Slumbot in heads-up play over 30,000 hands. These results suggest that LLMs offer a promising new approach to multi-player imperfect-information games.
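The 78% figure above is a "tolerant accuracy": a decision counts as matching the solver under some tolerance rule. The paper's exact rule is not given on this page; the sketch below assumes a match when the action kind agrees and, for raises, the predicted size is within a relative tolerance of the solver's size. All names and the tolerance rule are illustrative, not the authors' implementation.

```python
def tolerant_accuracy(predictions, solver_actions, size_tol=0.25):
    """Fraction of decisions matching the solver under a tolerance.

    Each action is a (kind, size) pair, with kind in {'fold', 'call',
    'raise'} and size in big blinds (0 for fold/call). Assumption: a
    match requires the same kind and, for raises, a size within
    `size_tol` (relative) of the solver's size.
    """
    hits = 0
    for (pred_kind, pred_size), (sol_kind, sol_size) in zip(predictions, solver_actions):
        if pred_kind != sol_kind:
            continue  # wrong action type: never a match
        if sol_kind == 'raise' and sol_size > 0 and abs(pred_size - sol_size) / sol_size > size_tol:
            continue  # raise size too far from the solver's
        hits += 1
    return hits / len(predictions)

# Toy example: 2 of 3 decisions fall within the tolerance band.
preds = [('fold', 0), ('raise', 10), ('raise', 20)]
sols = [('fold', 0), ('raise', 9), ('raise', 30)]
acc = tolerant_accuracy(preds, sols)
```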

📝 Abstract
The Counterfactual Regret Minimization (CFR) algorithm and its variants have enabled the development of pokerbots capable of beating the best human players in heads-up (1v1) cash games and competing with them in six-player formats. However, CFR's computational complexity rises exponentially with the number of players. Furthermore, in games with three or more players, following Nash equilibrium no longer guarantees a non-losing outcome. These limitations, along with others, significantly restrict the applicability of CFR to the most popular formats: tournaments. Motivated by the recent success of Large Language Models (LLMs) in chess and Diplomacy, we present SpinGPT, the first LLM tailored to Spin & Go, a popular three-player online poker format. SpinGPT is trained in two stages: (1) Supervised Fine-Tuning on 320k high-stakes expert decisions; (2) Reinforcement Learning on 270k solver-generated hands. Our results show that SpinGPT matches the solver's actions in 78% of decisions (tolerant accuracy). With a simple deep-stack heuristic, it achieves 13.4 ± 12.9 BB/100 versus Slumbot in heads-up over 30,000 hands (95% CI). These results suggest that LLMs could be a new way to deal with multi-player imperfect-information games like poker.
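The headline result, 13.4 ± 12.9 BB/100 over 30,000 hands, is a mean win rate in big blinds per 100 hands with a 95% confidence half-width. A minimal sketch of how such a figure is computed from per-hand outcomes, using a normal approximation (function name and toy data are illustrative, not from the paper):

```python
import math

def bb_per_100(results_bb):
    """Win rate in BB/100 with a 95% normal-approximation CI half-width.

    `results_bb` is a list of per-hand outcomes measured in big blinds.
    Returns (mean rate per 100 hands, 95% CI half-width per 100 hands).
    """
    n = len(results_bb)
    mean = sum(results_bb) / n
    # Sample variance (Bessel's correction), then standard error of the mean.
    var = sum((x - mean) ** 2 for x in results_bb) / (n - 1)
    half_width = 1.96 * math.sqrt(var / n)
    return 100 * mean, 100 * half_width

# Toy example over 5 hands; real estimates need tens of thousands of
# hands, which is why the paper plays 30,000 against Slumbot.
rate, ci = bb_per_100([1.5, -0.5, 0.0, 2.0, -1.0])
```

Note how wide the interval stays even at 30,000 hands (±12.9 BB/100 around 13.4): poker outcomes are high-variance, so large samples are needed before a win rate is statistically distinguishable from zero.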
Problem

Research questions and friction points this paper is trying to address.

Overcoming CFR's exponential complexity in multi-player poker games
Addressing Nash equilibrium limitations in three-player poker formats
Developing an LLM approach for multi-player imperfect-information games
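The CFR family referenced above is built on the regret-matching rule: each player tracks cumulative counterfactual regrets per action and plays positive regrets in proportion to their size. A minimal sketch of that core rule (not the paper's or any solver's implementation):

```python
def regret_matching(cum_regrets):
    """Map cumulative counterfactual regrets to a mixed strategy.

    Positive regrets are normalized into a probability distribution;
    if no regret is positive, fall back to the uniform strategy.
    """
    positives = [max(r, 0.0) for r in cum_regrets]
    total = sum(positives)
    if total > 0:
        return [p / total for p in positives]
    n = len(cum_regrets)
    return [1.0 / n] * n

# Example: actions with regrets [2, -1, 2] are played 50/0/50.
strategy = regret_matching([2.0, -1.0, 2.0])
```

CFR iterates this rule over every information set of every player, which is why its cost grows so quickly with the number of players: the game tree, and hence the set of regrets to track, explodes combinatorially.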
Innovation

Methods, ideas, or system contributions that make the work stand out.

First LLM tailored to the three-player Spin & Go poker format
Two-stage training: supervised fine-tuning on expert decisions
Reinforcement learning on solver-generated hands