An Efficient Algorithm for Thresholding Monte Carlo Tree Search

📅 2026-01-30
🤖 AI Summary
This work addresses the problem of deciding whether the root value of a tree in Monte Carlo tree search is at least a given threshold, where each internal node takes the MAX or MIN of its children's values and each leaf value is the mean of an unknown distribution. The authors propose a δ-correct sequential sampling algorithm built on the Track-and-Stop framework, with a ratio-based modification of the D-Tracking strategy for arm selection. The method retains asymptotically optimal sample complexity while substantially reducing the number of samples needed in practice. It also lowers the per-round computational cost from linear to logarithmic in the number of arms. Empirical evaluations show gains in both sample efficiency and computation time.

📝 Abstract
We introduce the Thresholding Monte Carlo Tree Search problem, in which, given a tree $\mathcal{T}$ and a threshold $\theta$, a player must answer whether the root node value of $\mathcal{T}$ is at least $\theta$ or not. In the given tree, each internal node is labeled `MAX' or `MIN', and the value of a `MAX'-labeled (`MIN'-labeled) internal node is the maximum (minimum) of its child values. The value of a leaf node is the mean reward of an unknown distribution, from which the player can sample rewards. For this problem, we develop a $\delta$-correct sequential sampling algorithm based on the Track-and-Stop strategy that has asymptotically optimal sample complexity. We show that a ratio-based modification of the D-Tracking arm-pulling strategy leads to a substantial improvement in empirical sample complexity and reduces the per-round computational cost from linear to logarithmic in the number of arms.
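
The problem setup in the abstract can be illustrated with a minimal Python sketch. It is not the paper's Track-and-Stop algorithm: it only evaluates a MAX/MIN tree and answers the threshold question from plug-in estimates obtained by sampling every leaf equally often, and it carries no δ-correctness guarantee. The tuple-based tree encoding, the unit-variance Gaussian rewards, and the names `tree_value` and `naive_threshold_answer` are assumptions made for illustration.

```python
import random

def tree_value(node, leaf_values):
    """Value of a MAX/MIN tree.

    A leaf is an int index into leaf_values; an internal node is a
    pair (op, children) with op equal to "MAX" or "MIN".
    (Hypothetical encoding, chosen only for this sketch.)
    """
    if isinstance(node, int):
        return leaf_values[node]
    op, children = node
    child_values = [tree_value(child, leaf_values) for child in children]
    return max(child_values) if op == "MAX" else min(child_values)

def naive_threshold_answer(tree, true_means, theta, pulls_per_leaf, seed=0):
    """Uniform-sampling baseline: estimate each leaf mean, then evaluate the tree.

    Assumption: rewards are Gaussian with unit variance around the true mean.
    true_means is used only to simulate rewards; the player never sees it.
    """
    rng = random.Random(seed)
    estimates = []
    for mu in true_means:
        rewards = [rng.gauss(mu, 1.0) for _ in range(pulls_per_leaf)]
        estimates.append(sum(rewards) / pulls_per_leaf)
    return tree_value(tree, estimates) >= theta

# Depth-2 example: a MAX root over two MIN nodes, four leaves (arms).
tree = ("MAX", [("MIN", [0, 1]), ("MIN", [2, 3])])
true_means = [0.2, 0.9, 0.6, 0.7]

# Root value = max(min(0.2, 0.9), min(0.6, 0.7)) = 0.6
true_root = tree_value(tree, true_means)
answer = naive_threshold_answer(tree, true_means, theta=0.5, pulls_per_leaf=10000)
```

Here the true root value is 0.6, so the correct answer for $\theta = 0.5$ is "at least $\theta$". The uniform baseline spends the same budget on every leaf, whereas an adaptive strategy of the kind described in the abstract chooses which arm to pull in each round.
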
Problem

Research questions and friction points this paper is trying to address.

Thresholding
Monte Carlo Tree Search
Sequential Sampling
Optimal Sample Complexity
Decision Threshold
Innovation

Methods, ideas, or system contributions that make the work stand out.

Thresholding Monte Carlo Tree Search
Track-and-Stop
D-Tracking
Sample Complexity
Sequential Sampling
Shoma Nameki
Graduate School of Information Science and Technology, Hokkaido University
Atsuyoshi Nakamura
Hokkaido University
Machine Learning, Data Mining, Computational Learning Theory
Junpei Komiyama
New York University / MBZUAI / RIKEN
Artificial Intelligence, Machine Learning
Koji Tabata
Research Institute for Electronic Science, Hokkaido University