Two-Fidelity Best-Action Identification for Stochastic Minimax Tree

📅 2026-06-01

🤖 AI Summary
This work addresses a fundamental challenge in stochastic minimax tree search: how to balance cheap but biased heuristic evaluations against expensive but accurate simulations when identifying the best action. The paper proposes 2FFS, an algorithm that brings a two-fidelity bandit mechanism into minimax tree search. By combining minimax-style fast expansion, MCTS-style stochastic sampling, and an adaptive choice between the two fidelities, 2FFS stops after finitely many samples under a fixed-confidence guarantee and comes with a polynomial-depth upper bound on computational cost. In numerical experiments, 2FFS uses substantially fewer samples and computational operations than existing BAI-MCTS baselines.
📝 Abstract
We study fixed-confidence best-action identification (BAI) in stochastic minimax trees. This problem is increasingly relevant in modern AI planning, where deep minimax search and Monte Carlo Tree Search (MCTS) with long language-model rollouts face a fundamental tradeoff: heuristic evaluations are cheap but biased, while accurate rollouts are reliable but prohibitively expensive. We propose 2FFS, a two-fidelity tree-search algorithm that brings ideas from flat multi-fidelity bandits into trees. The algorithm combines minimax-style fast expansion with MCTS-style stochastic sampling, adaptively deciding when to exploit cheap biased evaluations and when to invoke expensive accurate evaluations for local certification. We prove fixed-confidence correctness, establish finite stopping for exact identification, and give a polynomial-depth cost upper bound for general-depth trees. Across numerical stochastic-tree experiments, 2FFS uses substantially fewer samples and computational operations compared to existing BAI-MCTS baselines.
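The tradeoff described in the abstract can be illustrated with a minimal Python sketch. This is not the 2FFS algorithm, whose details are not given here. It only shows one generic way to combine a cheap evaluation whose bias is bounded by a known constant with an unbiased expensive evaluation into a single confidence interval, back intervals up through max and min nodes, and check whether a root action is certified. All names (`hoeffding_radius`, `leaf_interval`, `backup`, `certified_best`), the assumption of rewards in [0, 1], the known bias bound, and the use of Hoeffding intervals are illustrative assumptions. A real fixed-confidence guarantee would also need a union bound over nodes and time, which is omitted.

```python
import math
from typing import List, Optional, Tuple

Interval = Tuple[float, float]


def hoeffding_radius(n: int, delta: float) -> float:
    """Half-width of a two-sided Hoeffding interval for n samples in [0, 1]."""
    if n == 0:
        return math.inf
    return math.sqrt(math.log(2.0 / delta) / (2.0 * n))


def leaf_interval(cheap: List[float], accurate: List[float],
                  bias: float, delta: float) -> Interval:
    """Interval for a leaf's true value from both fidelities (assumed model).

    Cheap samples are assumed to have a mean within `bias` of the true value,
    so their interval is widened by `bias`. Accurate samples are assumed
    unbiased. Each fidelity gets delta / 2, and the two intervals are
    intersected. The result can be empty (lower > upper) on the failure event.
    """
    lower, upper = 0.0, 1.0
    if cheap:
        mean = sum(cheap) / len(cheap)
        radius = hoeffding_radius(len(cheap), delta / 2.0) + bias
        lower, upper = max(lower, mean - radius), min(upper, mean + radius)
    if accurate:
        mean = sum(accurate) / len(accurate)
        radius = hoeffding_radius(len(accurate), delta / 2.0)
        lower, upper = max(lower, mean - radius), min(upper, mean + radius)
    return lower, upper


def backup(children: List[Interval], is_max: bool) -> Interval:
    """Propagate child intervals to a max node or a min node."""
    agg = max if is_max else min
    return agg(lo for lo, _ in children), agg(hi for _, hi in children)


def certified_best(intervals: List[Interval]) -> Optional[int]:
    """Return the root action whose lower bound beats every other upper bound."""
    if not intervals:
        return None
    best = max(range(len(intervals)), key=lambda i: intervals[i][0])
    for j, (_, hi) in enumerate(intervals):
        if j != best and intervals[best][0] <= hi:
            return None  # intervals still overlap, so keep sampling
    return best
```

In this sketch the cheap interval can never shrink below a half-width of `bias`, so two actions whose values differ by less than the bias can only be separated by accurate samples. That is the situation in which an adaptive rule would switch to the expensive fidelity.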
Problem

Research questions and friction points this paper is trying to address.

best-action identification
stochastic minimax tree
two-fidelity evaluation
fixed-confidence
Monte Carlo Tree Search
Innovation

Methods, ideas, or system contributions that make the work stand out.

two-fidelity
best-action identification
stochastic minimax tree
multi-fidelity bandits
adaptive evaluation