Fast EXP3 Algorithms

📅 2025-12-11
🤖 AI Summary
Standard EXP3 for adversarial multi-armed bandits takes $O(K)$ time per round, where $K$ is the number of arms. This paper proposes the first implementation of EXP3 that runs in $O(1)$ time per round. The approach combines three techniques: (i) an improved randomized weighted sampling scheme; (ii) a lazy probability update mechanism; and (iii) a binary tree structure for maintaining the cumulative distribution function. These are supported by an amortized analysis and a refined derivation of the regret upper bound. The resulting algorithm achieves the $O(\sqrt{KT \log K})$ regret bound of standard EXP3, where $T$ is the number of rounds, while reducing the per-round time complexity to constant. Empirical evaluation demonstrates a 10–100× speedup over standard EXP3. The paper also formally characterizes the regret–time trade-off frontier and designs a family of efficient variants that balance theoretical guarantees against practical runtime.
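For context, the $O(K)$ per-round cost being targeted can be seen in a textbook loss-based EXP3 loop. The sketch below is a baseline illustration only, not the paper's algorithm. The function name `exp3`, the `loss_fn(t, arm)` callback, and the learning rate $\eta = \sqrt{2 \ln K / (KT)}$ are assumptions chosen for the example (the learning rate is one common textbook choice).

```python
import math
import random


def exp3(K, T, loss_fn, seed=0):
    """Textbook loss-based EXP3 (baseline sketch, O(K) work per round).

    loss_fn(t, arm) is a hypothetical callback returning a loss in [0, 1].
    Returns the total incurred loss and the importance-weighted loss estimates.
    """
    rng = random.Random(seed)
    eta = math.sqrt(2.0 * math.log(K) / (K * T))  # assumed learning rate
    cum_loss_est = [0.0] * K
    total_loss = 0.0
    for t in range(T):
        # O(K): rebuild the sampling distribution from the loss estimates.
        # Subtracting the minimum keeps the exponentials numerically stable.
        m = min(cum_loss_est)
        weights = [math.exp(-eta * (est - m)) for est in cum_loss_est]
        z = sum(weights)
        probs = [w / z for w in weights]
        # O(K): draw one arm from the distribution.
        arm = rng.choices(range(K), weights=probs)[0]
        loss = loss_fn(t, arm)
        total_loss += loss
        # O(1): importance-weighted update touches only the pulled arm.
        cum_loss_est[arm] += loss / probs[arm]
    return total_loss, cum_loss_est
```

Only one entry of `cum_loss_est` changes per round, yet the distribution is recomputed and sampled from scratch each time. That gap between an $O(1)$ update and $O(K)$ sampling is what faster implementations aim to close.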

📝 Abstract
We point out that EXP3 can be implemented in constant time per round, propose more practical algorithms, and analyze the trade-offs between the regret bounds and time complexities of these algorithms.
Problem

Research questions and friction points this paper is trying to address.

Implement EXP3 in constant time per round
Propose more practical algorithms for EXP3
Analyze regret bounds versus time complexities trade-offs
Innovation

Methods, ideas, or system contributions that make the work stand out.

EXP3 implemented in constant time per round
More practical algorithms proposed
Trade-offs between regret bounds and time complexities analyzed
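To illustrate how a binary tree can speed up weighted sampling, here is a minimal sketch of a generic sum tree: leaves hold arm weights and each internal node holds the sum of its children. This is a standard data structure, not the paper's construction, and it only reaches $O(\log K)$ per update and per sample rather than constant time. The class name `SumTree` and its methods are hypothetical names introduced for this example.

```python
import random


class SumTree:
    """Binary tree of partial sums over K non-negative weights (illustrative sketch)."""

    def __init__(self, weights):
        self.k = len(weights)
        self.size = 1
        while self.size < self.k:
            self.size *= 2  # pad the leaf level to a power of two
        self.tree = [0.0] * (2 * self.size)
        for i, w in enumerate(weights):
            self.tree[self.size + i] = float(w)
        for node in range(self.size - 1, 0, -1):
            self.tree[node] = self.tree[2 * node] + self.tree[2 * node + 1]

    def total(self):
        return self.tree[1]

    def update(self, i, w):
        """Set weight i to w and refresh the sums on the path to the root: O(log K)."""
        node = self.size + i
        self.tree[node] = float(w)
        node //= 2
        while node >= 1:
            self.tree[node] = self.tree[2 * node] + self.tree[2 * node + 1]
            node //= 2

    def find(self, u):
        """Return the index i whose cumulative-weight interval contains u: O(log K)."""
        node = 1
        while node < self.size:
            left = 2 * node
            if u < self.tree[left]:
                node = left
            else:
                u -= self.tree[left]
                node = left + 1
        # Clamp in case floating-point rounding lands on a zero-weight padding leaf.
        return min(node - self.size, self.k - 1)

    def sample(self, rng):
        """Draw an index with probability proportional to its weight."""
        return self.find(rng.random() * self.total())
```

A possible use is `tree = SumTree([1, 2, 3, 4])` followed by `tree.sample(random.Random(0))`. After an arm's weight changes, a single `tree.update(i, w)` call keeps the tree consistent without touching the other arms.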
Ryoma Sato
National Institute of Informatics
Shinji Ito
The University of Tokyo