Fast EXP3 Algorithms

📅 2025-12-11

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address the prohibitively high per-round $O(K)$ time complexity of the EXP3 algorithm in adversarial multi-armed bandits, this paper proposes the first $O(1)$-time-per-round implementation of EXP3. Our approach introduces three key techniques: (i) an improved randomized weighted sampling scheme; (ii) a lazy probability update mechanism; and (iii) a binary tree structure for maintaining the cumulative distribution function. These are combined with amortized analysis and a refined regret upper-bound derivation. The resulting algorithm achieves optimal asymptotic regret $O(\sqrt{KT \log K})$ while reducing per-round time complexity to constant. Empirical evaluation demonstrates 10–100× speedup over standard EXP3. Furthermore, we formally characterize the fundamental regret–time trade-off frontier and design a family of efficient variants that achieve superior balance between theoretical guarantees and practical runtime efficiency.
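To make the tree-based idea concrete, here is a minimal sketch of weighted sampling over a binary indexed (Fenwick) tree of cumulative weights. It is an illustration only, not the paper's construction: the class name `FenwickSampler` and its methods are hypothetical, and this structure costs $O(\log K)$ per update and per draw rather than the $O(1)$ the summary reports.

```python
class FenwickSampler:
    """Hypothetical helper: weighted sampling via a Fenwick tree of prefix sums.

    Assumption: this is a generic O(log K) structure, not the paper's O(1) scheme.
    """

    def __init__(self, weights):
        self.n = len(weights)
        self.tree = [0.0] * (self.n + 1)   # 1-indexed Fenwick array
        self.weights = [0.0] * self.n      # current weight of each arm
        for i, w in enumerate(weights):
            self.update(i, w)

    def update(self, i, w):
        """Set the weight of arm i to w in O(log K)."""
        delta = w - self.weights[i]
        self.weights[i] = w
        j = i + 1
        while j <= self.n:
            self.tree[j] += delta
            j += j & -j                    # move to the next node covering i

    def total(self):
        """Sum of all weights in O(log K)."""
        s, j = 0.0, self.n
        while j > 0:
            s += self.tree[j]
            j -= j & -j
        return s

    def sample(self, u):
        """Map u in [0, 1) to an arm with probability proportional to its weight."""
        target = u * self.total()
        pos = 0
        step = 1 << (self.n.bit_length() - 1)  # largest power of two <= n
        while step > 0:
            nxt = pos + step
            # Descend right only if the whole left block fits under the target.
            if nxt <= self.n and self.tree[nxt] <= target:
                pos = nxt
                target -= self.tree[nxt]
            step >>= 1
        return min(pos, self.n - 1)        # guard against rounding at the top end
```

In use, one would draw `u = random.random()` each round, call `sample(u)` to pick an arm, and call `update` only for the arm whose weight changed, so that no full $O(K)$ pass over the arms is needed.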

📝 Abstract

We point out that EXP3 can be implemented in constant time per round, propose more practical algorithms, and analyze the trade-offs between the regret bounds and time complexities of these algorithms.
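For context, the baseline being sped up can be sketched as follows. This is a textbook-style, loss-based EXP3 loop, not code from the paper; the function name `exp3`, the `loss_fn` callback, and the parameter names are assumptions made for illustration. It shows where the $O(K)$ per-round cost comes from: the distribution is rebuilt and sampled from scratch every round.

```python
import math
import random


def exp3(K, T, loss_fn, eta, rng=None):
    """Standard EXP3 with importance-weighted loss estimates (illustrative sketch).

    loss_fn(t, arm) is a hypothetical callback returning a loss in [0, 1].
    """
    rng = rng or random.Random(0)
    cum_loss_est = [0.0] * K   # estimated cumulative loss of each arm
    total_loss = 0.0
    for t in range(T):
        # O(K): recompute every arm's probability from the estimates.
        # Subtracting the minimum keeps the exponentials in (0, 1].
        m = min(cum_loss_est)
        weights = [math.exp(-eta * (L - m)) for L in cum_loss_est]
        z = sum(weights)
        probs = [w / z for w in weights]
        # O(K): draw an arm from the full distribution.
        arm = rng.choices(range(K), weights=probs)[0]
        loss = loss_fn(t, arm)
        total_loss += loss
        # Only the played arm's estimate changes, yet the next round
        # still pays O(K) to rebuild the distribution.
        cum_loss_est[arm] += loss / probs[arm]
    return total_loss, cum_loss_est
```

A common tuning is a learning rate `eta` on the order of $\sqrt{\log K / (KT)}$. Because only one estimate changes per round, a faster implementation has room to avoid the full recomputation.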

Problem

Research questions and friction points this paper is trying to address.

Implement EXP3 in constant time per round

Propose more practical algorithms for EXP3

Analyze the trade-offs between regret bounds and time complexity

Innovation

Methods, ideas, or system contributions that make the work stand out.

EXP3 implemented in constant time per round

More practical algorithms proposed

Trade-offs between regret bounds and time complexities analyzed
