A Tight Lower Bound for Non-stochastic Multi-armed Bandits with Expert Advice

📅 2025-10-31
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work establishes the minimax expected regret for non-stochastic multi-armed bandits with expert advice. For the setting of $K$ arms, $N$ experts, and $T$ rounds, the authors construct adversarial instances and employ information-theoretic and combinatorial arguments to prove a tight lower bound of $\Omega\big(\sqrt{TK\log(N/K)}\big)$, matching the best known upper bound. The analysis fully characterizes the optimal regret rate for this model, resolving a long-standing open problem and closing the gap between upper and lower bounds for adversarial bandits with expert advice. The proof technique, combining adversarial construction with refined information-theoretic reasoning, adds to the methodological toolkit for deriving sharp minimax bounds in sequential decision-making under uncertainty.

📝 Abstract
We determine the minimax optimal expected regret in the classic non-stochastic multi-armed bandit with expert advice problem, by proving a lower bound that matches the upper bound of Kale (2014). The two bounds determine the minimax optimal expected regret to be $\Theta\left(\sqrt{TK\log(N/K)}\right)$, where $K$ is the number of arms, $N$ is the number of experts, and $T$ is the time horizon.
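The minimax rate above can be evaluated numerically to see how regret scales with the problem parameters. A minimal sketch (the function name and parameter values are illustrative, not from the paper):

```python
import math

def minimax_regret_rate(T, K, N):
    """Evaluate sqrt(T * K * log(N / K)), the minimax regret rate up to constants.

    Requires N > K so that log(N / K) > 0; the paper's bound concerns
    this regime of more experts than arms.
    """
    if N <= K:
        raise ValueError("rate requires more experts than arms (N > K)")
    return math.sqrt(T * K * math.log(N / K))

# Illustrative values: T = 10_000 rounds, K = 10 arms, N = 1_000 experts.
rate = minimax_regret_rate(10_000, 10, 1_000)
```

Note the mild dependence on the number of experts: doubling $N$ only changes the rate through a logarithm, whereas regret grows as the square root of both $T$ and $K$.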
Problem

Research questions and friction points this paper is trying to address.

Determining minimax optimal regret for non-stochastic bandits
Proving tight lower bound matching existing upper bound
Establishing regret scaling with arms, experts and time
Innovation

Methods, ideas, or system contributions that make the work stand out.

Proved tight minimax regret lower bound
Matched existing upper bound for bandits
Determined optimal regret with expert advice