Revisiting Follow-the-Perturbed-Leader with Unbounded Perturbations in Bandit Problems

📅 2025-08-25

🤖 AI Summary

This paper investigates Follow-the-Perturbed-Leader (FTPL) policies for bandit problems in both stochastic and adversarial environments. First, revisiting the classical FTRL-FTPL duality for unbounded perturbations, we establish Best-of-Both-Worlds (BOBW) guarantees for FTPL under a broad family of *asymmetric* unbounded Fréchet-type perturbations, including *hybrid* perturbations that combine Gumbel-type and Fréchet-type tails. Second, we investigate the connection between the $1/2$-Tsallis entropy and Fréchet-type perturbations: numerical observations suggest that this regularizer corresponds to a *symmetric* Fréchet-type perturbation, and on this basis we establish the first BOBW guarantee for symmetric unbounded perturbations in the two-armed setting. Third, for general multi-armed bandits, we exhibit an instance in which symmetric Fréchet-type perturbations violate the key condition of the standard BOBW analysis, a problem not observed with asymmetric or nonnegative Fréchet-type perturbations. This example does not rule out alternative analyses, but it indicates that the relationship observed in the two-armed case does not carry over directly to the general case.

📝 Abstract

Follow-the-Regularized-Leader (FTRL) policies have achieved Best-of-Both-Worlds (BOBW) results in various settings through hybrid regularizers, whereas analogous results for Follow-the-Perturbed-Leader (FTPL) remain limited due to inherent analytical challenges. To advance the analytical foundations of FTPL, we revisit classical FTRL-FTPL duality for unbounded perturbations and establish BOBW results for FTPL under a broad family of asymmetric unbounded Fréchet-type perturbations, including hybrid perturbations combining Gumbel-type and Fréchet-type tails. These results not only extend the BOBW results of FTPL but also offer new insights into designing alternative FTPL policies competitive with hybrid regularization approaches. Motivated by earlier observations in two-armed bandits, we further investigate the connection between the $1/2$-Tsallis entropy and a Fréchet-type perturbation. Our numerical observations suggest that it corresponds to a symmetric Fréchet-type perturbation, and based on this, we establish the first BOBW guarantee for symmetric unbounded perturbations in the two-armed setting. In contrast, in general multi-armed bandits, we find an instance in which symmetric Fréchet-type perturbations violate the key condition for standard BOBW analysis, which is a problem not observed with asymmetric or nonnegative Fréchet-type perturbations. Although this example does not rule out alternative analyses achieving BOBW results, it suggests the limitations of directly applying the relationship observed in two-armed cases to the general case and thus emphasizes the need for further investigation to fully understand the behavior of FTPL in broader settings.
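To make the policy discussed in the abstract concrete, here is a minimal Python sketch of FTPL with Fréchet perturbations on a Bernoulli-loss bandit. Several choices are assumptions for illustration and are not taken from the abstract: the shape parameter `shape=2.0`, the learning rate `eta = 1/sqrt(t)`, the use of geometric resampling to form importance-weighted loss estimates, the cap `max_draws`, and all function names. The perturbation here is nonnegative (one-sided); the symmetric and hybrid variants studied in the paper are not reproduced.

```python
import numpy as np

def sample_frechet(rng, shape, size):
    # Inverse-CDF sampling from F(x) = exp(-x**(-shape)), x > 0.
    # Clipping avoids log(0) and division by zero at the endpoints.
    u = np.clip(rng.random(size), 1e-12, 1.0 - 1e-12)
    return (-np.log(u)) ** (-1.0 / shape)

def ftpl_select(rng, cum_loss_est, eta, shape=2.0):
    # Pick the arm minimizing estimated cumulative loss minus scaled perturbation.
    r = sample_frechet(rng, shape, cum_loss_est.shape)
    return int(np.argmin(cum_loss_est - r / eta))

def geometric_resampling(rng, cum_loss_est, eta, arm, shape=2.0, max_draws=10_000):
    # Count fresh perturbation draws until `arm` is selected again.
    # The count estimates 1 / P(arm is selected); the cap (an assumption)
    # keeps the loop finite at the cost of a small bias.
    for m in range(1, max_draws + 1):
        if ftpl_select(rng, cum_loss_est, eta, shape) == arm:
            return m
    return max_draws

def run_ftpl(loss_means, horizon, seed=0, shape=2.0):
    rng = np.random.default_rng(seed)
    k = len(loss_means)
    cum_loss_est = np.zeros(k)
    pulls = np.zeros(k, dtype=int)
    for t in range(1, horizon + 1):
        eta = 1.0 / np.sqrt(t)  # assumed learning-rate schedule
        arm = ftpl_select(rng, cum_loss_est, eta, shape)
        loss = float(rng.random() < loss_means[arm])  # Bernoulli loss in {0, 1}
        m = geometric_resampling(rng, cum_loss_est, eta, arm, shape)
        cum_loss_est[arm] += loss * m  # importance-weighted estimate
        pulls[arm] += 1
    return pulls

if __name__ == "__main__":
    print(run_ftpl([0.1, 0.9], horizon=300))
```

Calling `run_ftpl([0.1, 0.9], horizon=300)` returns the number of pulls per arm; in a stochastic instance like this one, the arm with the smaller mean loss should receive the large majority of pulls.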

Problem

Research questions and friction points this paper is trying to address.

Extends BOBW results for FTPL with unbounded asymmetric perturbations.

Investigates link between Tsallis entropy and symmetric Fréchet-type perturbations.

Identifies limitations of symmetric perturbations in multi-armed bandit analysis.

Innovation

Methods, ideas, or system contributions that make the work stand out.

Using asymmetric unbounded Fréchet-type perturbations for FTPL

Linking Tsallis entropy to symmetric Fréchet-type perturbations

Establishing Best-of-Both-Worlds guarantees for FTPL in bandits
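The FTRL-FTPL duality that the paper builds on has a well-known closed-form special case: with Gumbel perturbations, FTPL selects arms with exactly the softmax (exponential-weights) probabilities, which are the FTRL solution under the Shannon-entropy regularizer. The Python sketch below checks this numerically. It illustrates that classical case only, not the Fréchet or Tsallis-entropy results of the paper, and the function names and example scores are hypothetical.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax: FTRL with Shannon entropy on scores x.
    z = np.exp(x - np.max(x))
    return z / z.sum()

def ftpl_gumbel_frequencies(scores, n_samples, seed=0):
    # Empirical arm-selection frequencies of FTPL with standard Gumbel noise.
    rng = np.random.default_rng(seed)
    g = rng.gumbel(size=(n_samples, len(scores)))
    winners = np.argmax(scores + g, axis=1)
    return np.bincount(winners, minlength=len(scores)) / n_samples

if __name__ == "__main__":
    eta = 0.5                                 # assumed learning rate
    cum_loss = np.array([3.0, 1.0, 2.0])      # assumed cumulative losses
    scores = -eta * cum_loss                  # lower loss means higher score
    print(ftpl_gumbel_frequencies(scores, 200_000))
    print(softmax(scores))
```

The two printed vectors should agree up to Monte Carlo error. For Fréchet-type perturbations no comparably simple closed form is available in general, which is part of what makes the FTPL analysis harder.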
