Best Group Identification in Multi-Objective Bandits

📅 2025-05-23

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper studies best group identification in multi-objective multi-armed bandits. Arms have vector-valued rewards and are organized into groups; each group's performance is summarized by its efficiency vector, i.e., the best rewards the group can attain in each dimension. The goal is to identify the set of optimal groups in the fixed-confidence setting. The paper formalizes this pure-exploration problem under two formulations: group Pareto set identification, where the optimal groups are those with Pareto-optimal efficiency vectors, and linear best group identification, where known weights on the reward dimensions define the optimal group as the one maximizing the weighted sum of its efficiency vector's entries. For both formulations it proposes elimination-based algorithms, establishes upper bounds on their sample complexity, and derives lower bounds that hold for any correct algorithm. Numerical experiments show strong empirical performance of the proposed algorithms.

📝 Abstract

We introduce the Best Group Identification problem in a multi-objective multi-armed bandit setting, where an agent interacts with groups of arms with vector-valued rewards. The performance of a group is determined by an efficiency vector, which represents the group's best attainable rewards across different dimensions. The objective is to identify the set of optimal groups in the fixed-confidence setting. We investigate two key formulations: group Pareto set identification, where efficiency vectors of optimal groups are Pareto optimal, and linear best group identification, where each reward dimension has a known weight and the optimal group maximizes the weighted sum of its efficiency vector's entries. For both settings, we propose elimination-based algorithms, establish upper bounds on their sample complexity, and derive lower bounds that apply to any correct algorithm. Through numerical experiments, we demonstrate the strong empirical performance of the proposed algorithms.
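
To make the two formulations concrete, here is a minimal sketch that computes both targets when the true mean rewards are known. It assumes the efficiency vector is the coordinate-wise maximum of the mean reward vectors of a group's arms, which is one reading of "best attainable rewards across different dimensions"; the abstract does not spell out the definition. The function names and the toy groups are hypothetical.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]


def efficiency_vector(arm_means: List[Vector]) -> List[float]:
    """Coordinate-wise maximum over a group's arm means (assumed definition)."""
    return [max(column) for column in zip(*arm_means)]


def dominates(u: Vector, v: Vector) -> bool:
    """True if u is at least as good as v in every dimension and strictly better in one."""
    return all(a >= b for a, b in zip(u, v)) and any(a > b for a, b in zip(u, v))


def group_pareto_set(groups: Dict[str, List[Vector]]) -> List[str]:
    """Groups whose efficiency vector is not dominated by any other group's."""
    eff = {name: efficiency_vector(arms) for name, arms in groups.items()}
    return [g for g in eff
            if not any(dominates(eff[h], eff[g]) for h in eff if h != g)]


def linear_best_group(groups: Dict[str, List[Vector]], weights: Vector) -> str:
    """Group maximizing the weighted sum of its efficiency vector's entries."""
    eff = {name: efficiency_vector(arms) for name, arms in groups.items()}
    return max(eff, key=lambda g: sum(w * x for w, x in zip(weights, eff[g])))


# Hypothetical toy instance: three groups, two arms each, two reward dimensions.
groups = {
    "A": [(0.9, 0.1), (0.2, 0.6)],   # efficiency vector (0.9, 0.6)
    "B": [(0.5, 0.5), (0.4, 0.8)],   # efficiency vector (0.5, 0.8)
    "C": [(0.3, 0.4), (0.45, 0.2)],  # efficiency vector (0.45, 0.4), dominated by A
}

print(group_pareto_set(groups))               # ['A', 'B']
print(linear_best_group(groups, (0.5, 0.5)))  # A
print(linear_best_group(groups, (0.1, 0.9)))  # B
```

In the bandit problem the means are unknown, so these quantities must be estimated from samples; the sketch only shows what the two objectives select.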

Problem

Research questions and friction points this paper is trying to address.

Identify optimal groups in multi-objective bandits with vector rewards

Compare group performance using Pareto optimality or weighted sum criteria

Develop algorithms with proven sample complexity bounds for group identification

Innovation

Methods, ideas, or system contributions that make the work stand out.

Elimination-based algorithms for group identification

Multi-objective multi-armed bandit setting

Fixed-confidence identification of groups with Pareto-optimal efficiency vectors
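
As an illustration of what an elimination-based, fixed-confidence procedure can look like in the linear setting, the sketch below uses generic successive elimination with Hoeffding confidence intervals. It is not the paper's algorithm. It assumes Bernoulli rewards in each dimension, non-negative weights, and an efficiency vector equal to the coordinate-wise maximum over a group's arms; the function name, parameters, and toy means are hypothetical.

```python
import math
import random
from typing import Dict, List, Sequence


def eliminate_linear(groups: Dict[str, List[Sequence[float]]],
                     weights: Sequence[float],
                     delta: float = 0.05,
                     max_rounds: int = 20000,
                     seed: int = 0) -> List[str]:
    """Generic successive elimination over groups (illustrative, not the paper's method)."""
    rng = random.Random(seed)
    dims = len(weights)
    n_arms = sum(len(arms) for arms in groups.values())
    # Running reward sums per (group, arm, dimension).
    sums = {g: [[0.0] * dims for _ in arms] for g, arms in groups.items()}
    active = list(groups)

    for t in range(1, max_rounds + 1):
        # Pull every arm of every surviving group once (simulated Bernoulli rewards).
        for g in active:
            for a, means in enumerate(groups[g]):
                for d in range(dims):
                    sums[g][a][d] += 1.0 if rng.random() < means[d] else 0.0

        # Hoeffding radius with a union bound over arms, dimensions, and rounds.
        rad = math.sqrt(math.log(4 * n_arms * dims * t * t / delta) / (2 * t))

        def bound(g: str, sign: int) -> float:
            # Confidence bound on the weighted sum of the group's efficiency vector.
            return sum(w * max(s[d] / t + sign * rad for s in sums[g])
                       for d, w in enumerate(weights))

        # Drop groups whose upper bound falls below the best lower bound.
        best_lower = max(bound(g, -1) for g in active)
        active = [g for g in active if bound(g, +1) >= best_lower]
        if len(active) == 1:
            break
    return active


# Hypothetical true means; the weighted scores are 0.75 (A), 0.425 (B), 0.25 (C).
true_means = {
    "A": [(0.9, 0.1), (0.2, 0.6)],
    "B": [(0.3, 0.4), (0.45, 0.2)],
    "C": [(0.1, 0.3), (0.2, 0.1)],
}
survivors = eliminate_linear(true_means, weights=(0.5, 0.5))
print(survivors)
```

Sampling stops for a group as soon as it is eliminated, which is where elimination-based methods save samples. If `max_rounds` is reached first, the function returns every group that is still active.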
