AI Summary
This paper addresses the online expert problem by proposing the first instance-adaptive φ-regret minimization algorithm, unifying the optimization of external, internal, and swap regret. Methodologically, it achieves the first smooth interpolation among these three regret notions; introduces a Haar-wavelet-inspired matrix feature mapping; integrates comparator-adaptive online linear regression with ℓ₁ constraints; exploits the sparse structure induced by action-modification rules; and employs randomized matrix sketching for computational efficiency. Theoretically, it establishes a tight instance-adaptive regret bound of Õ(√(min{d − d^unif_φ + 1, d − d^self_φ}) · √T), where d^unif_φ and d^self_φ characterize problem-specific structural parameters. Moreover, it recovers the optimal bounds for external, internal, swap, and quantile regret, and strictly improves upon existing results in the intermediate regimes.
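For concreteness, the φ-regret notion discussed above can be sketched as follows (a standard textbook definition, not quoted from the paper; here $a_t$ denotes the learner's action and $\ell_t$ the loss function at round $t$):

```latex
% phi-regret: regret against an arbitrary action-modification rule phi
\mathrm{Reg}_\phi(T) \;=\; \sum_{t=1}^{T} \ell_t(a_t) \;-\; \sum_{t=1}^{T} \ell_t\bigl(\phi(a_t)\bigr)
% Special cases: phi a constant map (external regret), phi deviating on a
% single expert and acting as the identity elsewhere (internal regret),
% and the worst case over all maps phi : [d] -> [d] (swap regret).
```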
Abstract
Focusing on the expert problem in online learning, this paper studies the interpolation of several performance metrics via $\phi$-regret minimization, which measures the performance of an algorithm by its regret with respect to an arbitrary action modification rule $\phi$. With $d$ experts and $T\gg d$ rounds in total, we present a single algorithm achieving the instance-adaptive $\phi$-regret bound
\begin{equation*}
\tilde O\left(\min\left\{\sqrt{d-d^{\mathrm{unif}}_\phi+1},\sqrt{d-d^{\mathrm{self}}_\phi}\right\}\cdot\sqrt{T}\right),
\end{equation*}
where $d^{\mathrm{unif}}_\phi$ is the maximum number of experts modified identically by $\phi$, and $d^{\mathrm{self}}_\phi$ is the number of experts that $\phi$ trivially modifies to themselves. By recovering the optimal $O(\sqrt{T\log d})$ external regret bound when $d^{\mathrm{unif}}_\phi=d$, the standard $\tilde O(\sqrt{T})$ internal regret bound when $d^{\mathrm{self}}_\phi=d-1$, and the optimal $\tilde O(\sqrt{dT})$ swap regret bound in the worst case, we improve existing results in the intermediate regimes. In addition, the same algorithm achieves the optimal quantile regret bound, which corresponds to even easier settings of $\phi$ than the external regret. Building on the classical reduction from $\phi$-regret minimization to external regret minimization on stochastic matrices, our main idea is to further convert the latter to online linear regression using Haar-wavelet-inspired matrix features. Then, we apply a particular $L_1$-version of comparator-adaptive online learning algorithms to exploit the sparsity in this regression subroutine.