Regret Minimization for Piecewise Linear Rewards: Contracts, Auctions, and Beyond

📅 2025-03-03
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This paper studies online learning and regret minimization for unknown stochastic piecewise-linear reward functions in microeconomic settings, including hidden-action principal-agent contract design, posted-price auction pricing, and first-price auction bidding. In a setting where the function's parameters are unknown and drawn from unknown distributions, it proposes a unified learning framework. Leveraging monotonicity, the framework integrates online convex optimization, piecewise-structure-aware adaptive exploration-exploitation, gradient estimation, and interval-shrinking techniques to achieve an \(\tilde{O}(\sqrt{nT})\) regret bound, which is tight when \(n \leq T^{1/3}\). In that regime this improves upon the prior \(\tilde{O}(T^{2/3})\) bound and resolves two open problems: (i) improving the regret bound for learning linear contracts when the number of agent actions is small relative to \(T\), and (ii) attaining instance-independent regret bounds for posted-price auctions. The framework thus yields improved guarantees for these two canonical microeconomic learning problems.

📝 Abstract
Most microeconomic models of interest involve optimizing a piecewise linear function. These include contract design in hidden-action principal-agent problems, selling an item in posted-price auctions, and bidding in first-price auctions. When the relevant model parameters are unknown and determined by some (unknown) probability distributions, the problem becomes learning how to optimize an unknown and stochastic piecewise linear reward function. Such a problem is usually framed within an online learning framework, where the decision-maker (learner) seeks to minimize the regret of not knowing an optimal decision in hindsight. This paper introduces a general online learning framework that offers a unified approach to tackle regret minimization for piecewise linear rewards, under a suitable monotonicity assumption commonly satisfied by microeconomic models. We design a learning algorithm that attains a regret of $\widetilde{O}(\sqrt{nT})$, where $n$ is the number of ``pieces'' of the reward function and $T$ is the number of rounds. This result is tight when $n$ is \emph{small} relative to $T$, specifically when $n \leq T^{1/3}$. Our algorithm solves two open problems in the literature on learning in microeconomic settings. First, it shows that the $\widetilde{O}(T^{2/3})$ regret bound obtained by Zhu et al. [Zhu+23] for learning optimal linear contracts in hidden-action principal-agent problems is not tight when the number of agent's actions is small relative to $T$. Second, our algorithm demonstrates that, in the problem of learning to set prices in posted-price auctions, it is possible to attain suitable (and desirable) instance-independent regret bounds, addressing an open problem posed by Cesa-Bianchi et al. [CBCP19].
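To make the notion of a piecewise linear reward concrete, here is a minimal Python sketch of a toy posted-price auction. It is an illustration only and not the paper's algorithm: the discrete valuation distribution (`values`, `probs`) and the helper names `expected_revenue` and `pseudo_regret` are assumptions introduced for this example. With $n$ valuation levels, the expected revenue is linear in the price on each of $n$ intervals, and regret compares the posted prices against the best fixed price.

```python
def expected_revenue(price, values, probs):
    """Expected revenue of posting `price` in a toy model (assumed here):
    the buyer's valuation is values[i] with probability probs[i], and the
    item sells whenever the valuation is at least the posted price."""
    sale_prob = sum(q for v, q in zip(values, probs) if v >= price)
    return price * sale_prob


# Hypothetical instance with n = 3 valuation levels, hence 3 linear pieces:
# on (0, 0.2] the slope is 1.0, on (0.2, 0.5] it is 0.7, on (0.5, 0.9] it is 0.3.
values = [0.2, 0.5, 0.9]
probs = [0.3, 0.4, 0.3]

# Revenue is non-decreasing inside each piece, so in this toy model an
# optimal price is always one of the valuation levels.
best_price = max(values, key=lambda p: expected_revenue(p, values, probs))
best_revenue = expected_revenue(best_price, values, probs)


def pseudo_regret(prices):
    """Cumulative expected loss of the posted prices relative to the best
    fixed price (known here only because the toy distribution is given)."""
    return sum(best_revenue - expected_revenue(p, values, probs) for p in prices)


if __name__ == "__main__":
    print("best fixed price:", best_price)
    print("pseudo-regret of always posting 0.9 for 100 rounds:",
          pseudo_regret([0.9] * 100))
```

In the learning problem described in the abstract, the distribution is unknown, so the learner cannot compute `best_price` directly and has to trade off exploring prices against exploiting the ones that look good so far.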
Problem

Research questions and friction points this paper is trying to address.

Optimizing unknown stochastic piecewise linear reward functions.
Minimizing regret in online learning for microeconomic models.
Achieving tight regret bounds when the number of reward pieces is small.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Online learning framework for regret minimization
Algorithm achieves a regret bound of $\widetilde{O}(\sqrt{nT})$, tight when $n \leq T^{1/3}$
Solves open problems in microeconomic learning settings
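
As a quick sanity check on how the two bounds compare, the following short derivation ignores the logarithmic factors hidden by the $\widetilde{O}$ notation and uses only the condition $n \leq T^{1/3}$ stated in the abstract:

```latex
n \leq T^{1/3}
\;\Longrightarrow\;
\sqrt{nT} \;\leq\; \sqrt{T^{1/3} \cdot T} \;=\; \sqrt{T^{4/3}} \;=\; T^{2/3}.
```

So in this regime the $\sqrt{nT}$ rate is never worse than $T^{2/3}$, and it is strictly smaller whenever $n < T^{1/3}$. For a constant number of pieces it scales as $\sqrt{T}$.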