🤖 AI Summary
This paper studies online contextual pricing with strategic buyers, who can misreport valuations to manipulate future prices, a phenomenon known as "strategic overfitting." To address this, the authors introduce *policy-robust regret*, a regret notion for multi-buyer settings that accounts for worst-case strategic behavior. They first give a polynomial-time approximation scheme (PTAS), built on an online sketching technique, for learning linear pricing policies against adversarial and adaptive buyers. Building on it, they design the *Sparse Update Mechanism (SUM)*, presented as the first algorithm robust across all Nash equilibria in multi-buyer settings, and derive a black-box reduction that turns online expert algorithms into strategy-robust learners. The method is reported to achieve the optimal worst-case regret bound.
📝 Abstract
Learning effective pricing strategies is crucial in digital marketplaces, especially when buyers' valuations are unknown and must be inferred through interaction. We study the online contextual pricing problem, where a seller observes a stream of context-valuation pairs and dynamically sets prices. Departing from traditional online learning frameworks, we consider a strategic setting in which buyers may misreport valuations to influence future prices, a challenge known as strategic overfitting (Amin et al., 2013).
We introduce a strategy-robust notion of regret for multi-buyer online environments, capturing worst-case strategic behavior in the spirit of the Price of Anarchy. Our first contribution is a polynomial-time approximation scheme (PTAS) for learning linear pricing policies in adversarial, adaptive environments, enabled by a novel online sketching technique. Building on this result, we propose our main construction: the Sparse Update Mechanism (SUM), a simple yet effective sequential mechanism that ensures robustness to all Nash equilibria among buyers. Our construction also yields a black-box reduction from online expert algorithms to strategy-robust learners.
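To make the setting concrete, the following is a minimal sketch of the non-strategic baseline that the abstract departs from: a standard Hedge (multiplicative weights) learner run over a small finite set of linear pricing policies. It is not the Sparse Update Mechanism, the PTAS, or the sketching technique, none of which are specified here. The posted-price revenue model (the seller earns the price only if it does not exceed the valuation), the clipping of prices to [0, 1], and the names `revenue`, `linear_price`, and `hedge_pricing` are all assumptions made for illustration.

```python
import math


def revenue(price, valuation):
    """Assumed posted-price model: a sale happens only if price <= valuation."""
    return price if price <= valuation else 0.0


def linear_price(weights, context):
    """Price proposed by a linear policy <weights, context>, clipped to [0, 1]."""
    raw = sum(w * x for w, x in zip(weights, context))
    return min(1.0, max(0.0, raw))


def hedge_pricing(policies, stream, eta):
    """Run Hedge over a finite set of linear policies (the 'experts').

    `stream` is a sequence of (context, reported_valuation) pairs with
    valuations in [0, 1]. Reports are taken at face value here, which is
    exactly the assumption a strategic buyer could exploit.
    Returns (expected seller revenue, cumulative revenue of each policy).
    """
    cumulative = [0.0] * len(policies)
    seller_revenue = 0.0
    for context, valuation in stream:
        # Hedge weights are proportional to exp(eta * cumulative revenue);
        # subtracting the maximum keeps the exponentials numerically stable.
        top = max(cumulative)
        weights = [math.exp(eta * (c - top)) for c in cumulative]
        total = sum(weights)
        probs = [w / total for w in weights]

        # Full-information feedback: every policy's counterfactual revenue
        # can be computed once the valuation is reported.
        gains = [revenue(linear_price(p, context), valuation) for p in policies]
        seller_revenue += sum(q * g for q, g in zip(probs, gains))
        cumulative = [c + g for c, g in zip(cumulative, gains)]
    return seller_revenue, cumulative


# Usage: three constant-price policies facing a fixed reported valuation of 0.6.
policies = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0)]
stream = [((1.0, 0.0), 0.6)] * 50
eta = math.sqrt(8 * math.log(len(policies)) / len(stream))
seller, per_policy = hedge_pricing(policies, stream, eta)
print("regret against the best fixed policy:", max(per_policy) - seller)
```

Because the learner updates on every reported valuation, a buyer who underreports early can steer the weights toward low-price policies. That vulnerability is what a strategy-robust notion of regret is meant to rule out.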