Data-Driven Automated Mechanism Design using Multi-Agent Revealed Preferences

📅 2024-04-23
📈 Citations: 1
Influential: 0
🤖 AI Summary
This paper addresses mechanism design for social optimality in black-box multi-agent games: given unknown agent utility functions and only observable Nash equilibrium strategy responses, how can a socially optimal mechanism be automatically constructed via active querying and observation? The paper proposes a novel necessary-and-sufficient revealed-preference test for Pareto optimality; develops a mechanism design framework that jointly integrates inverse reinforcement learning with policy gradient methods; and establishes a theoretical connection between the loss function and robust revealed-preference metrics. The authors prove that the algorithm converges to a mechanism minimizing the Pareto gap, derive finite-sample concentration bounds for convergence, and extend the framework to distributionally robust mechanism design under partial strategy observations.

📝 Abstract
Suppose a black box, representing multiple agents, generates decisions from a mixed-strategy Nash equilibrium of a game. Assume that we can choose the input vector to the black box and this affects the utilities of the agents, but we do not know the utilities of the individual agents. By observing the decisions from the black box, how can we steer the Nash equilibrium to a socially optimal point? This paper constructs a reinforcement learning (RL) framework for adaptively achieving this mechanism design objective. We first derive a novel multi-agent revealed preference test for Pareto optimality -- this yields necessary and sufficient conditions for the existence of utility functions under which empirically observed mixed-strategy Nash equilibria are socially optimal. These conditions take the form of a testable linear program, and this result is of independent interest. We utilize this result to construct an inverse reinforcement learning (IRL) step to determine the Pareto gap, i.e., the distance of observed strategies from Pareto optimality. We pair this IRL step with an RL policy gradient algorithm and prove convergence to a mechanism which minimizes the Pareto gap, thereby inducing social optimality in equilibrium strategies. We also reveal an intimate connection between our constructed loss function and several robust revealed preference metrics; this allows us to reason about algorithmic suboptimality through the lens of these well-established microeconomic principles. Finally, in the case when only finitely many i.i.d. samples from mixed strategies (partial strategy specifications) are available, we derive concentration bounds for our algorithm's convergence, and we construct a distributionally robust RL procedure which achieves mechanism design for the fully specified strategies.
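To make the phrase "testable linear program" concrete, the sketch below implements the classical single-agent Afriat revealed-preference test, of which the paper's multi-agent Pareto test is a generalization. The feasibility program checks whether any concave utility rationalizes observed price/demand data. This is an illustrative stand-in, not the paper's multi-agent construction, and the price/demand data in the usage example are hypothetical.

```python
# Classical Afriat revealed-preference test as a linear feasibility program.
# NOT the paper's multi-agent Pareto test -- a simpler single-agent cousin,
# included only to illustrate what a "testable linear program" looks like.
import numpy as np
from scipy.optimize import linprog

def afriat_test(prices, demands):
    """Return True iff some concave, monotone utility rationalizes the data.

    Feasibility of the Afriat inequalities in variables (u_t, lam_t):
        u_s <= u_t + lam_t * p_t . (x_s - x_t)   for all s != t
        lam_t >= 1                                (normalization)
    Variables are ordered [u_1..u_T, lam_1..lam_T].
    """
    prices = np.asarray(prices, dtype=float)
    demands = np.asarray(demands, dtype=float)
    T = len(prices)
    A_ub, b_ub = [], []
    for t in range(T):
        for s in range(T):
            if s == t:
                continue
            row = np.zeros(2 * T)
            row[s] = 1.0                                          # +u_s
            row[t] = -1.0                                         # -u_t
            row[T + t] = -prices[t] @ (demands[s] - demands[t])   # -lam_t * p_t.(x_s - x_t)
            A_ub.append(row)
            b_ub.append(0.0)
    bounds = [(None, None)] * T + [(1.0, None)] * T  # u_t free, lam_t >= 1
    res = linprog(c=np.zeros(2 * T), A_ub=np.array(A_ub), b_ub=b_ub, bounds=bounds)
    return bool(res.success)  # feasible <=> rationalizable
```

For example, Cobb-Douglas demand data such as `afriat_test([(1, 1), (2, 1)], [(1, 1), (0.75, 1.5)])` passes, while data containing a strict revealed-preference cycle, such as `afriat_test([(1, 1), (1, 2)], [(3, 0), (0, 2)])`, fails.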
Problem

Research questions and friction points this paper is trying to address.

Steer the Nash equilibrium of a black-box game to a socially optimal point using multi-agent revealed preferences.
Develop a reinforcement learning framework for adaptive mechanism design.
Derive concentration bounds on algorithm convergence from finitely many strategy samples.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Reinforcement learning framework for mechanism design
Multi-agent revealed preference test for Pareto optimality
Inverse reinforcement learning step to measure the Pareto gap, paired with policy-gradient RL to minimize it
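The overall loop pairing a Pareto-gap estimate with a gradient update on the mechanism can be caricatured as follows. Everything here is a toy stand-in: `gap_estimate` is a hypothetical noisy quadratic surrogate, not the paper's LP-based IRL estimator, and the two-point finite-difference update substitutes for the paper's policy-gradient step.

```python
# Toy sketch of the outer loop: stochastic gradient descent on an
# estimated Pareto-gap loss over a scalar mechanism parameter theta.
# gap_estimate is a hypothetical surrogate, minimized at theta = 1.5.
import numpy as np

rng = np.random.default_rng(0)

def gap_estimate(theta):
    # Stand-in for the IRL/linear-program Pareto-gap estimate:
    # a noisy quadratic with its minimum at theta = 1.5.
    return (theta - 1.5) ** 2 + 0.005 * rng.standard_normal()

def tune_mechanism(theta=0.0, lr=0.1, eps=1e-2, iters=500):
    for _ in range(iters):
        # Two-point finite-difference estimate of the gap's gradient,
        # playing the role of the policy-gradient step.
        g = (gap_estimate(theta + eps) - gap_estimate(theta - eps)) / (2 * eps)
        theta -= lr * g
    return theta
```

Under these toy assumptions, `tune_mechanism()` settles near the gap-minimizing mechanism parameter 1.5, mirroring the paper's claim that the iterates converge to a mechanism minimizing the Pareto gap.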
Authors
Luke Snow
School of Electrical and Computer Engineering, Cornell University, Ithaca, NY 14853, USA
Vikram Krishnamurthy
Professor, Cornell University
Research interests: controlled sensing, statistical signal processing, social networks, inverse reinforcement learning