Data-Driven Automated Mechanism Design using Multi-Agent Revealed Preferences

📅 2024-04-23
📈 Citations: 1
Influential: 0
🤖 AI Summary
This paper addresses mechanism design for social optimality in black-box multi-agent games: given unknown agent utility functions and only observable Nash equilibrium strategy responses, how can a socially optimal mechanism be automatically constructed via active querying and observation? The paper proposes a novel necessary-and-sufficient revealed-preference test for Pareto optimality; develops a mechanism design framework that jointly integrates inverse reinforcement learning with policy gradient methods; and establishes a theoretical connection between the loss function and robust revealed-preference metrics. The authors prove that the algorithm converges to a mechanism minimizing the Pareto gap, derive finite-sample concentration bounds for convergence, and extend the framework to distributionally robust mechanism design under partial strategy observations.

📝 Abstract
Suppose a black box, representing multiple agents, generates decisions from a mixed-strategy Nash equilibrium of a game. Assume that we can choose the input vector to the black box and this affects the utilities of the agents, but we do not know the utilities of the individual agents. By observing the decisions from the black box, how can we steer the Nash equilibrium to a socially optimal point? This paper constructs a reinforcement learning (RL) framework for adaptively achieving this mechanism design objective. We first derive a novel multi-agent revealed preference test for Pareto optimality -- this yields necessary and sufficient conditions for the existence of utility functions under which empirically observed mixed-strategy Nash equilibria are socially optimal. These conditions take the form of a testable linear program, and this result is of independent interest. We utilize this result to construct an inverse reinforcement learning (IRL) step to determine the Pareto gap, i.e., the distance of observed strategies from Pareto optimality. We pair this IRL step with an RL policy gradient algorithm and prove convergence to a mechanism which minimizes the Pareto gap, thereby inducing social optimality in equilibrium strategies. We also reveal an intimate connection between our constructed loss function and several robust revealed preference metrics; this allows us to reason about algorithmic suboptimality through the lens of these well-established microeconomic principles. Finally, in the case when only finitely many i.i.d. samples from mixed strategies (partial strategy specifications) are available, we derive concentration bounds for our algorithm's convergence, and we construct a distributionally robust RL procedure which achieves mechanism design for the fully specified strategies.
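To make the phrase "testable linear program" concrete, the sketch below implements the classical single-agent Afriat revealed-preference test, of which the paper's multi-agent Pareto test is a generalization. The feasibility program checks whether any concave utility rationalizes observed price/demand data. This is an illustrative stand-in, not the paper's multi-agent construction, and the price/demand data in the usage example are hypothetical.

```python
# Classical Afriat revealed-preference test as a linear feasibility program.
# NOT the paper's multi-agent Pareto test -- a simpler single-agent cousin,
# included only to illustrate what a "testable linear program" looks like.
import numpy as np
from scipy.optimize import linprog

def afriat_test(prices, demands):
    """Return True iff some concave, monotone utility rationalizes the data.

    Feasibility of the Afriat inequalities in variables (u_t, lam_t):
        u_s <= u_t + lam_t * p_t . (x_s - x_t)   for all s != t
        lam_t >= 1                                (normalization)
    Variables are ordered [u_1..u_T, lam_1..lam_T].
    """
    prices = np.asarray(prices, dtype=float)
    demands = np.asarray(demands, dtype=float)
    T = len(prices)
    A_ub, b_ub = [], []
    for t in range(T):
        for s in range(T):
            if s == t:
                continue
            row = np.zeros(2 * T)
            row[s] = 1.0                                          # +u_s
            row[t] = -1.0                                         # -u_t
            row[T + t] = -prices[t] @ (demands[s] - demands[t])   # -lam_t * p_t.(x_s - x_t)
            A_ub.append(row)
            b_ub.append(0.0)
    bounds = [(None, None)] * T + [(1.0, None)] * T  # u_t free, lam_t >= 1
    res = linprog(c=np.zeros(2 * T), A_ub=np.array(A_ub), b_ub=b_ub, bounds=bounds)
    return bool(res.success)  # feasible <=> rationalizable
```

For example, Cobb-Douglas demand data such as `afriat_test([(1, 1), (2, 1)], [(1, 1), (0.75, 1.5)])` passes, while data containing a strict revealed-preference cycle, such as `afriat_test([(1, 1), (1, 2)], [(3, 0), (0, 2)])`, fails.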
Problem

Research questions and friction points this paper is trying to address.

Steer the Nash equilibrium of a black-box game to a socially optimal point using multi-agent revealed preferences.
Develop a reinforcement learning framework for adaptive mechanism design.
Derive concentration bounds on algorithm convergence from finitely many strategy samples.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Reinforcement learning framework for mechanism design
Multi-agent revealed preference test for Pareto optimality
Inverse reinforcement learning step to measure the Pareto gap, paired with policy-gradient RL to minimize it
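The overall loop pairing a Pareto-gap estimate with a gradient update on the mechanism can be caricatured as follows. Everything here is a toy stand-in: `gap_estimate` is a hypothetical noisy quadratic surrogate, not the paper's LP-based IRL estimator, and the two-point finite-difference update substitutes for the paper's policy-gradient step.

```python
# Toy sketch of the outer loop: stochastic gradient descent on an
# estimated Pareto-gap loss over a scalar mechanism parameter theta.
# gap_estimate is a hypothetical surrogate, minimized at theta = 1.5.
import numpy as np

rng = np.random.default_rng(0)

def gap_estimate(theta):
    # Stand-in for the IRL/linear-program Pareto-gap estimate:
    # a noisy quadratic with its minimum at theta = 1.5.
    return (theta - 1.5) ** 2 + 0.005 * rng.standard_normal()

def tune_mechanism(theta=0.0, lr=0.1, eps=1e-2, iters=500):
    for _ in range(iters):
        # Two-point finite-difference estimate of the gap's gradient,
        # playing the role of the policy-gradient step.
        g = (gap_estimate(theta + eps) - gap_estimate(theta - eps)) / (2 * eps)
        theta -= lr * g
    return theta
```

Under these toy assumptions, `tune_mechanism()` settles near the gap-minimizing mechanism parameter 1.5, mirroring the paper's claim that the iterates converge to a mechanism minimizing the Pareto gap.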
Authors
Luke Snow
School of Electrical and Computer Engineering, Cornell University, Ithaca, NY 14853, USA
Vikram Krishnamurthy
Professor, Cornell University
Research interests: controlled sensing, statistical signal processing, social networks, inverse reinforcement learning