No-Regret Learning in Stackelberg Games with an Application to Electric Ride-Hailing

📅 2025-04-01
📈 Citations: 0
Influential: 0
🤖 AI Summary
In single-leader–multiple-follower Stackelberg games, the leader often lacks prior knowledge of the followers' objectives, constraints, or private information, which poses a fundamental challenge for equilibrium learning. Method: We propose a gradient-free, black-box online learning framework that combines kernel ridge regression for modeling the leader's cost function with no-regret learning and Stackelberg equilibrium approximation analysis, without accessing the followers' private data or gradients. Contribution/Results: Our algorithm converges to an $\varepsilon$-Stackelberg equilibrium in $O(\sqrt{T})$ rounds and attains a regret bound of the same order. To our knowledge, this is the first fully black-box Stackelberg learning algorithm with provable convergence guarantees. Empirical evaluation on joint electric ride-hailing fleet scheduling and charging pricing demonstrates significant improvements in system efficiency and charging load balancing.
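The loop the summary describes treats the followers' equilibrium play as a black box: the leader commits to an action (e.g., a charging-price vector), observes only the realized cost at the followers' approximate Nash response, and fits a kernel ridge surrogate to that cost to choose the next action. Below is a minimal sketch of such a loop, not the paper's exact algorithm: the RBF kernel, the UCB-style acquisition rule, and all function and parameter names are illustrative assumptions.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # Gaussian (RBF) kernel matrix between the rows of A and the rows of B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def kernel_ucb_leader(oracle, candidates, T=50, lam=1.0, beta=2.0, gamma=1.0):
    """Black-box leader learning over a finite set of candidate actions.

    Each round: pick an action, observe the realized leader cost at the
    followers' (approximate) equilibrium via `oracle`, refit a kernel ridge
    surrogate, and choose the next action by a lower-confidence bound
    (optimism in the face of uncertainty, applied to cost minimization).
    """
    X, y = [], []
    for t in range(T):
        if not X:
            idx = 0  # deterministic first query for reproducibility
        else:
            Xa = np.array(X)
            K = rbf_kernel(Xa, Xa, gamma) + lam * np.eye(len(X))
            Kinv = np.linalg.inv(K)
            k_star = rbf_kernel(candidates, Xa, gamma)   # shape (n_cand, t)
            mu = k_star @ Kinv @ np.array(y)             # surrogate mean cost
            # Posterior-style variance of the kernel ridge predictor.
            var = 1.0 - np.einsum('ij,jk,ik->i', k_star, Kinv, k_star)
            lcb = mu - beta * np.sqrt(np.clip(var, 0.0, None))
            idx = int(np.argmin(lcb))
        x = candidates[idx]
        X.append(x)
        y.append(oracle(x))  # realized cost from the black-box lower level
    best = int(np.argmin(y))
    return X[best], y[best]
```

In this sketch the `oracle` stands in for the entire lower-level game: it would internally run the followers' best-response dynamics to an approximate Nash equilibrium and report only the leader's realized cost, so no follower utilities or gradients are ever exposed.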

📝 Abstract
We consider the problem of efficiently learning to play single-leader multi-follower Stackelberg games when the leader lacks knowledge of the lower-level game. Such games arise in hierarchical decision-making problems involving self-interested agents. For example, in electric ride-hailing markets, a central authority aims to learn optimal charging prices to shape fleet distributions and charging patterns of ride-hailing companies. Existing works typically apply gradient-based methods to find the leader's optimal strategy. Such methods are impractical as they require that the followers share private utility information with the leader. Instead, we treat the lower-level game as a black box, assuming only that the followers' interactions approximate a Nash equilibrium while the leader observes the realized cost of the resulting approximation. Under kernel-based regularity assumptions on the leader's cost function, we develop a no-regret algorithm that converges to an $\epsilon$-Stackelberg equilibrium in $O(\sqrt{T})$ rounds. Finally, we validate our approach through a numerical case study on optimal pricing in electric ride-hailing markets.
Problem

Research questions and friction points this paper is trying to address.

Learning Stackelberg games without follower utility knowledge
Optimizing charging prices in electric ride-hailing markets
Achieving no-regret convergence in hierarchical decision-making
Innovation

Methods, ideas, or system contributions that make the work stand out.

No-regret learning in Stackelberg games
Black-box treatment of lower-level game
Kernel-based cost function assumptions