Achieving constant regret for dynamic matching via state-independent policies

📅 2025-03-12
🤖 AI Summary
This paper studies how to achieve constant regret in dynamic two-way matching when real-time queue-length information is unavailable and only state-independent policies are permitted. For models with finitely many agent types and stochastic arrivals, motivated by applications such as kidney exchange, it designs a randomized state-independent greedy policy that attains constant regret with the optimal $O(\epsilon^{-1})$ scaling on general two-way matching networks, matching the lower bound of Kerimov et al. [2024]. Here $\epsilon$ is the general position gap (GPG), which measures how far the fluid relaxation is from degeneracy. For acyclic networks, the paper analyzes the deterministic static priority policy of Kerimov et al. [2023] and derives the first explicit regret bound in terms of $\epsilon$. Because these policies rely only on agent availability across types, they require less information and offer more transparency than state-dependent policies.
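
To make the fluid relaxation concrete, here is one common way to write such a linear program. It is a sketch only: the symbols $\lambda_i$, $r_m$, and $x_m$ are assumptions chosen for illustration and are not taken from the paper, whose exact formulation may differ.

```latex
% Assumed notation (not from the paper):
%   \lambda_i : arrival rate of agent type i
%   r_m       : reward collected each time match m is performed
%   x_m       : long-run rate at which match m is performed
\begin{align*}
\max_{x \ge 0} \quad & \sum_{m} r_m x_m \\
\text{s.t.} \quad & \sum_{m \,:\, i \in m} x_m \le \lambda_i
  \quad \text{for every agent type } i.
\end{align*}
```

Loosely speaking, a gap parameter such as $\epsilon$ quantifies how far the optimal solution of a program like this is from a degenerate one. The precise definition of the GPG is given in the paper.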

📝 Abstract
We study a centralized discrete-time dynamic two-way matching model with finitely many agent types. Agents arrive stochastically over time and join their type-dedicated queues waiting to be matched. We focus on state-independent greedy policies that achieve constant regret at all times by making matching decisions based solely on agent availability across types, rather than requiring complete queue-length information. Such policies are particularly appealing for life-saving applications such as kidney exchange, as they require less information and provide more transparency compared to state-dependent policies. First, for acyclic matching networks, we analyze a deterministic priority policy proposed by Kerimov et al. [2023] that follows a static priority order over matches. We derive the first explicit regret bound in terms of the general position gap (GPG) parameter $\epsilon$, which measures the distance of the fluid relaxation from degeneracy. Second, for general two-way matching networks, we design a randomized state-independent greedy policy that achieves constant regret with optimal scaling $O(\epsilon^{-1})$, matching the existing lower bound established by Kerimov et al. [2024].
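
The idea of a static priority order that looks only at agent availability can be illustrated with a small simulation. The sketch below is not the policy analyzed in the paper. It assumes one arrival per period, matches between two distinct types, and a hypothetical function name `simulate_static_priority`. The network, rewards, and arrival probabilities in the usage note are made up for illustration.

```python
import random
from collections import Counter


def simulate_static_priority(arrival_probs, priority_matches, rewards,
                             horizon, seed=0):
    """Simulate a greedy policy that follows a fixed priority order.

    Assumptions (illustrative, not from the paper): exactly one agent
    arrives per period, and each match pairs two distinct agent types.
    The policy only checks whether a queue is non-empty; it never uses
    the actual queue lengths to choose a match.
    """
    for match in priority_matches:
        assert len(set(match)) == len(match), "types in a match must differ"

    rng = random.Random(seed)
    types = list(arrival_probs)
    weights = [arrival_probs[t] for t in types]
    queues = Counter()          # waiting agents per type
    total_reward = 0.0

    for _ in range(horizon):
        # Draw the type of the arriving agent and add it to its queue.
        arrived = rng.choices(types, weights=weights)[0]
        queues[arrived] += 1

        # Greedily perform available matches, highest priority first,
        # until no match in the list is available.
        performed = True
        while performed:
            performed = False
            for match in priority_matches:
                if all(queues[t] > 0 for t in match):   # availability only
                    for t in match:
                        queues[t] -= 1
                    total_reward += rewards[match]
                    performed = True
                    break

    return total_reward, dict(queues)
```

For example, on a path network with types A, B, C, calling `simulate_static_priority({"A": 0.4, "B": 0.4, "C": 0.2}, [("A", "B"), ("B", "C")], {("A", "B"): 3.0, ("B", "C"): 1.0}, horizon=1000)` gives match (A, B) priority over (B, C) whenever both are available. Because the policy is greedy, no match in the list has all of its types waiting at the end of a period.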
Problem

Research questions and friction points this paper is trying to address.

Dynamic two-way matching with stochastic agent arrivals.
State-independent policies for constant regret in matching.
Applications in life-saving scenarios like kidney exchange.
Innovation

Methods, ideas, or system contributions that make the work stand out.

State-independent greedy policies for dynamic matching
Deterministic priority policy for acyclic networks
Randomized greedy policy with optimal regret scaling
Suleyman Kerimov
Jones Graduate School of Business, Rice University, Houston TX, USA
Mingwei Yang
Stanford University
Theoretical Computer Science
Sophie H. Yu
The Wharton School of Business, University of Pennsylvania, Philadelphia PA, USA