Transfer Learning for Nonparametric Contextual Dynamic Pricing

📅 2025-01-31
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses dynamic pricing when historical data are scarce, as is common for new products or new markets. Method: Under a covariate shift assumption (the marginal distributions of covariates differ between source and target domains while the reward function is shared), it uses pre-collected nonparametric contextual pricing data from a source domain (e.g., a similar product or market) to improve decisions in the target domain. It proposes the Transfer Learning for Dynamic Pricing (TLDP) algorithm and establishes its regret upper bound under a Lipschitz condition on the reward function. Contribution/Results: The work derives a matching minimax lower bound, which establishes the optimality of TLDP, includes the target-only setting as a special case, and is reported as the first such bound in the literature. Numerical experiments indicate that TLDP outperforms existing methods when target data are limited.

📝 Abstract
Dynamic pricing strategies are crucial for firms to maximize revenue by adjusting prices based on market conditions and customer characteristics. However, designing optimal pricing strategies becomes challenging when historical data are limited, as is often the case when launching new products or entering new markets. One promising approach to overcome this limitation is to leverage information from related products or markets to inform the focal pricing decisions. In this paper, we explore transfer learning for nonparametric contextual dynamic pricing under a covariate shift model, where the marginal distributions of covariates differ between source and target domains while the reward functions remain the same. We propose a novel Transfer Learning for Dynamic Pricing (TLDP) algorithm that can effectively leverage pre-collected data from a source domain to enhance pricing decisions in the target domain. The regret upper bound of TLDP is established under a simple Lipschitz condition on the reward function. To establish the optimality of TLDP, we further derive a matching minimax lower bound, which includes the target-only scenario as a special case and is presented for the first time in the literature. Extensive numerical experiments validate our approach, demonstrating its superiority over existing methods and highlighting its practical utility in real-world applications.
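
The covariate shift setting described above can be illustrated with a small simulation. The sketch below is not the authors' TLDP algorithm: it swaps in a simple binned UCB rule, and every name in it (`purchase_prob`, `BinnedUCBPricer`, `run_target`, the Beta(2, 5) source distribution, the demand curve) is a hypothetical choice made for illustration. It only shows the general idea that source data sharing the same reward function, but drawn under a different covariate distribution, can warm-start pricing decisions in the target domain.

```python
import math
import random


def purchase_prob(x, p):
    # Hypothetical demand curve shared by source and target (same reward function).
    # Stays in [0, 1] for x, p in [0, 1].
    return 1.0 - p * (1.0 - 0.5 * x)


def expected_revenue(x, p):
    return p * purchase_prob(x, p)


class BinnedUCBPricer:
    """Piecewise-constant UCB pricing over covariate bins (illustrative, not TLDP)."""

    def __init__(self, n_bins, prices):
        self.n_bins = n_bins
        self.prices = list(prices)
        self.count = [[0] * len(self.prices) for _ in range(n_bins)]
        self.mean = [[0.0] * len(self.prices) for _ in range(n_bins)]

    def _bin(self, x):
        return min(int(x * self.n_bins), self.n_bins - 1)

    def update(self, x, price_idx, revenue):
        b = self._bin(x)
        self.count[b][price_idx] += 1
        n = self.count[b][price_idx]
        # Running average of observed revenue in this (bin, price) cell.
        self.mean[b][price_idx] += (revenue - self.mean[b][price_idx]) / n

    def choose(self, x):
        b = self._bin(x)
        total = sum(self.count[b]) + 1
        best_idx, best_score = 0, -math.inf
        for j, n in enumerate(self.count[b]):
            if n == 0:
                return j  # try every price in a bin at least once
            score = self.mean[b][j] + math.sqrt(2.0 * math.log(total) / n)
            if score > best_score:
                best_idx, best_score = j, score
        return best_idx


def run_target(T, n_source, seed, n_bins=4, n_prices=6):
    rng = random.Random(seed)
    prices = [(j + 1) / n_prices for j in range(n_prices)]
    pricer = BinnedUCBPricer(n_bins, prices)

    # Source domain: covariates from Beta(2, 5) (assumed), prices logged at random.
    for _ in range(n_source):
        x = rng.betavariate(2, 5)
        j = rng.randrange(n_prices)
        sold = rng.random() < purchase_prob(x, prices[j])
        pricer.update(x, j, prices[j] if sold else 0.0)

    # Target domain: covariates are uniform, so the marginal distribution shifts.
    regret = 0.0
    for _ in range(T):
        x = rng.random()
        j = pricer.choose(x)
        sold = rng.random() < purchase_prob(x, prices[j])
        pricer.update(x, j, prices[j] if sold else 0.0)
        best = max(expected_revenue(x, p) for p in prices)
        regret += best - expected_revenue(x, prices[j])
    return pricer, regret


if __name__ == "__main__":
    for n_source in (0, 2000):
        _, regret = run_target(T=2000, n_source=n_source, seed=0)
        print(f"source samples={n_source:5d}  target regret={regret:.1f}")
```

Regret here is measured against the best price on the grid, using the true expected revenue, so each round contributes a non-negative amount. How much the source data helps in this toy depends on how well the source covariates cover the target bins, which is the kind of dependence the covariate shift model is meant to capture.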
Problem

Research questions and friction points this paper is trying to address.

Dynamic Pricing
New Product
Revenue Maximization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Transfer Learning
Dynamic Pricing
Algorithmic Superiority
Fan Wang
Department of Statistics, University of Warwick
Feiyu Jiang
Fudan University
statistics, econometrics
Zifeng Zhao
University of Notre Dame
change-point analysis, online learning, copula, extreme value theory, time series
Yi Yu
Department of Statistics, University of Warwick