🤖 AI Summary
This study addresses the dynamic joint product selection and ranking problem, in which a product's attractiveness depends on both its intrinsic quality and its display position. Under a multinomial logit (MNL) choice model, the authors develop online learning algorithms that update decisions after every round of feedback, for both multiplicative and general position effects, which supports the prompt decisions that platforms need. The key contribution is a pair of matching upper and lower regret bounds for each model; for the multiplicative model, this closes the √K gap left by prior epoch-based analyses. The approach combines cross-position pairwise maximum likelihood estimation, upper confidence bound (UCB) strategies, Dinkelbach's method, and maximum-weight bipartite matching. In experiments on synthetic data and the real-world Expedia dataset, the algorithms consistently outperform existing benchmarks.
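As a rough illustration of the choice model described above, in notation assumed here rather than taken from the paper: let $v_i$ be the intrinsic attraction of product $i$, $\lambda_k$ the factor of position $k$, $\sigma(k)$ the product shown in position $k$, and let the no-purchase option have attraction normalized to 1. Under the multiplicative model, the MNL probability that a customer picks the product in position $k$ would then read:

$$
P(\text{choose position } k) = \frac{\lambda_k \, v_{\sigma(k)}}{1 + \sum_{k'=1}^{K} \lambda_{k'} \, v_{\sigma(k')}}
$$

The general position effects model replaces each product $\lambda_k v_i$ with a free parameter $v_{ik}$ for the product-position pair.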
📝 Abstract
We study the dynamic joint assortment selection and positioning problem, where the attraction of each product depends on both its intrinsic appeal and its display position under a Multinomial Logit (MNL) choice framework. We consider two models: the multiplicative position effects model, in which each product's attraction is scaled by a position-specific factor, and a general position effects model, which assigns an independent attraction parameter to every product--position pair to capture heterogeneous synergies. For both models, we design round-based learning algorithms that update decisions after every single round of feedback, and we establish the first regret-optimal characterization. These round-based updates also provide the prompt decisions that modern platforms need. For the multiplicative model, we develop a cross-position pairwise maximum likelihood estimator with a clipping mechanism, and prove that our algorithm P2MLE-UCB attains a regret of $\tilde{O}(\sqrt{NT})$, matching the lower bound and closing the $\sqrt{K}$ gap left by prior epoch-based analyses. For the general model, we establish a minimax lower bound and propose GP2-UCB with a matching upper bound. Moreover, we design an efficient subroutine for the per-round joint assortment and positioning optimization based on Dinkelbach's method and maximum-weight bipartite matching. Numerical experiments on synthetic data and the Expedia dataset show that our algorithms consistently outperform state-of-the-art benchmarks.
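The per-round optimization mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's subroutine: it assumes known attraction values `v[i][k]` for product `i` in position `k` (the general model), known revenues `r[i]`, a no-purchase attraction of 1, at least as many products as positions, and that positions may be left empty. The names `hungarian` and `best_display` are hypothetical. Dinkelbach's method turns the fractional objective (expected revenue) into a sequence of linear problems in a parameter `lam`, and each linear problem is a maximum-weight bipartite matching between positions and products, solved here with a plain Hungarian algorithm.

```python
def hungarian(cost):
    """Min-cost assignment of every row to a distinct column (rows <= columns).

    Standard O(n^2 m) Hungarian algorithm with potentials u, v.
    Returns a list `assign` with assign[row] = chosen column.
    """
    n, m = len(cost), len(cost[0])
    INF = float("inf")
    u = [0.0] * (n + 1)
    v = [0.0] * (m + 1)
    p = [0] * (m + 1)      # p[j] = row (1-based) matched to column j, 0 if free
    way = [0] * (m + 1)    # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [INF] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], INF, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0 != 0:     # flip the augmenting path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    assign = [-1] * n
    for j in range(1, m + 1):
        if p[j] != 0:
            assign[p[j] - 1] = j - 1
    return assign


def best_display(r, v, tol=1e-9, max_iter=100):
    """Dinkelbach iteration for max sum(r*v) / (1 + sum(v)) over displays.

    Returns (display, revenue), where display maps position -> product.
    """
    n_prod, n_pos = len(r), len(v[0])
    lam, display, revenue = 0.0, {}, 0.0
    for _ in range(max_iter):
        # Linearized weight of placing product i in position k.
        w = [[(r[i] - lam) * v[i][k] for i in range(n_prod)] for k in range(n_pos)]
        # Clip at 0 so an unprofitable pairing is equivalent to an empty slot.
        cost = [[-max(x, 0.0) for x in row] for row in w]
        assign = hungarian(cost)
        display = {k: i for k, i in enumerate(assign) if w[k][i] > 0}
        num = sum(r[i] * v[i][k] for k, i in display.items())
        den = 1.0 + sum(v[i][k] for k, i in display.items())
        revenue = num / den
        if num - lam * den <= tol:   # parametric optimum is ~0: lam is optimal
            break
        lam = revenue
    return display, revenue


# Toy instance (made-up numbers): 3 products, 2 positions.
r = [1.0, 0.8, 0.5]
v = [[0.5, 0.2], [1.0, 0.6], [2.0, 1.5]]
display, revenue = best_display(r, v)
print(display, revenue)
```

In a learning loop, the true `v[i][k]` would be replaced by optimistic (upper confidence bound) estimates; that step is omitted here because its exact form is not given in the abstract.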