AI Summary
This study addresses observation bias in partially ranked consumer data that arises from consideration sets. The authors propose a joint modeling approach that treats observed rankings as pairwise comparisons with logistic choice probabilities. Latent utilities are modeled as the sum of interpretable product attributes, item fixed effects, and a low-rank user-item factor structure. To correct for exposure bias, inverse probability weighting (IPW) is incorporated into the framework. The method handles selection bias and information sharing in preference learning simultaneously, and it introduces a scalable stochastic gradient descent algorithm based on inverse-probability resampling. Evaluated on transaction data from an online wine retailer, the model improves out-of-sample recommendation performance relative to a popularity-based benchmark, with particularly strong gains in predicting users' first-time purchases.
Abstract
Estimating consumer preferences is central to many problems in economics and marketing. This paper develops a flexible framework for learning individual preferences from partial ranking information by interpreting observed rankings as collections of pairwise comparisons with logistic choice probabilities. We model latent utility as the sum of interpretable product attributes, item fixed effects, and a low-rank user-item factor structure, enabling both interpretability and information sharing across consumers and items. We further correct for selection in which comparisons are observed: a comparison is recorded only if both items enter the consumer's consideration set, inducing exposure bias toward frequently encountered items. We model pair observability as the product of item-level observability propensities and estimate these propensities with a logistic model for the marginal probability that an item is observable. Preference parameters are then estimated by maximizing an inverse-probability-weighted (IPW), ridge-regularized log-likelihood that reweights observed comparisons toward a target comparison population. To scale computation, we propose a stochastic gradient descent (SGD) algorithm based on inverse-probability resampling, which draws comparisons in proportion to their IPW weights. In an application to transaction data from an online wine retailer, the method improves out-of-sample recommendation performance relative to a popularity-based benchmark, with particularly strong gains in predicting purchases of previously unconsumed products.
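The estimation pipeline described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: all function names, argument names, hyperparameters, and the update schedule are hypothetical. It assumes the item-level observability propensities are already available (the paper estimates them with a logistic model), that each recorded comparison has a distinct winner and loser, and, as a simplification, it applies the ridge penalty only to the parameters touched at each step. Each SGD step draws one comparison with probability proportional to its IPW weight and then takes an unweighted gradient step on the logistic pairwise log-likelihood.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def utility(u, i, X, beta, alpha, P, Q):
    # Latent utility = product attributes + item fixed effect + low-rank user-item term.
    return X[i] @ beta + alpha[i] + P[u] @ Q[i]

def ipw_weights(winners, losers, item_propensity):
    # Pair observability is modeled as the product of the two item-level
    # propensities, so the IPW weight is the inverse of that product.
    return 1.0 / (item_propensity[winners] * item_propensity[losers])

def fit_ipw_resampled_sgd(users, winners, losers, X, item_propensity, n_users,
                          rank=2, lr=0.05, ridge=1e-3, n_steps=5000, seed=0):
    # Hypothetical training loop; hyperparameter defaults are illustrative only.
    users, winners, losers = map(np.asarray, (users, winners, losers))
    rng = np.random.default_rng(seed)
    n_items, n_attr = X.shape
    beta = np.zeros(n_attr)
    alpha = np.zeros(n_items)
    P = 0.1 * rng.standard_normal((n_users, rank))
    Q = 0.1 * rng.standard_normal((n_items, rank))

    # Inverse-probability resampling: sampling probabilities proportional to IPW weights.
    w = ipw_weights(winners, losers, item_propensity)
    probs = w / w.sum()

    for _ in range(n_steps):
        k = rng.choice(len(users), p=probs)
        u, i, j = users[k], winners[k], losers[k]
        diff = (utility(u, i, X, beta, alpha, P, Q)
                - utility(u, j, X, beta, alpha, P, Q))
        g = 1.0 - sigmoid(diff)  # derivative of log sigmoid(diff) w.r.t. diff
        pu = P[u].copy()         # keep the pre-update user factor for the item updates
        # Gradient ascent on the ridge-penalized log-likelihood of "i beats j".
        beta += lr * (g * (X[i] - X[j]) - ridge * beta)
        alpha[i] += lr * (g - ridge * alpha[i])
        alpha[j] += lr * (-g - ridge * alpha[j])
        P[u] += lr * (g * (Q[i] - Q[j]) - ridge * pu)
        Q[i] += lr * (g * pu - ridge * Q[i])
        Q[j] += lr * (-g * pu - ridge * Q[j])
    return beta, alpha, P, Q
```

Because comparisons are drawn in proportion to their weights, each step uses a plain gradient, which avoids the large step sizes that rarely observed, heavily weighted pairs would otherwise produce.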