🤖 AI Summary
Position bias in implicit feedback data degrades learning-to-rank (LTR) performance. To address this, we propose a two-stage debiasing framework grounded in control function methodology: the first stage uses ranking residuals to construct exogenous instrumental variables, and the second stage uses these instruments to obtain unbiased estimates of true relevance within a click model. The approach imposes no parametric assumptions on either the click or the propensity model, supports arbitrary nonlinear ranking models and state-of-the-art LTR algorithms, and additionally introduces validation-set click debiasing to enable unbiased hyperparameter tuning. Evaluated on multiple standard LTR benchmarks, the method achieves consistent NDCG@10 improvements of 3.2–5.7% over strong baselines. Moreover, even without access to unbiased validation data, it reliably selects the optimal model, outperforming existing debiasing methods by a significant margin.
📝 Abstract
Implicit feedback data, such as user clicks, is commonly used in learning-to-rank (LTR) systems because it is easy to collect and often reflects user preferences. However, this data is prone to various biases, and training an LTR system directly on biased data can result in suboptimal ranking performance. One of the most prominent and well-studied biases in implicit feedback data is position bias, which occurs because users are more likely to interact with higher-ranked documents regardless of their true relevance. In this paper, we propose a novel control-function-based method that accounts for position bias in a two-stage process. The first stage uses exogenous variation from the residuals of the ranking process to correct for position bias in the second-stage click equation. Unlike previous position bias correction methods, our method does not require knowledge of the click or propensity model and allows for nonlinearity in the underlying ranking model. Moreover, our method is general and allows for debiasing any state-of-the-art ranking algorithm by plugging it into the second stage. We also introduce a technique to debias validation clicks for hyperparameter tuning, so that the optimal model can be selected in the absence of unbiased validation data. Experimental results demonstrate that our method outperforms state-of-the-art approaches in correcting for position bias.
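The two-stage control-function idea can be sketched on synthetic data. Everything below is an illustrative assumption, not the paper's actual estimator: the simulation, the linear click equation, the variable names (`x`, `z`, `v`), and the use of plain least squares are all simplifications (the paper's method supports nonlinear ranking models and makes no parametric click-model assumptions). The sketch shows the core mechanism: a first-stage residual from the ranking process is added to the second-stage click equation to absorb the confounding between position and relevance.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Hypothetical synthetic data. An unobserved confounder v drives both
# the position a document receives and how often it is clicked, which
# makes position endogenous in the click equation.
x = rng.normal(size=n)                # observed relevance feature
z = rng.normal(size=n)                # exogenous ranking variation (instrument-like)
v = rng.normal(size=n)                # unobserved confounder
eps = rng.normal(scale=0.5, size=n)   # click noise

position = x + z + v                                   # ranking process
clicks = 2.0 * x - 0.8 * position + 1.5 * v + eps      # true position effect: -0.8

# Stage 1: regress position on the feature and the exogenous ranking
# variation; the residual is the control function (it captures the
# endogenous component v of the ranking process).
X1 = np.column_stack([np.ones(n), x, z])
b1, *_ = np.linalg.lstsq(X1, position, rcond=None)
control_fn = position - X1 @ b1

# Stage 2: fit the click equation with and without the control function.
X_naive = np.column_stack([np.ones(n), x, position])
X_cf = np.column_stack([np.ones(n), x, position, control_fn])
b_naive, *_ = np.linalg.lstsq(X_naive, clicks, rcond=None)
b_cf, *_ = np.linalg.lstsq(X_cf, clicks, rcond=None)

print("naive position coefficient:           ", b_naive[2])  # badly biased
print("control-function position coefficient:", b_cf[2])     # close to -0.8
```

The naive regression attributes part of the confounder's effect to position, while adding the stage-1 residual recovers the true position effect; in the paper this same residual-based correction is plugged into an arbitrary second-stage ranking algorithm rather than a linear model.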