🤖 AI Summary
In recommender systems, creator-side randomized experiments can yield biased treatment effect estimates because treated and control content compete for exposure. The bias can go in either direction and can even reverse the estimated sign, so the conventional difference-in-means estimator is unreliable. To address this, we propose a microfounded recommender choice model that combines a structural discrete choice model with deep neural networks, capturing both viewer-content heterogeneity and the competition for exposure within a shared pool of candidate items. Building on this model, we construct a double/debiased estimator of the treatment effect that is consistent and asymptotically normal under regularity conditions. In a field experiment on the Weixin short-video platform, with a blocked double-sided randomization design providing a benchmark free of interference bias, the method substantially reduces estimation bias relative to the standard difference-in-means estimator.
📝 Abstract
Recommender systems are essential to content-sharing platforms because they curate personalized content. To evaluate updates to recommender systems that target content creators, platforms frequently run creator-side randomized experiments. These experiments help estimate treatment effects, defined as the difference in outcomes when a new (vs. the status quo) algorithm is deployed on the platform. We show that the standard difference-in-means estimator can lead to a biased treatment effect estimate. This bias can occur in either direction and may even reverse the estimated sign, leading to incorrect decisions. The bias arises from recommender interference, which occurs when treated and control creators compete for exposure through the recommender system. We propose a "recommender choice model" that captures how an item is chosen from a pool comprising both treated and control content items. By combining a structural choice model with neural networks, the framework directly models the interference pathway in a microfounded way while accounting for rich viewer-content heterogeneity. This model enables counterfactual evaluations of treatment effects under alternative treatment assignments (e.g., all treated or all control) for both the entire population and specific subgroups. Within this modeling framework, we further construct a double/debiased estimator of the treatment effect that is consistent and asymptotically normal under regularity conditions. We demonstrate its empirical performance with a field experiment on the Weixin short-video platform. Besides the standard creator-side experiment, we implement a costly blocked double-sided randomization design to obtain a benchmark estimate of the treatment effect without interference bias. We show that the proposed estimator significantly reduces bias in treatment effect estimates compared to the standard difference-in-means estimator.
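The interference mechanism described in the abstract can be illustrated with a toy simulation. The sketch below is not the paper's model or estimator. It assumes a plain softmax (multinomial logit) choice over a single shared pool, a treatment that adds a constant `boost` to each treated item's score, and expected exposure share as the outcome. All names and parameter values (`exposure_probs`, `simulate`, `boost`, `n_items`) are hypothetical. Under these assumptions the global effect on average exposure is zero, because boosting every item leaves the softmax shares unchanged, yet the difference-in-means estimate from a creator-side experiment is positive because treated items take exposure from control items.

```python
import numpy as np


def exposure_probs(base_scores, treated, boost):
    """Softmax probability that each item wins a single impression.

    Assumption: treatment adds a constant `boost` to an item's score.
    """
    scores = base_scores + boost * treated
    scores = scores - scores.max()  # shift for numerical stability
    weights = np.exp(scores)
    return weights / weights.sum()


def simulate(n_items=200, boost=1.0, seed=0):
    rng = np.random.default_rng(seed)
    base_scores = rng.normal(size=n_items)

    # Creator-side experiment: a random half of the items is treated.
    treated = np.zeros(n_items)
    treated[rng.permutation(n_items)[: n_items // 2]] = 1.0

    # Mixed pool: treated and control items compete for the same impression.
    p_mixed = exposure_probs(base_scores, treated, boost)
    dim_estimate = p_mixed[treated == 1].mean() - p_mixed[treated == 0].mean()

    # Counterfactual pools: everyone treated vs. everyone in control.
    p_all_treated = exposure_probs(base_scores, np.ones(n_items), boost)
    p_all_control = exposure_probs(base_scores, np.zeros(n_items), boost)
    global_effect = p_all_treated.mean() - p_all_control.mean()

    return dim_estimate, global_effect


dim_estimate, global_effect = simulate()
print(f"difference-in-means estimate: {dim_estimate:.6f}")
print(f"global treatment effect:      {global_effect:.6f}")
```

In this toy setting the naive estimate only measures how exposure is reallocated within the pool. The abstract's point is broader: with richer outcomes and heterogeneous viewers, the bias can go in either direction.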
The full paper is available at https://arxiv.org/abs/2406.14380.