🤖 AI Summary
In data-driven optimization, the sample performance of a decision is often optimistically biased relative to its true performance, a phenomenon known as the “optimizer’s curse.” To address this, the authors propose a first-order bias correction that requires no re-optimization. They introduce the Optimizer’s Information Criterion (OIC), an information criterion for decision selection in data-driven optimization that generalizes the Akaike Information Criterion (AIC) and covers empirical models, parametric models, their regularized counterparts, and contextual optimization. The correction approximates the first-order bias directly and accounts for the interplay between model fitting and the downstream optimization, so it avoids the repeated optimization that cross-validation requires. Numerical experiments on synthetic and real-world datasets support the approach.
📝 Abstract
In data-driven optimization, the sample performance of the obtained decision typically incurs an optimistic bias relative to the true performance, a phenomenon commonly known as the Optimizer's Curse and intimately related to overfitting in machine learning. Common techniques to correct this bias, such as cross-validation, require repeatedly solving additional optimization problems and are therefore computationally expensive. We develop a general bias correction approach, building on what we call the Optimizer's Information Criterion (OIC), that directly approximates the first-order bias and does not require solving any additional optimization problems. Our OIC generalizes the celebrated Akaike Information Criterion to evaluate the objective performance in data-driven optimization, which crucially involves not only model fitting but also its interplay with the downstream optimization. As such, it can be used for decision selection instead of only model selection. We apply our approach to a range of data-driven optimization formulations comprising empirical and parametric models, their regularized counterparts, and contextual optimization. Finally, we provide numerical validation of the superior performance of our approach on synthetic and real-world datasets.
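The optimistic bias described above can be seen in a small simulation. The sketch below is an illustration only and is not taken from the paper: it assumes a toy problem (choose x to minimize the expected squared cost E[(x - ξ)²] with Gaussian ξ), and the function name `simulate` and its parameters are hypothetical. For this toy problem the sample-average decision is the sample mean, and a classical first-order penalty of the form (1/n)·J/H (gradient variance over Hessian, in the spirit of AIC-type corrections) can be computed from the same sample without solving any further optimization problem. Whether this matches the exact form of the OIC is not something the abstract confirms.

```python
import random


def simulate(n=20, trials=5000, mu=0.0, sigma=1.0, seed=0):
    """Average in-sample, corrected, and true cost of the sample-average decision.

    Toy problem (an assumption for illustration): minimize E[(x - xi)^2].
    """
    rng = random.Random(seed)
    in_sample = corrected = true_cost = 0.0
    for _ in range(trials):
        xi = [rng.gauss(mu, sigma) for _ in range(n)]
        # The sample-average decision for squared cost is the sample mean.
        x_hat = sum(xi) / n
        # Empirical (in-sample) cost of that decision.
        emp = sum((x_hat - v) ** 2 for v in xi) / n
        # Plug-in estimates: Hessian H = 2, gradient variance J = 4 * s2,
        # where s2 is the unbiased sample variance.
        s2 = emp * n / (n - 1)
        penalty = (1.0 / n) * (4.0 * s2) / 2.0
        in_sample += emp
        corrected += emp + penalty
        # True expected cost of x_hat, available in closed form for this toy.
        true_cost += sigma ** 2 + (x_hat - mu) ** 2
    return in_sample / trials, corrected / trials, true_cost / trials


if __name__ == "__main__":
    ins, cor, tru = simulate()
    print(f"in-sample cost : {ins:.3f}")
    print(f"corrected cost : {cor:.3f}")
    print(f"true cost      : {tru:.3f}")
```

On average the in-sample cost falls below the true cost, while the corrected estimate tracks the true cost closely; the correction uses only quantities from the original sample, which is the computational advantage over cross-validation that the abstract emphasizes.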