Optimizer's Information Criterion: Dissecting and Correcting Bias in Data-Driven Optimization

📅 2023-06-16
🏛️ arXiv.org
📈 Citations: 4 (influential: 1)
🤖 AI Summary
In data-driven optimization, the sample performance of the chosen decision often exhibits an optimistic bias relative to its true performance, a phenomenon known as the “optimizer’s curse.” To address this, we propose a first-order bias correction method that requires no additional optimization solves. We introduce the Optimizer’s Information Criterion (OIC), an information criterion tailored for decision selection in data-driven optimization. It generalizes the Akaike Information Criterion (AIC) and applies to empirical models, parametric models, their regularized counterparts, and contextual optimization. Leveraging asymptotic statistical analysis, we derive an analytical bias expression that captures the interplay between model fitting and the downstream optimization, so the repeated re-optimization required by cross-validation is not needed. Numerical experiments on synthetic and real-world datasets validate the performance of the approach.
📝 Abstract
In data-driven optimization, the sample performance of the obtained decision typically incurs an optimistic bias against the true performance, a phenomenon commonly known as the Optimizer's Curse and intimately related to overfitting in machine learning. Common techniques to correct this bias, such as cross-validation, require repeatedly solving additional optimization problems and are therefore computationally expensive. We develop a general bias correction approach, building on what we call Optimizer's Information Criterion (OIC), that directly approximates the first-order bias and does not require solving any additional optimization problems. Our OIC generalizes the celebrated Akaike Information Criterion to evaluate the objective performance in data-driven optimization, which crucially involves not only model fitting but also its interplay with the downstream optimization. As such it can be used for decision selection instead of only model selection. We apply our approach to a range of data-driven optimization formulations comprising empirical and parametric models, their regularized counterparts, and furthermore contextual optimization. Finally, we provide numerical validation on the superior performance of our approach under synthetic and real-world datasets.
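The optimistic bias described in the abstract can be reproduced in a small simulation. The sketch below uses a toy problem chosen purely for illustration (it is not taken from the paper): minimize E[(x - ξ)^2] with ξ drawn from a normal distribution, where the data-driven decision is the sample mean. The function name `optimizer_curse_gap` and all parameter values are hypothetical.

```python
import random


def optimizer_curse_gap(n=20, sigma=1.0, trials=20000, seed=0):
    """Monte Carlo estimate of in-sample vs. true performance.

    Toy problem (an assumption for illustration, not from the paper):
    minimize E[(x - xi)^2] with xi ~ Normal(0, sigma^2). The empirical
    minimizer over n samples is the sample mean.
    """
    rng = random.Random(seed)
    sample_perf = 0.0
    true_perf = 0.0
    for _ in range(trials):
        data = [rng.gauss(0.0, sigma) for _ in range(n)]
        x_hat = sum(data) / n  # data-driven decision
        # Objective value measured on the same data used to choose x_hat.
        sample_perf += sum((x_hat - d) ** 2 for d in data) / n
        # Exact out-of-sample objective: E[(x_hat - xi)^2] = x_hat^2 + sigma^2.
        true_perf += x_hat ** 2 + sigma ** 2
    return sample_perf / trials, true_perf / trials


sample_perf, true_perf = optimizer_curse_gap()
gap = true_perf - sample_perf  # positive: the in-sample cost looks too good
```

For this toy problem the gap has a closed form, 2σ²/n (0.1 with the defaults above), because the in-sample cost has expectation (n - 1)σ²/n while the true cost of the chosen decision has expectation (n + 1)σ²/n. The simulated `gap` should land close to that value.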
Problem

Research questions and friction points this paper is trying to address.

Correcting optimistic bias in data-driven optimization decisions
Reducing computational cost of bias correction methods
Generalizing Akaike Information Criterion for optimization performance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Develops Optimizer's Information Criterion (OIC)
Directly approximates first-order bias
Avoids solving additional optimization problems
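To make the computational contrast concrete, here is a minimal sketch that compares K-fold cross-validation with an analytic plug-in correction on a toy problem: minimize the average of (x - ξ)^2, whose empirical solution is the sample mean. Everything here is an assumption for illustration. The correction term 2·s²/n is the hand-derived first-order bias for this specific toy problem, not the paper's OIC formula, and the helper names (`solve`, `kfold_estimate`, `plugin_estimate`) are hypothetical.

```python
def solve(data):
    """Empirical minimizer of the average of (x - xi)^2: the sample mean."""
    return sum(data) / len(data)


def avg_cost(x, data):
    """Average squared cost of decision x on the given data."""
    return sum((x - d) ** 2 for d in data) / len(data)


def kfold_estimate(data, k=5):
    """K-fold cross-validation estimate; needs k additional solves."""
    folds = [data[i::k] for i in range(k)]
    total, solves = 0.0, 0
    for i, held_out in enumerate(folds):
        train = [d for j, fold in enumerate(folds) if j != i for d in fold]
        x_fold = solve(train)  # one extra optimization per fold
        solves += 1
        total += avg_cost(x_fold, held_out) * len(held_out)
    return total / len(data), solves


def plugin_estimate(data):
    """In-sample cost plus an analytic bias term; no additional solves."""
    n = len(data)
    x_hat = solve(data)  # the single solve already needed for the decision
    in_sample = avg_cost(x_hat, data)
    s2 = in_sample * n / (n - 1)  # unbiased sample variance
    return in_sample + 2.0 * s2 / n  # toy-problem first-order bias: 2*s^2/n


data = [float(v) for v in range(1, 11)]
cv_est, extra_solves = kfold_estimate(data, k=5)
plug_est = plugin_estimate(data)
```

Both estimates sit above the raw in-sample cost, but the cross-validation route re-solves the problem once per fold, while the plug-in route reuses the decision that was already computed. In this toy problem a solve is trivial. The difference matters when each solve is an expensive optimization.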
Authors
Garud Iyengar
Department of Industrial Engineering and Operations Research, Columbia University, New York, NY 10027
Henry Lam
Department of Industrial Engineering and Operations Research, Columbia University, New York, NY 10027
Tianyu Wang
Department of Industrial Engineering and Operations Research, Columbia University, New York, NY 10027