Estimate-Then-Optimize versus Integrated-Estimation-Optimization versus Sample Average Approximation: A Stochastic Dominance Perspective

📅 2023-04-13
📈 Citations: 6
Influential: 1
📄 PDF
🤖 AI Summary
This paper investigates the coupling between parameter estimation and decision optimization in data-driven stochastic optimization, focusing on how model specification affects the relative performance of three paradigms: estimate-then-optimize (ETO), integrated-estimation-optimization (IEO), and sample average approximation (SAA). Using stochastic dominance of the regret—a comparison of the entire regret distribution, not only its mean or other moments—the paper shows that when the model class is well-specified (i.e., it contains the true data-generating distribution), ETO asymptotically dominates IEO, and SAA performs worst; conversely, under misspecification the ordering reverses, with SAA performing best. The analysis combines asymptotic statistics and regret analysis, extends to constrained and contextual optimization problems where decisions depend on observed features, and is corroborated by finite-sample experiments across varying degrees of misspecification.
📝 Abstract
In data-driven stochastic optimization, model parameters of the underlying distribution need to be estimated from data in addition to the optimization task. Recent literature considers integrating the estimation and optimization processes by selecting model parameters that lead to the best empirical objective performance. This integrated approach, which we call integrated-estimation-optimization (IEO), can be readily shown to outperform simple estimate-then-optimize (ETO) when the model is misspecified. In this paper, we show that a reverse behavior appears when the model class is well-specified and there is sufficient data. Specifically, for a general class of nonlinear stochastic optimization problems, we show that simple ETO outperforms IEO asymptotically when the model class covers the ground truth, in the strong sense of stochastic dominance of the regret. Namely, the entire distribution of the regret, not only its mean or other moments, is always better for ETO compared to IEO. Our results also apply to constrained, contextual optimization problems where the decision depends on observed features. Whenever applicable, we also demonstrate how standard sample average approximation (SAA) performs the worst when the model class is well-specified in terms of regret, and best when it is misspecified. Finally, we provide experimental results to support our theoretical comparisons and illustrate when our insights hold in finite-sample regimes and under various degrees of misspecification.
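The three paradigms compared in the abstract can be illustrated on a toy newsvendor problem. The sketch below is an illustrative assumption, not the paper's code or experimental setup: demand is exponential, the postulated model class is exponential (so the model is well-specified), and each method produces an order quantity from the same data.

```python
import numpy as np

# Toy newsvendor: demand ~ Exponential(rate=lam_true),
# cost(q, d) = h*(q - d)^+ + b*(d - q)^+. The optimal order quantity is
# the p-quantile of demand, with critical ratio p = b / (b + h).
# All names and the setup here are illustrative, not from the paper.
rng = np.random.default_rng(0)
lam_true, h, b = 0.5, 1.0, 4.0
p = b / (b + h)                          # critical ratio (quantile level)
data = rng.exponential(1 / lam_true, size=500)

def cost(q, d):
    return h * np.maximum(q - d, 0.0) + b * np.maximum(d - q, 0.0)

# ETO: estimate the model first (MLE), then optimize under the fitted model.
lam_mle = 1.0 / data.mean()
q_eto = -np.log(1 - p) / lam_mle         # closed-form exponential quantile

# IEO: choose the model parameter whose *induced decision* minimizes the
# empirical objective (grid search over candidate rates).
grid = np.linspace(0.1, 2.0, 400)
q_grid = -np.log(1 - p) / grid           # decision induced by each rate
emp_cost = cost(q_grid[:, None], data[None, :]).mean(axis=1)
q_ieo = q_grid[np.argmin(emp_cost)]

# SAA: minimize the empirical objective directly over decisions,
# which here is the empirical p-quantile of the data.
q_saa = np.quantile(data, p)

q_true = -np.log(1 - p) / lam_true       # oracle decision, for reference
print(q_eto, q_ieo, q_saa, q_true)
```

Since the exponential model class is well-specified here, the paper's result suggests ETO's regret distribution is (asymptotically) the best of the three; rerunning with demand drawn from a distribution outside the model class would illustrate the reversal.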
Problem

Research questions and friction points this paper is trying to address.

How do ETO, IEO, and SAA compare in data-driven stochastic optimization performance?
Does ETO outperform IEO when the model class is well-specified?
Does SAA perform worst when the model is well-specified and best when it is misspecified?
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrated-estimation-optimization outperforms estimate-then-optimize when the model is misspecified
Estimate-then-optimize stochastically dominates IEO in regret when the model class is well-specified
Sample average approximation's ranking flips with specification: worst when well-specified, best when misspecified
Adam N. Elmachtoub
Columbia University, Dept. of Industrial Engineering and Operations Research
operations research, machine learning, pricing, logistics, optimization
H. Lam
Department of Industrial Engineering and Operations Research, Columbia University, New York, NY 10027
Haofeng Zhang
Department of Industrial Engineering and Operations Research, Columbia University, New York, NY 10027
Yunfan Zhao
Department of Industrial Engineering and Operations Research, Columbia University, New York, NY 10027