AI Summary
This paper challenges the unverified implicit assumption in the predict-then-optimize paradigm that "higher prediction accuracy necessarily yields better downstream decisions," particularly in multiclass classification settings. Method: We propose a controllable, interpretable multiclass prediction simulation framework that explicitly models error types and distributions, enabling systematic analysis of how classification errors affect decision quality in constrained optimization. Contribution/Results: Experiments on job scheduling and other combinatorial optimization tasks reveal a nonlinear relationship between prediction error and decision performance: improving prediction accuracy does not guarantee improved solution quality, and can even degrade decisions when error patterns shift. Our findings question the conventional coupling logic between prediction and optimization, providing theoretical foundations and practical guidance for designing, evaluating, and calibrating classifiers specifically tailored to decision objectives.
Abstract
Uncertainty in optimization is often represented as stochastic parameters in the optimization model. In Predict-Then-Optimize approaches, predictions of a machine learning model are used as values for such parameters, effectively transforming the stochastic optimization problem into a deterministic one. This two-stage framework is built on the assumption that more accurate predictions result in solutions that are closer to the actual optimal solution. However, providing evidence for this assumption in the context of complex, constrained optimization problems is challenging and often overlooked in the literature. Simulating predictions of machine learning models offers a way to (experimentally) analyze how prediction error impacts solution quality without the need to train real models. Complementing an algorithm from the literature for simulating binary classification, we introduce a new algorithm for simulating predictions of multiclass classifiers. We conduct a computational study to evaluate the performance of these algorithms, and show that classifier performance can be simulated with reasonable accuracy, although some variability is observed. Additionally, we apply these algorithms to assess the performance of a Predict-Then-Optimize algorithm for a machine scheduling problem. The experiments demonstrate that the relationship between prediction error and how close solutions are to the actual optimum is non-trivial, highlighting important considerations for the design and evaluation of decision-making systems based on machine learning predictions.
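To make the idea of simulating classifier predictions concrete, the sketch below shows one minimal way to generate synthetic multiclass predictions with a controlled error structure: predicted labels are drawn from a user-specified row-stochastic confusion matrix conditioned on the true label. This is an illustrative assumption, not the algorithm introduced in the paper; the function name `simulate_predictions` and the example confusion matrix are hypothetical.

```python
import numpy as np

def simulate_predictions(y_true, confusion, rng=None):
    """Sample a predicted class for each true label by drawing from the
    corresponding row of a row-stochastic confusion matrix, where
    confusion[i, j] = P(predicted class = j | true class = i)."""
    rng = np.random.default_rng(rng)
    confusion = np.asarray(confusion, dtype=float)
    # Each row must be a valid probability distribution over predicted classes.
    assert np.allclose(confusion.sum(axis=1), 1.0)
    num_classes = confusion.shape[1]
    return np.array([rng.choice(num_classes, p=confusion[c]) for c in y_true])

# Example: 3 classes; the diagonal mass controls the simulated accuracy,
# while the off-diagonal entries control which error types occur.
conf = np.array([[0.8, 0.1, 0.1],
                 [0.1, 0.8, 0.1],
                 [0.1, 0.1, 0.8]])
y_true = np.repeat([0, 1, 2], 10_000)
y_pred = simulate_predictions(y_true, conf, rng=42)
accuracy = (y_pred == y_true).mean()  # close to 0.8 by construction
```

Varying the off-diagonal structure while holding the diagonal fixed is what lets one study how different error patterns, at the same overall accuracy, affect downstream solution quality in the optimization stage.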