🤖 AI Summary
Many real-world decision-making problems exhibit multi-stage structures, temporal dependencies, and intertemporal effects—characteristics inadequately addressed by prevailing decision-focused learning methods, which predominantly assume single-stage settings. To bridge this gap, we propose the first decision-oriented predictive framework explicitly designed for multi-stage optimization: it jointly trains predictive models and dynamic decision policies in an end-to-end differentiable manner, explicitly encoding temporal decision dependencies via differentiable optimization. Furthermore, we introduce an implicit multi-layer recurrent architecture that captures feedback from state trajectories to predictions, and theoretically establish its gradient-based self-correcting mechanism for prediction bias. Evaluated on an energy storage arbitrage task, our method substantially outperforms both decoupled predict-then-optimize baselines and single-stage counterparts, achieving a 12.7% improvement in long-horizon cumulative reward.
📝 Abstract
Decision-focused learning has emerged as a promising approach for decision making under uncertainty by training the upstream predictive aspect of the pipeline with respect to the quality of the downstream decisions. Most existing work has focused on single stage problems. Many real-world decision problems are more appropriately modelled using multistage optimisation as contextual information such as prices or demand is revealed over time and decisions now have a bearing on future decisions. We propose decision-focused forecasting, a multiple-implicitlayer model which in its training accounts for the intertemporal decision effects of forecasts using differentiable optimisation. The recursive model reflects a fully differentiable multistage optimisation approach. We present an analysis of the gradients produced by this model showing the adjustments made to account for the state-path caused by forecasting. We demonstrate an application of the model to an energy storage arbitrage task and report that our model outperforms existing approaches.