🤖 AI Summary
This work proposes 3D-Learning, a diffusion-based distributionally robust decision-focused learning framework that addresses the severe degradation in downstream decision performance caused by out-of-distribution (OOD) samples in predict-then-optimize pipelines. By integrating diffusion models into distributionally robust optimization (DRO), the method exploits their generative capacity to construct worst-case distributions, within the diffusion model's parameterized space, that are both challenging and consistent with real data. This enables training predictive models that balance average-case and worst-case performance. The framework is fully end-to-end trainable and outperforms existing DRO and data augmentation approaches on an LLM resource provisioning task, achieving stronger OOD decision generalization.
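Under a standard reading of this setup (the notation below is assumed for illustration, not taken from the paper), the DR-DFL objective is a min-max problem: the outer minimization trains the predictor's weights, while the inner maximization searches over distributions realizable by the diffusion model within an ambiguity set around the data distribution:

```latex
\min_{\theta} \; \max_{\phi \in \Phi} \;
\mathbb{E}_{(u,\, c) \sim p_{\phi}}
\Big[ f\big(x^{*}(\hat{c}_{\theta}(u)),\; c\big) \Big],
\qquad
x^{*}(\hat{c}) \in \arg\min_{x \in \mathcal{X}} f(x, \hat{c}),
```

where $u$ is the observed context, $c$ the true cost parameter, $\hat{c}_{\theta}(u)$ the ML prediction, $x^{*}(\cdot)$ the downstream optimizer's decision, $f$ the decision cost, and $p_{\phi}$ the distribution induced by a diffusion model with parameters $\phi$ restricted to an ambiguity set $\Phi$ near the training data.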
📝 Abstract
Predict-then-Optimize (PTO) pipelines are widely employed in computing and networked systems, where Machine Learning (ML) models predict critical contextual information for downstream decision-making tasks such as cloud LLM serving, data center demand response, and edge workload scheduling. However, these ML predictors are often vulnerable to out-of-distribution (OOD) samples at test time, where large prediction errors lead to significant decision performance degradation. To address this generalization challenge, we present Distributionally Robust Decision-Focused Learning (DR-DFL), a framework that trains ML models to optimize decision performance under the worst-case distribution. Instead of relying on classical Distributionally Robust Optimization (DRO) techniques, we propose Diffusion-Augmented Distributionally Robust Decision-Focused Learning (3D-Learning), which searches for the worst-case distribution within the parameterized space of a diffusion model. By leveraging the powerful distribution-modeling capabilities of diffusion models, 3D-Learning identifies worst-case distributions that remain consistent with real data, achieving a favorable balance between average and worst-case scenarios. Empirical results on an LLM resource provisioning task demonstrate that 3D-Learning outperforms existing DRO and data augmentation methods in OOD generalization performance.
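To make the min-max structure concrete, here is a minimal NumPy sketch of a generic distributionally robust decision-focused training loop. Everything in it is an illustrative assumption: a linear predictor stands in for the ML model, a norm-bounded per-sample perturbation `delta` stands in for the diffusion-parameterized worst-case distribution, and the decision stage is the trivial case `x*(c_hat) = c_hat` with quadratic decision cost, so decision regret reduces to squared prediction error. It shows only the alternating inner-max/outer-min pattern, not the paper's method.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy predict-then-optimize data: context u -> true cost c = 2u + noise.
U = rng.normal(size=(64, 1))
C = 2.0 * U[:, 0] + 0.1 * rng.normal(size=64)

theta = np.zeros(1)   # linear predictor: c_hat = theta * u
delta = np.zeros(64)  # adversarial cost shift (stand-in for the
                      # diffusion-parameterized worst-case distribution)
rho, eta, steps = 0.5, 0.05, 300  # ambiguity radius, step size, iterations

for _ in range(steps):
    c_adv = C + delta              # worst-case costs inside the ambiguity set
    c_hat = U[:, 0] * theta[0]
    # With x*(c_hat) = c_hat and f(x, c) = (x - c)^2, the decision
    # loss is the squared prediction error against adversarial costs.
    grad_theta = np.mean(2.0 * (c_hat - c_adv) * U[:, 0])
    theta[0] -= eta * grad_theta   # outer min: improve the predictor
    grad_delta = 2.0 * (c_adv - c_hat)
    delta += eta * grad_delta      # inner max: worsen the distribution
    # Project the adversary back into a radius-rho ball (ambiguity set).
    norm = np.linalg.norm(delta)
    if norm > rho:
        delta *= rho / norm
```

Because the perturbation is norm-bounded, the predictor still recovers a slope close to the true value of 2 while hedging against the worst admissible shift; 3D-Learning replaces this crude additive adversary with a diffusion model, so the worst case stays on the data manifold.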