What Data Enables Optimal Decisions? An Exact Characterization for Linear Optimization

📅 2025-05-27

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work studies data sufficiency for linear optimization under cost-vector uncertainty: when does a dataset determine an optimal decision, and how small can such a dataset be? It gives a geometric sufficiency criterion for linear programs, grounded in convex geometry and duality theory, that identifies the cost directions that matter for optimality relative to the task constraints and the uncertainty set on the cost vector. It also develops a task-driven data selection algorithm that constructs a minimal or least-costly sufficient dataset. The results show that small, well-chosen datasets can often fully determine an optimal decision, which gives task-aware data acquisition a principled foundation, in contrast to conventional sufficiency analyses that rely on statistical assumptions or large-sample requirements.

📝 Abstract

We study the fundamental question of how informative a dataset is for solving a given decision-making task. In our setting, the dataset provides partial information about unknown parameters that influence task outcomes. Focusing on linear programs, we characterize when a dataset is sufficient to recover an optimal decision, given an uncertainty set on the cost vector. Our main contribution is a sharp geometric characterization that identifies the directions of the cost vector that matter for optimality, relative to the task constraints and uncertainty set. We further develop a practical algorithm that, for a given task, constructs a minimal or least-costly sufficient dataset. Our results reveal that small, well-chosen datasets can often fully determine optimal decisions -- offering a principled foundation for task-aware data selection.
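To make the notion of a sufficient dataset concrete, here is a minimal brute-force sketch in Python. It is not the paper's geometric criterion or its algorithm. It assumes a toy setup that does not come from the paper: the feasible polytope is given by its vertices, the uncertainty set is a finite grid of candidate cost vectors, and a dataset is a list of query directions `q` whose values `q · c` are observed. The names `optimal_vertices` and `is_sufficient` are hypothetical. Under these assumptions, a dataset counts as sufficient when all candidate costs that produce the same observations share at least one optimal vertex.

```python
from itertools import product


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def optimal_vertices(cost, vertices, tol=1e-9):
    """Indices of the vertices that minimise cost . x (ties included)."""
    values = [dot(cost, v) for v in vertices]
    best = min(values)
    return {i for i, val in enumerate(values) if val <= best + tol}


def is_sufficient(queries, candidate_costs, vertices):
    """Brute-force check over a finite candidate set (illustrative assumption).

    Costs are grouped by the observations they would produce; the dataset is
    sufficient if every group has a vertex that is optimal for all its costs.
    """
    common = {}
    for c in candidate_costs:
        key = tuple(round(dot(q, c), 9) for q in queries)
        opt = optimal_vertices(c, vertices)
        common[key] = common.get(key, opt) & opt
    return all(common.values())  # an empty set means no shared optimum


# Toy task: minimise c . x over the segment between two vertices in R^3.
vertices = [(1, 0, 1), (0, 1, 1)]
# Toy uncertainty set: every cost vector with entries in {0, 1, 2}.
costs = list(product([0, 1, 2], repeat=3))

print(is_sufficient([(1, -1, 0)], costs, vertices))  # True
print(is_sufficient([(0, 0, 1)], costs, vertices))   # False
```

In this toy task only the difference between the first two cost entries affects which vertex is optimal, so a single measurement along `(1, -1, 0)` is enough even though the cost vector has three unknown entries, while measuring the third entry alone reveals nothing useful.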

Problem

Research questions and friction points this paper is trying to address.

Characterizing dataset sufficiency for optimal linear decisions

Identifying critical cost vector directions for optimality

Developing algorithms for minimal sufficient dataset construction

Innovation

Methods, ideas, or system contributions that make the work stand out.

Geometric characterization of cost vector directions

Algorithm for constructing minimal sufficient datasets

Task-aware data selection for optimal decisions

🔎 Similar Papers

Optimizer's Information Criterion: Dissecting and Correcting Bias in Data-Driven Optimization

2023-06-16 · arXiv.org · Citations: 4

Unsupervised Machine Learning Hybrid Approach Integrating Linear Programming in Loss Function: A Robust Optimization Technique

2024-08-19 · arXiv.org · Citations: 0
