What Data Enables Optimal Decisions? An Exact Characterization for Linear Optimization

Date: 2025-05-27
Citations: 0
Influential: 0
AI Summary
This work addresses data sufficiency for linear optimization under cost vector uncertainty: identifying a minimal dataset that determines an optimal decision. Methodologically, it introduces the first geometric sufficiency criterion for linear programming, grounded in convex geometry and duality theory, which characterizes the cost directions that govern optimality. It also establishes a modeling framework for uncertainty sets and designs a task-driven data selection algorithm. Theoretically, it proves that a small, structured, minimal cost dataset exists that is sufficient to fully recover the optimal solution. The work provides rigorous guarantees and an efficient constructive procedure for task-aware data acquisition, overcoming key limitations of conventional sufficiency analyses, namely their reliance on statistical assumptions or large-sample requirements.

๐Ÿ“ Abstract
We study the fundamental question of how informative a dataset is for solving a given decision-making task. In our setting, the dataset provides partial information about unknown parameters that influence task outcomes. Focusing on linear programs, we characterize when a dataset is sufficient to recover an optimal decision, given an uncertainty set on the cost vector. Our main contribution is a sharp geometric characterization that identifies the directions of the cost vector that matter for optimality, relative to the task constraints and uncertainty set. We further develop a practical algorithm that, for a given task, constructs a minimal or least-costly sufficient dataset. Our results reveal that small, well-chosen datasets can often fully determine optimal decisions -- offering a principled foundation for task-aware data selection.

Problem

Research questions and friction points this paper is trying to address.

Characterizing dataset sufficiency for optimal linear decisions
Identifying critical cost vector directions for optimality
Developing algorithms for minimal sufficient dataset construction

Innovation

Methods, ideas, or system contributions that make the work stand out.

Geometric characterization of cost vector directions
Algorithm for constructing minimal sufficient datasets
Task-aware data selection for optimal decisions
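
To make the notion of dataset sufficiency concrete, here is a minimal brute-force sketch. It is not the paper's geometric criterion or its algorithm. It assumes, purely for illustration, that the dataset is a list of query directions q whose inner products with the unknown cost vector c are observed, that the uncertainty set is a small finite list of candidate cost vectors, and that the feasible region is given by its vertices. Under those assumptions, a dataset is treated as sufficient when every group of candidate costs that produces the same observations shares at least one common optimal vertex. The names `optimal_vertices` and `is_sufficient` are hypothetical.

```python
from itertools import product


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def optimal_vertices(cost, vertices, tol=1e-9):
    """Indices of the vertices that minimize cost . x (ties included)."""
    values = [dot(cost, v) for v in vertices]
    best = min(values)
    return frozenset(i for i, val in enumerate(values) if val <= best + tol)


def is_sufficient(queries, candidate_costs, vertices):
    """Brute-force sufficiency check (illustrative assumption, not the paper's test).

    Candidate costs are grouped by the observations q . c they generate.
    The dataset is sufficient if, within every group, some single vertex
    is optimal for all costs in that group.
    """
    common = {}
    for c in candidate_costs:
        key = tuple(round(dot(q, c), 9) for q in queries)
        opt = optimal_vertices(c, vertices)
        # Intersect with the optimal sets of earlier costs sharing this key.
        common[key] = common.get(key, opt) & opt
    # An empty frozenset is falsy, so this fails if any group has no common optimum.
    return all(common.values())


# Toy task: minimize c . x over the unit square.
vertices = [(0, 0), (1, 0), (0, 1), (1, 1)]
# Toy uncertainty set: c1 has unknown sign, c2 is always positive.
candidate_costs = list(product([-1, 1], [1, 2]))

sufficient_c1 = is_sufficient([(1, 0)], candidate_costs, vertices)
sufficient_c2 = is_sufficient([(0, 1)], candidate_costs, vertices)
sufficient_none = is_sufficient([], candidate_costs, vertices)
```

In this toy task only the sign of the first cost coordinate changes the optimal vertex, so observing the single direction (1, 0) is enough, while observing (0, 1) or nothing at all is not. This mirrors, on a very small scale, the claim that a small, well-chosen dataset can determine the optimal decision.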