🤖 AI Summary
This work addresses the limited robustness of traditional empirical risk minimization under distributional shift and its inability to adequately capture tail behavior of the loss distribution. The authors propose a novel stochastic set-valued optimization framework based on hyperbox sets, wherein each decision variable is mapped to a hyperbox and the problem is reformulated as a multi-objective optimization problem. A key innovation lies in jointly modeling the lower and upper tails of the loss distribution via subquantiles and superquantiles. The resulting formulation is solved with a stochastic multi-gradient algorithm coupled with a Pareto knee-point selection strategy. This approach significantly enhances model robustness and test-time stability under distributional shift while maintaining accuracy comparable to that of empirical risk minimization.
📝 Abstract
In this paper, we develop a stochastic set-valued optimization (SVO) framework tailored for robust machine learning. In the SVO setting, each decision variable is mapped to a set of objective values, and optimality is defined via set relations. We focus on SVO problems with hyperbox sets, which can be reformulated as multi-objective optimization (MOO) problems with finitely many objectives and serve as a foundation for representing or approximating more general mapped sets. Two special cases of hyperbox-valued optimization (HVO) are interval-valued (IVO) and rectangle-valued (RVO) optimization. We construct stochastic IVO/RVO formulations that incorporate subquantiles and superquantiles into the objective functions of the MOO reformulations, providing a new characterization of subquantiles. These formulations provide interpretable trade-offs by capturing both lower- and upper-tail behaviors of loss distributions, thereby going beyond standard empirical risk minimization and classical robust models. To solve the resulting multi-objective problems, we adopt stochastic multi-gradient algorithms and select a Pareto knee solution. In numerical experiments, the proposed algorithms with this selection strategy exhibit improved robustness and reduced variability across test replications under distributional shift compared with empirical risk minimization, while maintaining competitive accuracy.
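To make the building blocks concrete, the sketch below shows simple empirical versions of the two tail functionals and a common knee-selection heuristic for a two-objective Pareto front. This is not the paper's algorithm: the quantile interpolation, the `>=`/`<=` tail conventions, and the perpendicular-distance knee rule are illustrative assumptions, and the stochastic multi-gradient solver itself is omitted.

```python
import numpy as np


def superquantile(losses, alpha):
    """Empirical superquantile (a.k.a. CVaR_alpha): mean of losses at or
    above the empirical alpha-quantile (upper-tail average)."""
    q = np.quantile(losses, alpha)
    return losses[losses >= q].mean()


def subquantile(losses, alpha):
    """Empirical subquantile: mean of losses at or below the empirical
    alpha-quantile (lower-tail average), the mirror of the superquantile."""
    q = np.quantile(losses, alpha)
    return losses[losses <= q].mean()


def knee_point(front):
    """Heuristic knee of a 2-objective Pareto front: the point with the
    largest perpendicular distance from the line joining the two extreme
    points of the front (one common knee definition, assumed here)."""
    front = front[np.argsort(front[:, 0])]  # sort by the first objective
    a, b = front[0], front[-1]              # extreme points of the front
    d = (b - a) / np.linalg.norm(b - a)     # unit direction of the chord
    rel = front - a
    # 2-D cross product gives signed perpendicular distance to the chord
    dist = np.abs(rel[:, 0] * d[1] - rel[:, 1] * d[0])
    return front[np.argmax(dist)]
```

For example, on losses `1, 2, ..., 10` the 0.9-superquantile averages the top of the distribution while the 0.1-subquantile averages the bottom, and on the front `{(0, 1), (0.1, 0.5), (1, 0)}` the knee rule picks the interior point `(0.1, 0.5)`, which bulges furthest from the chord between the extremes.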