🤖 AI Summary
This paper addresses the minimization of the volume (length) of prediction intervals in split conformal regression, aiming to approximate the theoretically shortest interval rather than merely to attain nominal coverage. We propose two methods, EffOrt and Ad-EffOrt. EffOrt modifies the learning step so that the base predictor is chosen to minimize interval length, and its analysis establishes a theoretical link between the learning and calibration steps. Ad-EffOrt extends it with intervals whose length adapts to the covariate. Our approach combines empirical volume minimization, an analysis of the choice of function class, generalization bounds, and adaptive quantile regression. We derive finite-sample upper bounds on the excess volume loss. Experiments demonstrate that our methods significantly reduce prediction interval length while maintaining nominal coverage, and that they are robust to model misspecification and distribution shift.
📝 Abstract
We study the question of volume optimality in split conformal regression, a topic still poorly understood in comparison to coverage control. Using the fact that the calibration step can be seen as an empirical volume minimization problem, we first derive a finite-sample upper bound on the excess volume loss of the interval returned by the classical split method. This quantity measures the difference in length between the interval obtained with the split method and the shortest oracle prediction interval. We then introduce EffOrt, a methodology that modifies the learning step so that the base prediction function is selected to minimize the length of the returned intervals. Our theoretical analysis of the excess volume loss of the prediction sets produced by EffOrt reveals the links between the learning and calibration steps, and notably the impact of the choice of the function class of the base predictor. We also introduce Ad-EffOrt, an extension of EffOrt that produces intervals whose size adapts to the value of the covariate. Finally, we evaluate the empirical performance and the robustness of our methodologies.
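
To make the view of calibration as volume minimization concrete, the sketch below implements the classical split method, where the calibrated radius is the smallest value that covers enough calibration residuals. It then compares a least-squares base predictor with a toy length-aware learning step. This is an illustration under stated assumptions, not the authors' EffOrt algorithm: the synthetic data, the linear function class, the intercept grid, and the names `conformal_radius`, `split_interval`, `ls_predict`, and `short_predict` are all hypothetical.

```python
import math
import numpy as np


def conformal_radius(abs_residuals, alpha):
    """Smallest t such that at least ceil((n + 1) * (1 - alpha)) residuals are <= t."""
    r = np.sort(np.asarray(abs_residuals, dtype=float))
    n = r.size
    k = math.ceil((n + 1) * (1 - alpha))
    # With too few calibration points, no finite radius gives the guarantee.
    return float(r[k - 1]) if k <= n else math.inf


def split_interval(predict, X_cal, y_cal, x_new, alpha):
    """Classical split conformal interval: predict(x_new) +/- calibrated radius."""
    t = conformal_radius(np.abs(y_cal - predict(X_cal)), alpha)
    center = predict(x_new)
    return center - t, center + t


# Synthetic data with skewed noise (an assumption made for illustration only).
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=2000)
y = 2.0 * X + rng.exponential(1.0, size=2000)
X_learn, y_learn = X[:1000], y[:1000]
X_cal, y_cal = X[1000:], y[1000:]
alpha = 0.1

# Baseline learning step: ordinary least squares on the learning split.
slope, intercept = np.polyfit(X_learn, y_learn, deg=1)

# Toy length-aware learning step: keep the slope and pick, from a small grid
# that contains the least-squares intercept, the intercept with the smallest
# radius on the learning split.
candidates = np.append(intercept + np.linspace(-1.0, 1.0, 41), intercept)
learn_radius = [
    conformal_radius(np.abs(y_learn - (slope * X_learn + b)), alpha)
    for b in candidates
]
best_intercept = float(candidates[int(np.argmin(learn_radius))])


def ls_predict(x):
    return slope * x + intercept


def short_predict(x):
    return slope * x + best_intercept


# Both predictors are calibrated on the same held-out split.
for name, predict in [("least squares", ls_predict), ("length-aware", short_predict)]:
    lo, hi = split_interval(predict, X_cal, y_cal, 0.5, alpha)
    print(f"{name}: [{lo:.2f}, {hi:.2f}], length {hi - lo:.2f}")
```

Because the grid contains the least-squares intercept, the selected predictor is never worse than the baseline on the learning split. Whether that gain carries over to the calibration split is the kind of question the excess volume loss bounds address.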