Pointwise confidence estimation in the non-linear $\ell^2$-regularized least squares

📅 2025-06-08
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses pointwise confidence estimation for non-linear $\ell^2$-regularized least squares in the finite-sample regime, aiming to provide high-probability, non-asymptotic confidence intervals for the prediction at any fixed test input $x$, with interval width adaptively reflecting the similarity of $x$ to the training data in the implicit feature space (e.g., widening automatically when $x$ lies far from the training set). Methodologically, the paper derives the first non-asymptotic pointwise confidence bounds for local minimizers, introducing an inverse-Hessian-weighted norm to characterize generalization uncertainty, generalizing classical covariance structures from linear statistics. By integrating implicit feature-space analysis with an efficient inverse-Hessian approximation, the method achieves computational cost only slightly higher than a single gradient evaluation. Empirically, the approach outperforms bootstrap-based methods in the trade-off between coverage accuracy and interval width.

📝 Abstract
We consider high-probability non-asymptotic confidence estimation in the $\ell^2$-regularized non-linear least-squares setting with fixed design. In particular, we study confidence estimation for local minimizers of the regularized training loss. We show a pointwise confidence bound, meaning that it holds for the prediction on any given fixed test input $x$. Importantly, the proposed confidence bound scales with the similarity of the test input to the training data in the implicit feature space of the predictor (for instance, becoming very large when the test input lies far outside of the training data). This desirable feature is captured by a weighted norm involving the inverse-Hessian matrix of the objective function, which is a generalized version of its counterpart in the linear setting, $x^{\top} \mathrm{Cov}^{-1} x$. Our generalized result can be regarded as a non-asymptotic counterpart of the classical confidence interval based on the asymptotic normality of the MLE. We propose an efficient method for computing the weighted norm, whose cost only mildly exceeds that of a gradient computation of the loss function. Finally, we complement our analysis with empirical evidence showing that the proposed confidence bound provides a better coverage/width trade-off than confidence estimation by bootstrapping, a gold-standard method in many applications involving non-linear predictors such as neural networks.
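
A hedged reading of the weighted norm described above (our paraphrase from the abstract, not a formula quoted from the paper): writing $\hat{\theta}$ for a local minimizer of the regularized training loss $L(\theta) = \sum_{i=1}^{n} (f_\theta(x_i) - y_i)^2 + \lambda \|\theta\|_2^2$ and $H = \nabla^2_\theta L(\hat{\theta})$ for its Hessian, the quantity in question plausibly takes the form

$$q(x) = \nabla_\theta f_{\hat{\theta}}(x)^{\top} \, H^{-1} \, \nabla_\theta f_{\hat{\theta}}(x),$$

with the pointwise interval around the prediction $f_{\hat{\theta}}(x)$ of width proportional to $\sqrt{q(x)}$. For a linear predictor $f_\theta(x) = \theta^{\top} x$, the gradient $\nabla_\theta f_\theta(x)$ is $x$ itself and $H$ reduces to a regularized empirical covariance, recovering the classical $x^{\top} \mathrm{Cov}^{-1} x$.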
Problem

Research questions and friction points this paper is trying to address.

Estimating confidence bounds for non-linear least squares minimizers
Scaling confidence with test input similarity to training data
Efficient computation of weighted norms for confidence estimation (see the sketch after this list)
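
The last point could plausibly be realized with Hessian-vector products and conjugate gradient, so the Hessian is never formed explicitly. Below is a minimal sketch under that assumption, written with JAX; `model`, `reg_loss`, and `weighted_norm_sq` are hypothetical names for illustration, not the paper's code.

```python
import jax
import jax.numpy as jnp

def model(theta, x):
    # Toy non-linear predictor used purely for illustration:
    # a single tanh unit, f_theta(x) = w2 * tanh(w1 . x).
    d = x.shape[0]
    w1, w2 = theta[:d], theta[d]
    return w2 * jnp.tanh(jnp.dot(w1, x))

def reg_loss(theta, X, y, lam):
    # l2-regularized squared loss over the training set.
    preds = jax.vmap(lambda xi: model(theta, xi))(X)
    return jnp.sum((preds - y) ** 2) + lam * jnp.sum(theta ** 2)

def hvp(theta, v, X, y, lam):
    # Hessian-vector product via forward-over-reverse autodiff:
    # roughly the cost of one extra gradient evaluation.
    grad_fn = jax.grad(lambda t: reg_loss(t, X, y, lam))
    return jax.jvp(grad_fn, (theta,), (v,))[1]

def weighted_norm_sq(theta_hat, x, X, y, lam, cg_iters=50):
    # q(x) = g^T H^{-1} g, with g the gradient of the prediction
    # at x with respect to the parameters.
    g = jax.grad(lambda t: model(t, x))(theta_hat)
    # At a local minimizer of the regularized loss, H is positive definite
    # (PSD loss Hessian plus 2*lam*I), so conjugate gradient applies; it
    # touches H only through Hessian-vector products.
    u, _ = jax.scipy.sparse.linalg.cg(
        lambda v: hvp(theta_hat, v, X, y, lam), g, maxiter=cg_iters)
    return jnp.dot(g, u)
```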
Innovation

Methods, ideas, or system contributions that make the work stand out.

Non-linear $\ell^2$-regularized least squares confidence estimation
Pointwise confidence bound using inverse-Hessian weighted norm
Efficient computation method for confidence bounds (usage sketch below)
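
Continuing the hypothetical sketch above, a pointwise interval would combine the prediction with the weighted norm; `conf_scale` below is a placeholder for the paper's high-probability multiplier, which is not reproduced here.

```python
# Hypothetical usage of the sketch above (toy data; conf_scale is a
# placeholder, not the paper's constant; theta_hat stands in for a
# trained local minimizer).
key = jax.random.PRNGKey(0)
X_train = jax.random.normal(key, (32, 4))
y_train = jnp.sin(X_train[:, 0])
theta_hat = jnp.zeros(5).at[4].set(1.0)
x_test, lam, conf_scale = jnp.ones(4), 1e-2, 1.0

q = weighted_norm_sq(theta_hat, x_test, X_train, y_train, lam)
pred = model(theta_hat, x_test)
interval = (pred - conf_scale * jnp.sqrt(q), pred + conf_scale * jnp.sqrt(q))
```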
Ilja Kuzborskij
Google DeepMind
Machine Learning · Learning Theory

Yasin Abbasi-Yadkori
Sapient Intelligence