🤖 AI Summary
This work addresses the scalability bottleneck of the linearized Laplace approximation (LLA) in large-scale pretrained models, which arises from the explicit computation of prohibitively large Jacobian matrices. To overcome this limitation, the authors propose an efficient alternative based on a surrogate neural network that learns compact feature representations whose inner products approximate the neural tangent kernel (NTK). Because training relies solely on efficient Jacobian-vector products, the method never constructs the full Jacobian matrix. A further contribution is that the learned surrogate kernel can be deliberately biased away from the exact NTK: this not only preserves the computational gains but also enhances out-of-distribution detection, moving beyond traditional LLA's reliance on a fixed kernel. Experimental results demonstrate that the proposed method scales to large models and achieves superior out-of-distribution detection while maintaining or improving uncertainty calibration.
📝 Abstract
We introduce a scalable method to approximate the kernel of the Linearized Laplace Approximation (LLA). For this, we use a surrogate deep neural network (DNN) that learns a compact feature representation whose inner product replicates the Neural Tangent Kernel (NTK). This avoids the need to compute large Jacobians. Training relies solely on efficient Jacobian-vector products, allowing predictive uncertainty to be computed on large-scale pre-trained DNNs. Experimental results show similar or improved uncertainty estimation and calibration compared to existing LLA approximations. Moreover, biasing the learned kernel significantly enhances out-of-distribution detection. This highlights the benefits of the proposed method for finding better kernels than the NTK in the context of LLA to compute prediction uncertainty given a pre-trained DNN.
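To make the central idea concrete, the sketch below shows how NTK entries can be estimated with only Jacobian-vector products, never forming the Jacobian: for a random vector v with E[vvᵀ] = I, we have E[(J(x)v)(J(x′)v)] = J(x)J(x′)ᵀ = NTK(x, x′). This is a minimal illustration of the kernel the surrogate network would be trained to replicate, not the paper's actual method; the tiny MLP, its parameter shapes, and the sample count are all hypothetical choices made for the example.

```python
import jax
import jax.numpy as jnp

# Hypothetical tiny scalar-output MLP standing in for the pre-trained DNN
# f(x; theta); the paper targets much larger pre-trained models.
def mlp(params, x):
    w1, b1, w2, b2 = params
    h = jnp.tanh(x @ w1 + b1)
    return (h @ w2 + b2).squeeze()

def random_like(params, key):
    # Sample v ~ N(0, I) with the same pytree structure as params.
    leaves, treedef = jax.tree_util.tree_flatten(params)
    keys = jax.random.split(key, len(leaves))
    noise = [jax.random.normal(k, p.shape) for k, p in zip(keys, leaves)]
    return jax.tree_util.tree_unflatten(treedef, noise)

def ntk_estimate(params, x1, x2, key, num_samples=4096):
    # Unbiased Monte Carlo estimate of NTK(x1, x2) = J(x1) J(x2)^T using
    # only JVPs: E_v[(J(x1) v)(J(x2) v)] = J(x1) J(x2)^T when E[v v^T] = I.
    def one_sample(k):
        v = random_like(params, k)
        jv1 = jax.jvp(lambda p: mlp(p, x1), (params,), (v,))[1]
        jv2 = jax.jvp(lambda p: mlp(p, x2), (params,), (v,))[1]
        return jv1 * jv2
    return jnp.mean(jax.vmap(one_sample)(jax.random.split(key, num_samples)))

key = jax.random.PRNGKey(0)
k1, k2, k3 = jax.random.split(key, 3)
params = (
    0.3 * jax.random.normal(k1, (3, 8)),
    jnp.zeros(8),
    0.3 * jax.random.normal(k2, (8, 1)),
    jnp.zeros(1),
)
x = jnp.array([0.5, -1.0, 2.0])

est = float(ntk_estimate(params, x, x, k3))
# Sanity check: for a scalar output, the exact diagonal NTK entry is
# ||grad_theta f(x)||^2, computable here because the toy model is small.
g = jax.grad(mlp)(params, x)
exact = float(sum(jnp.sum(leaf ** 2) for leaf in jax.tree_util.tree_leaves(g)))
print(est, exact)
```

In the paper's setting, the exact diagonal check above is infeasible at scale; instead a surrogate network g is trained so that g(x)·g(x′) matches such JVP-based kernel evaluations, which is where the freedom to bias the kernel comes from.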