Amortized Spectral Kernel Discovery via Prior-Data Fitted Network

📅 2026-01-29
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work addresses a limitation of Prior-Data Fitted Networks (PFNs), namely their lack of explicit access to learned priors and kernel functions, which hinders their applicability in downstream tasks requiring well-defined covariance structures. Through a mechanistic analysis, the authors uncover an intrinsic connection between attention outputs in PFNs and spectral structure, enabling the design of a novel decoder architecture that, for the first time, efficiently and interpretably extracts explicit spectral densities and stationary kernel functions from pretrained PFNs. By integrating disentangled attention, Bochner's theorem, and spectral density estimation, the method achieves high-fidelity Gaussian process modeling in a single forward pass, accurately recovering complex multimodal spectral structures. The resulting kernels match the regression performance of both PFNs and optimized baselines while offering inference speeds several orders of magnitude faster.
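Bochner's theorem states that any stationary kernel is the Fourier transform of a non-negative spectral density, which is the bridge the decoder exploits. A minimal sketch of that density-to-kernel step, assuming a symmetrized Gaussian-mixture spectral density (an illustration of the general mechanism, not the paper's decoder):

```python
import numpy as np

def spectral_mixture_kernel(tau, weights, means, stds):
    """Stationary kernel from a symmetrized Gaussian-mixture spectral density
    via Bochner's theorem, k(tau) = ∫ exp(2*pi*i*s*tau) S(s) ds.
    Each component (weight w, mean mu, std sigma) contributes
    w * exp(-2 * pi**2 * tau**2 * sigma**2) * cos(2 * pi * mu * tau)."""
    tau = np.atleast_1d(np.asarray(tau, dtype=float))[:, None]
    w = np.asarray(weights, dtype=float)
    mu = np.asarray(means, dtype=float)
    sig = np.asarray(stds, dtype=float)
    # Broadcast over (n_tau, n_components), then sum over components.
    return np.sum(w * np.exp(-2.0 * np.pi**2 * tau**2 * sig**2)
                  * np.cos(2.0 * np.pi * mu * tau), axis=-1)

# Two spectral peaks (frequencies 0.5 and 2.0) yield a quasi-periodic kernel;
# this is how a multimodal density maps to a complex covariance structure.
k = spectral_mixture_kernel([0.0, 0.25, 0.5],
                            weights=[0.7, 0.3], means=[0.5, 2.0], stds=[0.05, 0.1])
```

Note that k(0) equals the total spectral mass (here 1.0), consistent with the kernel's marginal variance.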

📝 Abstract
Prior-Data Fitted Networks (PFNs) enable efficient amortized inference but lack transparent access to their learned priors and kernels. This opacity hinders their use in downstream tasks, such as surrogate-based optimization, that require explicit covariance models. We introduce an interpretability-driven framework for amortized spectral discovery from pre-trained PFNs with decoupled attention. We perform a mechanistic analysis of a trained PFN that identifies the attention latent output as the key intermediary linking observed function data to spectral structure. Building on this insight, we propose decoder architectures that map PFN latents to explicit spectral density estimates and corresponding stationary kernels via Bochner's theorem. We study this pipeline in both single-realization and multi-realization regimes, contextualizing theoretical limits on spectral identifiability and proving consistency when multiple function samples are available. Empirically, the proposed decoders recover complex multi-peak spectral mixtures and produce explicit kernels that support Gaussian process regression with accuracy comparable to PFNs and optimization-based baselines, while requiring only a single forward pass. This yields orders-of-magnitude reductions in inference time compared to optimization-based baselines.
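The regression claim rests on the fact that an explicit stationary kernel plugs directly into standard Gaussian process prediction. A minimal sketch of the standard Cholesky-based GP posterior mean with a generic kernel (textbook GP machinery, not the paper's implementation; the RBF kernel and lengthscale below are illustrative choices):

```python
import numpy as np

def gp_posterior_mean(X, y, X_star, kernel, noise_var=1e-4):
    """Posterior mean of a zero-mean GP with a stationary kernel k(tau),
    computed via the usual Cholesky solve of (K + noise_var * I)."""
    K = kernel(X[:, None] - X[None, :]) + noise_var * np.eye(len(X))
    K_star = kernel(X_star[:, None] - X[None, :])
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return K_star @ alpha

# RBF kernel with lengthscale 0.1 (itself the Bochner transform of a
# single zero-mean Gaussian spectral density).
rbf = lambda tau: np.exp(-0.5 * (tau / 0.1) ** 2)
X = np.linspace(0.0, 1.0, 8)
y = np.sin(2.0 * np.pi * X)
mean = gp_posterior_mean(X, y, X, rbf, noise_var=1e-6)
```

With near-zero noise the posterior mean interpolates the training targets, so any kernel recovered by the decoder can be evaluated in this pipeline without further optimization.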
Problem

Research questions and friction points this paper is trying to address.

Prior-Data Fitted Networks
amortized inference
spectral kernel discovery
interpretable priors
explicit covariance models
Innovation

Methods, ideas, or system contributions that make the work stand out.

amortized inference
spectral kernel discovery
Prior-Data Fitted Networks
Bochner's theorem
interpretable deep learning