Bayesian Neural Scaling Laws Extrapolation with Prior-Fitted Networks

📅 2025-05-29
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Neural scaling laws lack uncertainty quantification, hindering risk-aware decision-making during extrapolation. Method: We propose the first Bayesian meta-learning framework based on Prior-data Fitted Networks (PFNs) for scaling law modeling. Our approach innovatively leverages PFNs to synthesize principled, sampleable prior distributions over scaling functions—enabling probabilistic, physics-informed extrapolation. Contribution/Results: Compared to conventional point estimates and existing Bayesian methods, our framework achieves significantly improved extrapolation accuracy and predictive calibration on real-world scaling data. It attains state-of-the-art performance in low-data regimes—particularly in Bayesian active learning—demonstrating robustness where data is scarce. By providing well-calibrated uncertainty estimates, our method establishes a new paradigm for trustworthy deployment of scaling laws in safety-critical applications.

Technology Category

Application Category

📝 Abstract
Scaling has been a major driver of recent advancements in deep learning. Numerous empirical studies have found that scaling laws often follow the power-law and proposed several variants of power-law functions to predict the scaling behavior at larger scales. However, existing methods mostly rely on point estimation and do not quantify uncertainty, which is crucial for real-world applications involving decision-making problems such as determining the expected performance improvements achievable by investing additional computational resources. In this work, we explore a Bayesian framework based on Prior-data Fitted Networks (PFNs) for neural scaling law extrapolation. Specifically, we design a prior distribution that enables the sampling of infinitely many synthetic functions resembling real-world neural scaling laws, allowing our PFN to meta-learn the extrapolation. We validate the effectiveness of our approach on real-world neural scaling laws, comparing it against both the existing point estimation methods and Bayesian approaches. Our method demonstrates superior performance, particularly in data-limited scenarios such as Bayesian active learning, underscoring its potential for reliable, uncertainty-aware extrapolation in practical applications.
Problem

Research questions and friction points this paper is trying to address.

Predict neural scaling laws with uncertainty quantification
Meta-learn extrapolation using prior-fitted synthetic functions
Improve performance in data-limited scenarios like active learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Bayesian framework with Prior-Fitted Networks
Prior distribution for synthetic scaling laws
Meta-learns extrapolation with uncertainty quantification
🔎 Similar Papers
No similar papers found.