Adjusting Model Size in Continual Gaussian Processes: How Big is Big Enough?

📅 2024-08-14
📈 Citations: 0
Influential: 0
🤖 AI Summary
In continual learning, the optimal model capacity (e.g., the number of inducing points) for Gaussian processes is unknown a priori, and conventional approaches rely on dataset-specific statistics or manual tuning. Method: We propose an adaptive capacity control method that requires no prior data statistics and depends on a single robust hyperparameter. It dynamically adds or removes inducing points online, guided by uncertainty-driven expansion with a data-agnostic hyperparameter design. Contribution/Results: The method scales the model automatically in real time as data streams arrive. On diverse continual learning benchmarks, it matches the predictive accuracy of fixed large-scale models while significantly reducing computational cost. Crucially, the same hyperparameter setting works across datasets, combining near-optimal performance, computational efficiency, and cross-dataset robustness.
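The uncertainty-driven expansion described above can be sketched as a toy streaming sparse GP. This is an illustrative assumption, not the authors' actual algorithm: the class name, the simple "grow when predictive variance exceeds tau" rule, and the choice of an RBF kernel are all hypothetical, chosen only to show how a single threshold hyperparameter can govern capacity online.

```python
import numpy as np

def rbf(A, B, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel matrix between row sets A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

class StreamingSparseGP:
    """Toy sparse GP whose inducing set grows only when the model is
    uncertain at an incoming point (illustrative sketch, not the paper's
    method). `tau` plays the role of the single capacity hyperparameter."""

    def __init__(self, tau=0.1, noise=0.05):
        self.tau = tau                # single capacity-control hyperparameter
        self.noise = noise
        self.Z = np.empty((0, 1))     # inducing inputs, grown online
        self.y = np.empty((0,))       # targets stored at inducing inputs

    def predict_var(self, x):
        """Approximate posterior variance at x (Nystrom-style)."""
        if len(self.Z) == 0:
            return 1.0                # prior variance: maximally uncertain
        Kzz = rbf(self.Z, self.Z) + self.noise * np.eye(len(self.Z))
        kxz = rbf(x[None, :], self.Z)
        return float(1.0 - kxz @ np.linalg.solve(Kzz, kxz.T))

    def observe(self, x, y):
        """Admit (x, y) as a new inducing point only if predictive
        uncertainty there exceeds tau; otherwise capacity stays fixed."""
        if self.predict_var(x) > self.tau:
            self.Z = np.vstack([self.Z, x[None, :]])
            self.y = np.append(self.y, y)
            return True
        return False
```

Streaming points through `observe` makes the inducing set grow in regions the model has not yet covered and stop growing once predictive variance falls below `tau` everywhere the data visits, which is the intuition behind capacity that adapts to the stream rather than being fixed in advance.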

📝 Abstract
Many machine learning models require setting a parameter that controls their size before training, e.g. the number of neurons in DNNs, or inducing points in GPs. Increasing capacity typically improves performance until all the information from the dataset is captured. After this point, computational cost keeps increasing without improved performance. This leads to the question "How big is big enough?" We investigate this problem for Gaussian processes (single-layer neural networks) in continual learning. Here, data becomes available incrementally, and the final dataset size will therefore not be known before training, preventing the use of heuristics for setting a fixed model size. We develop a method to automatically adjust model size while maintaining near-optimal performance. Our experimental procedure follows the constraint that any hyperparameters must be set without seeing dataset properties. For our method, a single hyperparameter setting works well across diverse datasets, showing that it requires less tuning compared to others.
Problem

Research questions and friction points this paper is trying to address.

Determining optimal model size in continual Gaussian processes
Automatically adjusting model size without prior dataset knowledge
Maintaining performance while minimizing computational cost
Innovation

Methods, ideas, or system contributions that make the work stand out.

Automatically adjusts Gaussian process model size
Maintains near-optimal performance incrementally
Uses a single hyperparameter setting across diverse datasets, reducing tuning effort