🤖 AI Summary
Deep learning models often produce overconfident yet unreliable predictions on out-of-support (OoS) samples. This work proposes WeightCaster, a novel framework that reframes OoS generalisation as a sequence prediction problem in weight space. By partitioning the training data into concentric shells treated as discrete time steps, WeightCaster enables interpretable, uncertainty-aware predictions without requiring explicit inductive biases. The approach balances computational efficiency and reliability, matching or surpassing state-of-the-art methods on both a synthetic cosine dataset and real-world air quality sensor data. Consequently, it significantly enhances model robustness and trustworthiness in out-of-support scenarios.
📝 Abstract
As breakthroughs in deep learning transform key industries, models are increasingly required to extrapolate to data points outside the range of the training set, a challenge we term out-of-support (OoS) generalisation. However, neural networks frequently exhibit catastrophic failure on OoS samples, yielding unrealistic but overconfident predictions. We address this challenge by reformulating the OoS generalisation problem as a sequence modelling task in the weight space, wherein the training set is partitioned into concentric shells corresponding to discrete sequential steps. Our WeightCaster framework yields plausible, interpretable, and uncertainty-aware predictions without necessitating explicit inductive biases, all the while maintaining high computational efficiency. Empirical validation on a synthetic cosine dataset and real-world air quality sensor readings demonstrates performance competitive with or superior to the state-of-the-art. By enhancing reliability beyond in-distribution scenarios, these results hold significant implications for the wider adoption of artificial intelligence in safety-critical applications.
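To make the shell-partitioning idea concrete, the following is a minimal sketch of one plausible reading: split the training set into concentric shells around its centroid, with each shell serving as one "time step" for a subsequent weight-space sequence model. The function name `partition_into_shells`, the quantile-based binning, and the choice of centroid distance are all illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def partition_into_shells(X, n_shells=4):
    """Assign each sample to one of n_shells concentric shells by its
    distance from the data centroid, using equal-count quantile bins.
    Hypothetical sketch; the paper's actual partitioning rule may differ."""
    centroid = X.mean(axis=0)
    radii = np.linalg.norm(X - centroid, axis=1)
    # Quantile edges give shells with roughly equal sample counts.
    edges = np.quantile(radii, np.linspace(0.0, 1.0, n_shells + 1))
    # searchsorted bins each radius; clip so the maximum-radius point
    # falls in the last shell rather than past it.
    labels = np.clip(np.searchsorted(edges, radii, side="right") - 1,
                     0, n_shells - 1)
    return [X[labels == k] for k in range(n_shells)]

# Toy usage: 1000 Gaussian points in 2-D split into 4 shells.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 2))
shells = partition_into_shells(X, n_shells=4)
```

A model trained (or whose weights are recorded) shell by shell, from innermost to outermost, would then yield a weight trajectory that a sequence model can extrapolate one step further, i.e. beyond the support of the training data.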