🤖 AI Summary
Problem: Existing deep learning models lack reliable quantification of epistemic uncertainty, i.e., an awareness of when they do not know, in safety-critical applications such as autonomous driving.
Method: We propose a novel Random-Set Neural Network (RS-NN) that brings random set theory into deep neural networks: instead of a conventional probability vector, the network predicts a belief function over the class list, i.e., a distribution over sets of classes. Epistemic uncertainty arising from limited or unrepresentative training data is modeled explicitly by the size of the convex set of probability vectors (the credal set) associated with the predicted belief function. The approach scales to mainstream backbones, including WideResNet, VGG, Inception, EfficientNet, and ViT, and can be combined with conformal prediction to obtain statistical guarantees.
Results: On CIFAR-10, MNIST, and ImageNet benchmarks, our method outperforms state-of-the-art Bayesian and ensemble baselines in accuracy, uncertainty estimation, and out-of-distribution detection. It is also robust to adversarial attacks and can provide statistical guarantees in a conformal learning setting.
📝 Abstract
Machine learning is increasingly deployed in safety-critical domains where erroneous predictions may have catastrophic consequences, highlighting the need for learning systems to be aware of how confident they are in their own predictions: in other words, 'to know when they do not know'. In this paper, we propose a novel Random-Set Neural Network (RS-NN) approach to classification which predicts belief functions (rather than classical probability vectors) over the class list using the mathematics of random sets, i.e., distributions over the collection of sets of classes. RS-NN encodes the 'epistemic' uncertainty induced by training sets that are insufficiently representative or limited in size via the size of the convex set of probability vectors associated with a predicted belief function. Our approach outperforms state-of-the-art Bayesian and ensemble methods in terms of accuracy, uncertainty estimation and out-of-distribution (OoD) detection on multiple benchmarks (CIFAR-10 vs SVHN/Intel-Image, MNIST vs FMNIST/KMNIST, ImageNet vs ImageNet-O). RS-NN also scales up effectively to large-scale architectures (e.g., WideResNet-28-10, VGG16, Inception V3, EfficientNetB2 and ViT-Base-16), exhibits remarkable robustness to adversarial attacks and can provide statistical guarantees in a conformal learning setting.
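To make the quantities named in the abstract concrete, the sketch below uses the standard belief-function definitions: a mass assignment over sets of classes, the belief and plausibility it induces, and the resulting lower and upper probability of each class. It is an illustration only, not the RS-NN implementation. The mass values are hypothetical, and `credal_width` (the largest gap between upper and lower class probability) is an assumed, simple proxy for the size of the convex set of probability vectors; the paper may measure that size differently.

```python
def belief(mass, subset):
    """Bel(A): total mass of the focal sets fully contained in A."""
    return sum(m for focal, m in mass.items() if focal <= subset)


def plausibility(mass, subset):
    """Pl(A): total mass of the focal sets that intersect A."""
    return sum(m for focal, m in mass.items() if focal & subset)


def credal_bounds(mass, classes):
    """Lower and upper probability of each single class.

    Every probability vector consistent with the belief function
    assigns class c a probability between Bel({c}) and Pl({c}).
    """
    return {
        c: (belief(mass, frozenset([c])), plausibility(mass, frozenset([c])))
        for c in classes
    }


def credal_width(mass, classes):
    """Assumed proxy for credal-set size: the widest [Bel, Pl] interval."""
    return max(hi - lo for lo, hi in credal_bounds(mass, classes).values())


def pignistic(mass, classes):
    """Collapse the belief function to one probability vector by
    splitting each focal set's mass evenly among its members."""
    probs = {c: 0.0 for c in classes}
    for focal, m in mass.items():
        for c in focal:
            probs[c] += m / len(focal)
    return probs


classes = ["cat", "dog", "truck"]

# Hypothetical prediction: mass on sets of classes, summing to 1.
# Mass on larger sets expresses that the model cannot tell those classes apart.
mass = {
    frozenset(["cat"]): 0.5,
    frozenset(["cat", "dog"]): 0.3,
    frozenset(["cat", "dog", "truck"]): 0.2,
}

bounds = credal_bounds(mass, classes)
width = credal_width(mass, classes)
probs = pignistic(mass, classes)

for c in classes:
    lo, hi = bounds[c]
    print(f"{c}: lower={lo:.2f} upper={hi:.2f} pignistic={probs[c]:.3f}")
print(f"credal width: {width:.2f}")
```

When all mass sits on single classes, belief and plausibility coincide, the width is zero, and the prediction reduces to an ordinary probability vector. Mass placed on larger sets widens the intervals, which is the sense in which a larger set of probability vectors signals higher epistemic uncertainty.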