Deep Discriminative to Kernel Density Graph for In- and Out-of-distribution Calibrated Inference

📅 2022-01-31
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the confidence miscalibration of deep learning models in safety-critical settings—particularly overconfidence on out-of-distribution (OOD) data relative to in-distribution (ID) data—this paper proposes a geometry-guided unified calibration framework. It systematically replaces the affine decision functions embedded in the polyhedral partition structures of discriminative models (e.g., DNNs, random forests) with Gaussian kernel density estimators, thereby constructing kernel-based confidence maps that generalize robustly to OOD regions. This is the first approach to jointly model ID and OOD calibration, preserving or even improving ID classification accuracy while substantially mitigating OOD overconfidence. Experiments across tabular and image benchmarks demonstrate consistent gains: expected calibration error (ECE) reductions of 23–41%, and OOD detection AUROC improvements of 5.2–9.8 percentage points—outperforming state-of-the-art calibration and OOD detection baselines across all metrics.
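The summary reports gains in expected calibration error (ECE), the standard metric for confidence calibration. As a reference point, a minimal ECE implementation follows the usual recipe: bin predictions by confidence and average the per-bin gap between mean confidence and empirical accuracy. This is a generic sketch of the metric, not code from the paper.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=15):
    """Standard ECE: bin predictions by confidence and average the
    weighted gap between mean confidence and accuracy per bin."""
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    n = len(confidences)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            acc = correct[mask].mean()     # empirical accuracy in bin
            conf = confidences[mask].mean()  # mean confidence in bin
            ece += (mask.sum() / n) * abs(acc - conf)
    return ece
```

A perfectly calibrated model scores 0; a model that is always fully confident but always wrong scores 1.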
📝 Abstract
Deep discriminative approaches like random forests and deep neural networks have recently found applications in many important real-world scenarios. However, deploying these learning algorithms in safety-critical applications raises concerns, particularly when it comes to ensuring confidence calibration for both in-distribution and out-of-distribution data points. Many popular methods for in-distribution (ID) calibration, such as isotonic and Platt's sigmoidal regression, exhibit excellent ID calibration performance. However, these methods are not calibrated over the entire feature space, leading to overconfidence on out-of-distribution (OOD) samples. On the other end of the spectrum, existing OOD calibration methods generally exhibit poor ID calibration. In this paper, we address the ID and OOD calibration problems jointly. We leverage the fact that deep models, including both random forests and deep-nets, learn internal representations that are unions of polytopes with affine activation functions, so both model families can be viewed as partitioning rules over the feature space. We replace the affine function in each polytope populated by the training data with a Gaussian kernel. Our experiments on both tabular and vision benchmarks show that the proposed approaches obtain well-calibrated posteriors while mostly preserving or improving the classification accuracy of the original algorithm in the ID region, and extrapolate beyond the training data to handle OOD inputs appropriately.
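The abstract's core move is to swap each polytope's affine function for Gaussian kernels centered on the training points that populate it, so confidence decays with distance from the data instead of staying high everywhere. The sketch below illustrates that idea under our own simplifications (isotropic kernels, a shared bandwidth, a uniform fallback far from all training data); the function names and structure are illustrative, not the paper's implementation.

```python
import numpy as np

def gaussian_kernel_mass(x, centers, bandwidth=1.0):
    # Sum of isotropic Gaussian kernels centered at training points.
    d2 = ((centers - x) ** 2).sum(axis=1)
    return np.exp(-d2 / (2 * bandwidth ** 2)).sum()

def kernel_posterior(x, polytope_points, polytope_labels, n_classes,
                     bandwidth=1.0):
    """Class posteriors from per-class kernel density accumulated over
    polytopes. Far from every training polytope the densities vanish,
    so the posterior falls back to a uniform (uninformative) output,
    avoiding OOD overconfidence."""
    scores = np.zeros(n_classes)
    for pts, labels in zip(polytope_points, polytope_labels):
        for c in range(n_classes):
            mask = labels == c
            if mask.any():
                scores[c] += gaussian_kernel_mass(x, pts[mask], bandwidth)
    total = scores.sum()
    if total < 1e-12:  # OOD region: no kernel mass nearby
        return np.full(n_classes, 1.0 / n_classes)
    return scores / total
```

The key contrast with an affine decision function is the OOD branch: an affine score grows without bound away from the decision boundary, while kernel mass shrinks to zero, which is what allows the posterior to become uninformative off-distribution.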
Problem

Research questions and friction points this paper is trying to address.

Ensuring calibration for in-distribution and out-of-distribution regions
Addressing overconfidence in out-of-distribution predictions
Improving calibration without sacrificing classification accuracy
Innovation

Methods, ideas, or system contributions that make the work stand out.

Geodesic distance measures distances between polytopes
Gaussian kernels distinguish samples within the same polytope
Kernel density methods ensure both ID and OOD calibration
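The first innovation point refers to measuring distance between the polytopes a ReLU network induces. One plausible proxy, used here purely for illustration and not claimed to be the paper's exact definition, is the fraction of ReLU activation signs on which two points disagree: points in the same polytope share every sign, and the disagreement fraction grows as their polytopes diverge.

```python
import numpy as np

def activation_pattern(x, weights, biases):
    """Binary ReLU activation signs across all layers, concatenated.
    Points with identical patterns lie in the same polytope."""
    pattern, h = [], x
    for W, b in zip(weights, biases):
        pre = h @ W + b
        pattern.append(pre > 0)
        h = np.maximum(pre, 0)
    return np.concatenate(pattern)

def polytope_distance(x1, x2, weights, biases):
    # Hamming-style proxy: 0.0 -> same polytope, 1.0 -> all signs differ.
    p1 = activation_pattern(x1, weights, biases)
    p2 = activation_pattern(x2, weights, biases)
    return float(np.mean(p1 != p2))
```

A distance of this kind can then weight how much a neighboring polytope's training points contribute to a query's kernel density estimate.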