🤖 AI Summary
Deep learning models face deployment risks in high-stakes applications due to opaque decision-making and poorly quantified uncertainty. To address this, we propose a training-free, unsupervised, model-agnostic abstention mechanism that integrates uncertainty awareness via prompt engineering and multi-path sampling. In addition to the model's original prediction, it generates two alternative predictions, forming a three-way self-consistency criterion. Crucially, we formalize human sensitivity to conflicting evidence as a joint gating mechanism combining majority voting and confidence divergence, dynamically triggering abstention when consensus is weak. Evaluated on ImageNet, out-of-distribution detection, and medical diagnosis benchmarks, our method reduces average error rate by 37%, achieves task coverage exceeding 92%, and attains 94.6% abstention accuracy. It significantly enhances robustness and safety in safety-critical domains without requiring model retraining or access to ground-truth uncertainty labels.
📝 Abstract
Deep learning (DL) can automatically construct intelligent agents, i.e., deep neural networks (DL models), that outperform humans on certain tasks. However, the operating principles of DL remain poorly understood, making its decisions difficult to interpret. Deploying DL in high-stakes domains, where mistakes or errors may lead to critical consequences, therefore carries great risk. Here, we develop an algorithm that helps DL models make more robust decisions by allowing them to abstain from answering when they are uncertain. Our algorithm, named 'Two out of Three (ToT)', is inspired by the human brain's sensitivity to conflicting information. ToT creates two alternative predictions in addition to the original model prediction and uses the alternative predictions to decide whether the model should provide an answer or abstain.
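The abstention logic described above can be sketched in a few lines. The following is a minimal illustrative implementation, not the authors' code: it assumes each of the three prediction paths yields a softmax probability vector, and the divergence threshold (`div_threshold`) is a hypothetical parameter chosen here for illustration.

```python
import numpy as np

def tot_decide(probs_orig, probs_alt1, probs_alt2, div_threshold=0.3):
    """Sketch of a 'two out of three' abstention gate.

    Each argument is a softmax probability vector from one prediction
    path (the original model prediction plus two alternatives).
    Returns (predicted_class, abstain_flag).
    """
    paths = [np.asarray(p, dtype=float) for p in (probs_orig, probs_alt1, probs_alt2)]
    labels = [int(np.argmax(p)) for p in paths]   # predicted class per path
    confs = [float(np.max(p)) for p in paths]     # top confidence per path

    # Majority vote: at least two of the three paths must agree.
    counts = {lab: labels.count(lab) for lab in set(labels)}
    majority_label, majority_count = max(counts.items(), key=lambda kv: kv[1])
    if majority_count < 2:
        return majority_label, True  # no consensus -> abstain

    # Confidence divergence: abstain if the paths' confidences conflict
    # strongly even though a majority label exists.
    if max(confs) - min(confs) > div_threshold:
        return majority_label, True

    return majority_label, False  # consensus is strong -> answer
```

For example, three paths that all assign high probability to the same class pass the gate, while three mutually disagreeing paths trigger abstention.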