AUCp: Pseudo-AUC for Inference Model Selection with Unlabeled Validation Data in Abnormality Detection

📅 2026-06-07

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses model selection in anomaly detection when no labeled validation data are available, a setting in which existing approaches rely on reconstruction quality or manual annotations. The authors propose AUCp, a fully unsupervised model selection metric that treats the entire unlabeled test set as positive (i.e., anomalous) and computes the Area Under the ROC Curve (AUC) accordingly. This approach enables annotation-free model selection across unsupervised and self-supervised anomaly detection frameworks. Evaluated on medical imaging tasks involving neurological disorders and on diverse datasets, AUCp outperforms conventional metrics at identifying the best-performing inference model, thereby improving overall anomaly detection performance.
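One way to see why labeling every unlabeled sample as positive can still rank models sensibly is the short derivation below. It is a sketch under stated assumptions, not necessarily the argument given in the paper. The symbols are introduced here for illustration: s_N is the anomaly score of a known-normal reference sample, s_U the score of an unlabeled test sample, s_A and s_{N'} the scores of the truly abnormal and truly normal test samples, and π the unknown fraction of abnormal samples in the test set. The derivation assumes continuous scores (no ties) and that normal test samples are scored like the normal reference samples.

```latex
\begin{align*}
\mathrm{AUC}_p &= P(s_U > s_N) \\
               &= \pi \, P(s_A > s_N) + (1 - \pi) \, P(s_{N'} > s_N) \\
               &= \pi \, \mathrm{AUC} + \tfrac{1 - \pi}{2}
\end{align*}
```

Under these assumptions, and for a fixed π > 0, AUCp is an increasing affine function of the true AUC, so the model with the highest AUCp is also the model with the highest AUC.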

📝 Abstract

Abnormality detection is a crucial yet challenging task in medical image analysis. Distinguishing abnormalities from normal data by learning to reconstruct normal-only data alleviates the reliance on labeled datasets. However, many studies, even unsupervised ones, rely on a labeled validation set to select the best model for inference from multiple training iterations. For many diseases, labeled data are unavailable or substantially time-consuming to obtain. To address this, we propose AUCp, a novel metric that supports abnormality detection with unsupervised and self-supervised methods. Instead of evaluating the realism of reconstructed images to select the best model for inference, it focuses on actual detection performance, without requiring an annotated test set. AUCp scores are derived by assigning every unannotated sample in the test set the pseudo ground truth abnormal/positive and then applying the traditional AUC calculation. Given a large and representative training set of normal samples, we show mathematical and empirical evidence that model selection using AUCp scores improves disease detection for unsupervised and self-supervised methods over conventional metrics. Using two unsupervised methods for neurologic disease detection and self-supervised methods on diverse datasets, our results demonstrate that the AUCp score effectively identifies the optimal model for inference, significantly enhancing abnormality and disease detection. The corresponding implementations are available at https://github.com/mahfuzmohammad/AUCp.
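The selection procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation from the linked repository. It assumes that the negative class consists of anomaly scores for known-normal samples (for example, drawn from the normal-only training data) and that higher scores mean "more abnormal". The function names `auc_from_scores`, `pseudo_auc`, and `select_checkpoint` are hypothetical.

```python
import numpy as np


def auc_from_scores(neg_scores, pos_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Uses an O(n * m) pairwise
    comparison, which is fine for a small illustrative example.
    """
    neg = np.asarray(neg_scores, dtype=float)
    pos = np.asarray(pos_scores, dtype=float)
    diff = pos[:, None] - neg[None, :]
    correct = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(correct / diff.size)


def pseudo_auc(normal_scores, unlabeled_scores):
    """Pseudo-AUC: every unlabeled test sample is labeled positive.

    Assumption (not confirmed by the source): known-normal samples
    supply the negative class.
    """
    return auc_from_scores(normal_scores, unlabeled_scores)


def select_checkpoint(scores_by_checkpoint):
    """Return the checkpoint name with the highest pseudo-AUC.

    `scores_by_checkpoint` maps a checkpoint name to a pair
    (normal_scores, unlabeled_scores) produced by that checkpoint.
    """
    return max(
        scores_by_checkpoint,
        key=lambda name: pseudo_auc(*scores_by_checkpoint[name]),
    )


# Toy scores for two hypothetical checkpoints (made-up numbers).
checkpoints = {
    "epoch_10": ([0.1, 0.2, 0.3], [0.25, 0.4, 0.5]),
    "epoch_20": ([0.1, 0.2, 0.3], [0.4, 0.5, 0.6]),
}
best = select_checkpoint(checkpoints)
```

In this toy example, `epoch_20` separates the unlabeled scores from the normal scores completely, so it is the checkpoint selected; no test labels are consulted at any point.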

Problem

Research questions and friction points this paper is trying to address.

abnormality detection

unlabeled validation data

model selection

unsupervised learning

medical image analysis

Innovation

Methods, ideas, or system contributions that make the work stand out.

AUCp

unsupervised anomaly detection

model selection