Functional Localization Enforced Deep Anomaly Detection Using Fundus Images

📅 2025-11-23
🤖 AI Summary
Problem: Retinal disease detection in fundus images is hampered by substantial variability in imaging quality, subtle early-stage lesions, and cross-dataset domain shift. Method: We propose a dual-path framework that combines a Vision Transformer (ViT) classifier with a GANomaly-based anomaly detector. To enhance clinical interpretability, we introduce functional localization constraints; for threshold-independent clinical decision support, we employ GUESS-based probability calibration. The method further incorporates geometric and color augmentation, histogram equalization, and multi-dataset joint transfer learning to improve generalizability. Contribution/Results: Across multiple public datasets and diseases, the ViT achieves accuracies of 0.789 to 0.843. On the Papila dataset, the ViT with geometric augmentation reaches an AUC of 0.91, outperforming previously reported convolutional ensemble baselines (AUC of 0.87). The anomaly detector attains an AUC of 0.76 while providing reconstruction-based explainability and generalization to unseen data.

📝 Abstract
Reliable detection of retinal diseases from fundus images is challenged by the variability in imaging quality, subtle early-stage manifestations, and domain shift across datasets. In this study, we systematically evaluated a Vision Transformer (ViT) classifier under multiple augmentation and enhancement strategies across several heterogeneous public datasets, as well as the AEyeDB dataset, a high-quality fundus dataset created in-house and made available for the research community. The ViT demonstrated consistently strong performance, with accuracies ranging from 0.789 to 0.843 across datasets and diseases. Diabetic retinopathy and age-related macular degeneration were detected reliably, whereas glaucoma remained the most frequently misclassified disease. Geometric and color augmentations provided the most stable improvements, while histogram equalization benefited datasets dominated by structural subtlety. Laplacian enhancement reduced performance across different settings. On the Papila dataset, the ViT with geometric augmentation achieved an AUC of 0.91, outperforming previously reported convolutional ensemble baselines (AUC of 0.87), underscoring the advantages of transformer architectures and multi-dataset training. To complement the classifier, we developed a GANomaly-based anomaly detector, achieving an AUC of 0.76 while providing inherent reconstruction-based explainability and robust generalization to unseen data. Probabilistic calibration using GUESS enabled threshold-independent decision support for future clinical implementation.
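The AUC values quoted above (0.91, 0.87, 0.76) can be read as the probability that a randomly chosen diseased image receives a higher score than a randomly chosen healthy one. The following is a minimal sketch of that rank-based definition in plain Python; the function name `roc_auc` and the toy labels and scores are illustrative assumptions, not the authors' evaluation code or data.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for diseased/anomalous, 0 for healthy/normal.
    scores: higher means "more likely diseased". Ties count as half a win.
    """
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical scores, not results from the paper).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(toy_labels, toy_scores))  # 0.75
```

Because the value depends only on how the scores are ordered, it does not require choosing a decision threshold, which is why it is a convenient way to compare a classifier and an anomaly detector on the same footing.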
Problem

Research questions and friction points this paper is trying to address.

Detecting retinal diseases from fundus images with variable quality
Addressing domain shift across multiple heterogeneous fundus datasets
Improving detection of subtle early-stage disease manifestations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Vision Transformer classifier for retinal disease detection
GANomaly-based anomaly detector for explainable diagnosis
Probabilistic calibration enabling threshold-independent decision support
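To make the anomaly-detection idea concrete, here is a minimal sketch of how a GANomaly-style score is commonly computed: the input's latent code is compared with the latent code of its reconstruction, and the distances are rescaled to [0, 1] over the evaluation set. This summary does not state the exact score used in the paper, so the L1 distance, the min-max rescaling, and all names and toy latent vectors below are assumptions for illustration only; the encoders that would produce the latent codes are omitted.

```python
from typing import List, Sequence


def latent_l1_score(z: Sequence[float], z_hat: Sequence[float]) -> float:
    """L1 distance between the latent code of an image (z) and the
    latent code of its reconstruction (z_hat). Larger means more anomalous."""
    if len(z) != len(z_hat):
        raise ValueError("latent codes must have the same dimensionality")
    return sum(abs(a - b) for a, b in zip(z, z_hat))


def min_max_normalize(scores: Sequence[float]) -> List[float]:
    """Rescale raw scores to [0, 1] over the given evaluation set."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All scores identical: no ranking information, map everything to 0.
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


# Hypothetical latent codes (z, z_hat) for three images.
toy_pairs = [
    ([0.25, 0.5], [0.25, 0.5]),  # reconstruction preserves the code
    ([0.25, 0.5], [0.75, 0.5]),  # small mismatch
    ([0.0, 0.0], [1.0, 1.0]),    # large mismatch
]
raw_scores = [latent_l1_score(z, z_hat) for z, z_hat in toy_pairs]
normalized_scores = min_max_normalize(raw_scores)
print(raw_scores)         # [0.0, 0.5, 2.0]
print(normalized_scores)  # [0.0, 0.25, 1.0]
```

A model trained only on healthy fundus images would be expected to reconstruct healthy inputs well and diseased inputs poorly, so the mismatch doubles as a score and, through the reconstruction itself, as a visual explanation.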
Jan Benedikt Ruhland
Heinrich Heine University Düsseldorf, Faculty of Mathematics and Natural Sciences, Düsseldorf, Germany
Thorsten Papenbrock
Philipps University of Marburg, Department of Mathematics and Computer Science, Marburg, Germany
Jan-Peter Sowa
University Hospital Knappschaftskrankenhaus Bochum, Department of Internal Medicine, Germany
Ali Canbay
University Hospital Knappschaftskrankenhaus Bochum, Department of Internal Medicine, Germany
Nicole Eter
University of Münster, Department of Ophthalmology, Münster, Germany
Bernd Freisleben
Professor of Computer Science, University of Marburg, Germany
Dominik Heider
Director, University of Münster
Data Science, Machine Learning, Artificial Intelligence, Biomedical Informatics, SaMD