🤖 AI Summary
Retinal disease detection in fundus images faces challenges including substantial variability in imaging quality, subtle early-stage lesions, and cross-dataset domain shift. Method: We propose a dual-path framework integrating a Vision Transformer (ViT) classifier with a GANomaly-based anomaly detector. To enhance clinical interpretability, we introduce functional localization constraints, and for threshold-free clinical decision support we employ GUESS-based probability calibration. The method further combines geometric/color augmentation, histogram equalization, and multi-dataset joint transfer learning to improve generalizability. Contribution/Results: Across multiple public datasets, the ViT achieves accuracies of 0.789–0.843 and an AUC of 0.91 on the Papila dataset, outperforming convolutional ensemble baselines. The anomaly detector attains an AUC of 0.76, offering reconstruction-based interpretability together with robustness to domain shift.
📝 Abstract
Reliable detection of retinal diseases from fundus images is challenged by variability in imaging quality, subtle early-stage manifestations, and domain shift across datasets. In this study, we systematically evaluated a Vision Transformer (ViT) classifier under multiple augmentation and enhancement strategies across several heterogeneous public datasets, as well as the AEyeDB dataset, a high-quality fundus dataset created in-house and made available to the research community. The ViT demonstrated consistently strong performance, with accuracies ranging from 0.789 to 0.843 across datasets and diseases. Diabetic retinopathy and age-related macular degeneration were detected reliably, whereas glaucoma remained the most frequently misclassified disease. Geometric and color augmentations provided the most stable improvements, while histogram equalization benefited datasets in which disease cues are structurally subtle. Laplacian enhancement consistently reduced performance.
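To illustrate the histogram-equalization enhancement mentioned above, here is a minimal NumPy sketch of classic 8-bit histogram equalization. This is an illustrative reimplementation, not the paper's actual preprocessing pipeline, and the low-contrast synthetic patch stands in for a real fundus image.

```python
import numpy as np

def equalize_histogram(img: np.ndarray) -> np.ndarray:
    """Classic histogram equalization for an 8-bit grayscale image.

    Spreads a narrow intensity distribution across the full 0-255 range,
    which can make low-contrast structures easier to distinguish.
    """
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]  # first nonzero CDF value
    # Map each intensity through the normalized cumulative distribution.
    lut = np.round((cdf - cdf_min) / (cdf[-1] - cdf_min) * 255).astype(np.uint8)
    return lut[img]

# Hypothetical usage: a low-contrast synthetic patch (intensities 100-139).
rng = np.random.default_rng(0)
patch = rng.integers(100, 140, size=(64, 64), dtype=np.uint8)
enhanced = equalize_histogram(patch)
```

After equalization the output spans the full intensity range, whereas the input occupied only a narrow band; real pipelines often prefer the adaptive variant (CLAHE), which equalizes local tiles instead of the whole image.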
On the Papila dataset, the ViT with geometric augmentation achieved an AUC of 0.91, outperforming previously reported convolutional ensemble baselines (AUC of 0.87), underscoring the advantages of transformer architectures and multi-dataset training. To complement the classifier, we developed a GANomaly-based anomaly detector, achieving an AUC of 0.76 while providing inherent reconstruction-based explainability and robust generalization to unseen data. Probabilistic calibration using GUESS enabled threshold-independent decision support for future clinical implementation.
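The anomaly detector's scoring principle, flagging images whose reconstruction deviates from what a model trained on normal data can reproduce, can be sketched in a few lines. The sketch below uses a linear (PCA-style) autoencoder in NumPy purely to show the reconstruction-error idea; GANomaly itself is an adversarial encoder-decoder-encoder network, and all names and data here are hypothetical.

```python
import numpy as np

def fit_linear_autoencoder(X: np.ndarray, k: int) -> np.ndarray:
    """Fit a k-component linear 'autoencoder' (PCA) on normal samples."""
    Xc = X - X.mean(axis=0)
    # Top-k right singular vectors span the subspace of normal variation.
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return vt[:k]

def anomaly_score(x: np.ndarray, components: np.ndarray, mean: np.ndarray) -> float:
    """Reconstruction error: large when x lies off the normal subspace."""
    xc = x - mean
    recon = (xc @ components.T) @ components  # encode, then decode
    return float(np.linalg.norm(xc - recon))

rng = np.random.default_rng(1)
# Toy data: 'normal' samples live in a 2-D subspace of a 10-D feature space.
basis = rng.normal(size=(2, 10))
normal = rng.normal(size=(200, 2)) @ basis
mean = normal.mean(axis=0)
comps = fit_linear_autoencoder(normal, k=2)

healthy = rng.normal(size=2) @ basis           # on the normal subspace
lesion = healthy + 5.0 * rng.normal(size=10)   # off-subspace deviation

score_healthy = anomaly_score(healthy, comps, mean)
score_lesion = anomaly_score(lesion, comps, mean)
```

The healthy sample reconstructs almost perfectly while the perturbed one does not, so thresholding (or, as in the paper, calibrating) the score separates the two. The residual map of a convolutional reconstruction plays the same role per pixel, which is the source of the reconstruction-based explainability mentioned above.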