AI Summary
Early and accurate grading of diabetic retinopathy (DR) is critical for preventing blindness, yet it remains challenging because of heterogeneous image quality, severe class imbalance, and high morphological similarity among lesions. To address these issues, the authors propose an end-to-end multi-class DR classification framework. The method introduces a preprocessing strategy that combines Contrast-Limited Adaptive Histogram Equalization (CLAHE) with class-aware data augmentation. It then applies the Swin Transformer, whose shifted-window self-attention captures multi-scale retinal lesion features at a computational cost that grows linearly with image size. Evaluated on the APTOS and IDRiD datasets, the framework achieves classification accuracies of 89.65% and 97.40%, respectively, with improved sensitivity for mild DR. The results suggest that the approach generalizes across datasets and could provide a technical foundation for automated DR screening in primary healthcare settings.
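To make the contrast-limiting idea behind CLAHE concrete, here is a minimal pure-Python sketch of its core step applied to a single tile of 8-bit grayscale values. The function name `clipped_equalize`, the absolute `clip_limit` of 4 counts per bin, and the toy tile are illustrative assumptions, not the authors' settings. A full CLAHE implementation also splits the image into tiles and interpolates between neighboring tile mappings, which this sketch omits.

```python
def clipped_equalize(tile, clip_limit=4, levels=256):
    """Contrast-limited histogram equalization of ONE tile.

    `tile` is a flat list of integer intensities in [0, levels).
    `clip_limit` is an absolute per-bin count (an assumption made for
    this sketch; libraries usually express it relative to the mean bin).
    """
    # 1. Histogram of the tile.
    hist = [0] * levels
    for v in tile:
        hist[v] += 1

    # 2. Clip every bin at the limit and collect the excess counts.
    excess = 0
    for i in range(levels):
        if hist[i] > clip_limit:
            excess += hist[i] - clip_limit
            hist[i] = clip_limit

    # 3. Redistribute the excess uniformly over all bins.
    bonus = excess / levels
    hist = [h + bonus for h in hist]

    # 4. Build the intensity mapping from the cumulative distribution.
    total = float(len(tile))
    mapping = []
    running = 0.0
    for h in hist:
        running += h
        mapping.append(min(levels - 1, round((levels - 1) * running / total)))

    # 5. Apply the mapping to every pixel of the tile.
    return [mapping[v] for v in tile]


# Toy low-contrast tile: almost all pixels share one intensity.
tile = [100] * 12 + [101] * 3 + [102]
enhanced = clipped_equalize(tile)
print(sorted(set(enhanced)))
```

Clipping before equalization caps how steep the mapping can become, so a tile dominated by one intensity is stretched without amplifying noise as aggressively as plain histogram equalization would.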
Abstract
Diabetic retinopathy (DR) is a leading cause of blindness worldwide, which makes early detection essential for effective treatment. However, automated DR classification remains challenging because of variations in image quality, class imbalance, and pixel-level similarities that hinder model training. To address these issues, we propose a robust preprocessing pipeline that combines image cropping, Contrast-Limited Adaptive Histogram Equalization (CLAHE), and targeted data augmentation to improve model generalization and resilience. Our approach leverages the Swin Transformer, which uses hierarchical token processing and shifted-window attention to capture fine-grained features efficiently while keeping computational complexity linear in image size. We validate our method on the APTOS and IDRiD datasets for multi-class DR classification, achieving accuracies of 89.65% and 97.40%, respectively. These results demonstrate the effectiveness of our model, particularly in detecting early-stage DR, and highlight its potential for improving automated retinal screening in clinical settings.
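The linear-complexity claim can be checked with a short calculation. The sketch below compares the standard cost estimates for global multi-head self-attention and for window-based self-attention as given in the original Swin Transformer paper, for a feature map of `h × w` tokens with `c` channels and window size `m`. The function names and the example sizes (56 × 56 tokens, 96 channels, window 7, as in the first stage of Swin-T) are illustrative assumptions and are not necessarily the configuration used in this work.

```python
def msa_flops(h, w, c):
    """Global self-attention: 4*h*w*c^2 + 2*(h*w)^2*c (quadratic in tokens)."""
    return 4 * h * w * c * c + 2 * (h * w) ** 2 * c


def wmsa_flops(h, w, c, m=7):
    """Window self-attention: 4*h*w*c^2 + 2*m^2*h*w*c (linear in tokens)."""
    return 4 * h * w * c * c + 2 * m * m * h * w * c


# Double the feature-map side length and compare how each cost grows.
for side in (56, 112, 224):
    g = msa_flops(side, side, 96)
    l = wmsa_flops(side, side, 96)
    print(f"{side}x{side} tokens: global={g:.3e}  windowed={l:.3e}  ratio={g / l:.1f}")
```

With a fixed window size, doubling the side length multiplies the windowed cost by exactly four (the growth in token count), whereas the global-attention cost grows faster because of its quadratic term.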