🤖 AI Summary
This work addresses self-supervised representation learning on optical coherence tomography angiography (OCTA) images, where sparse vasculature and strong topological constraints hinder effective feature learning. The authors propose VAMAE, a vessel-aware masked autoencoder framework that combines vessel saliency with skeleton priors to build an anatomy-guided, non-uniform masking strategy. Pretraining jointly reconstructs multiple complementary targets so that the model preserves vascular appearance, structural continuity, and topological fidelity, which encourages geometry-aware learning of vessel connectivity and branching patterns. Experiments on the OCTA-500 benchmark indicate consistent improvements over standard masked autoencoder baselines on vessel segmentation, with the largest gains in label-scarce settings.
📝 Abstract
Optical coherence tomography angiography (OCTA) provides non-invasive visualization of retinal microvasculature, but learning robust representations remains challenging due to sparse vessel structures and strong topological constraints. Many existing self-supervised learning approaches, including masked autoencoders, are primarily designed for dense natural images and rely on uniform masking and pixel-level reconstruction, which may inadequately capture vascular geometry.
We propose VAMAE, a vessel-aware masked autoencoding framework for self-supervised pretraining on OCTA images. The approach incorporates anatomically informed masking that emphasizes vessel-rich regions using vesselness and skeleton-based cues, encouraging the model to focus on vascular connectivity and branching patterns. In addition, the pretraining objective includes reconstructing multiple complementary targets, enabling the model to capture appearance, structural, and topological information.
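The anatomically informed masking described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the function name `vessel_aware_mask`, the blending weight `alpha`, the patch size, and the 75% mask ratio are all hypothetical choices, the vesselness map is taken as a precomputed input (for example from a Frangi-style filter), and skeleton cues are omitted for brevity.

```python
import numpy as np


def vessel_aware_mask(vesselness, patch=16, mask_ratio=0.75, alpha=0.5, rng=None):
    """Sample a non-uniform patch mask biased toward vessel-rich patches.

    vesselness : 2D array of non-negative vessel saliency values (assumed input).
    alpha      : hypothetical blend in [0, 1); 0 gives uniform MAE-style masking,
                 larger values concentrate masking on vessel-rich patches.
    Returns a boolean grid of shape (H // patch, W // patch); True = masked.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = vesselness.shape
    gh, gw = h // patch, w // patch

    # Average the saliency inside each non-overlapping patch.
    cropped = vesselness[: gh * patch, : gw * patch]
    scores = cropped.reshape(gh, patch, gw, patch).mean(axis=(1, 3)).ravel()

    n = scores.size
    uniform = np.full(n, 1.0 / n)
    total = scores.sum()
    # Fall back to uniform sampling when the image contains no vessel signal.
    saliency = scores / total if total > 0 else uniform

    # Blend uniform and saliency-driven probabilities; keeping alpha < 1
    # guarantees every patch has non-zero probability, which sampling
    # without replacement requires.
    probs = (1.0 - alpha) * uniform + alpha * saliency
    probs = probs / probs.sum()

    n_mask = int(round(mask_ratio * n))
    masked_idx = rng.choice(n, size=n_mask, replace=False, p=probs)

    mask = np.zeros(n, dtype=bool)
    mask[masked_idx] = True
    return mask.reshape(gh, gw)
```

With `alpha = 0` this reduces to the uniform random masking used by a standard masked autoencoder, so the same routine can serve as the baseline when comparing masking strategies.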
We evaluate the proposed pretraining strategy on the OCTA-500 benchmark for several vessel segmentation tasks under varying levels of supervision. The results indicate that vessel-aware masking and multi-target reconstruction provide consistent improvements over standard masked autoencoding baselines, particularly in limited-label settings, suggesting the potential of geometry-aware self-supervised learning for OCTA analysis.