🤖 AI Summary
In semi-supervised semantic segmentation (SSSS), conventional spatial augmentations (e.g., rotation, translation) induce semantic mask inconsistency between strong and weak views, undermining consistency regularization. To address this, we propose the first differentiable spatial transformation framework for SSSS, featuring an entropy-driven instance-level adaptive augmentation strategy: augmentation intensity is dynamically modulated based on prediction uncertainty to mitigate pseudo-label noise. Our method integrates weak–strong consistency regularization, a plug-and-play module design, and multi-scale feature alignment, which keeps it compatible with mainstream SSSS frameworks. Extensive experiments demonstrate state-of-the-art performance on PASCAL VOC 2012, Cityscapes, and COCO, with consistent accuracy gains over the existing methods it is plugged into. The proposed module is lightweight and can be added to diverse architectures without modifying them.
📝 Abstract
In semi-supervised semantic segmentation (SSSS), data augmentation plays a crucial role in the weak-to-strong consistency regularization framework, as it enhances diversity and improves model generalization. Recent strong augmentation methods have primarily focused on intensity-based perturbations, which have minimal impact on the semantic masks. In contrast, spatial augmentations like translation and rotation have long been acknowledged for their effectiveness in supervised semantic segmentation tasks, but they are often ignored in SSSS. In this work, we demonstrate that spatial augmentation can also contribute to model training in SSSS, despite generating inconsistent masks between the weak and strong augmentations. Furthermore, recognizing the variability among images, we propose an adaptive augmentation strategy that dynamically adjusts the augmentation for each instance based on entropy. Extensive experiments show that our proposed Adaptive Spatial Augmentation (ASAug) can be integrated as a pluggable module, consistently improving the performance of existing methods and achieving state-of-the-art results on benchmark datasets such as PASCAL VOC 2012, Cityscapes, and COCO.
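To make the idea of entropy-based, per-instance adjustment concrete, the following is a minimal sketch and not the authors' implementation. It computes the mean per-pixel entropy of a model's softmax output for one image and maps it to a spatial augmentation strength. The function names, the `max_angle_deg` and `max_shift_frac` parameters, and the linear mapping are all hypothetical. In particular, the direction of the mapping (more confident predictions receive stronger augmentation) is an assumption; the abstract only states that the augmentation is adjusted per instance based on entropy.

```python
import math


def normalized_entropy(pixel_probs):
    """Mean per-pixel Shannon entropy, divided by log(num_classes).

    pixel_probs: list of per-pixel class distributions, e.g. the flattened
    softmax output of a segmentation model for a single image.
    Returns a value in [0, 1] (up to floating-point rounding).
    """
    num_classes = len(pixel_probs[0])
    total = 0.0
    for probs in pixel_probs:
        # Terms with p == 0 contribute nothing, so they are skipped.
        total -= sum(p * math.log(p) for p in probs if p > 0.0)
    return total / (len(pixel_probs) * math.log(num_classes))


def adaptive_strength(pixel_probs, max_angle_deg=30.0, max_shift_frac=0.2):
    """Map image-level uncertainty to rotation/translation magnitudes.

    ASSUMPTION (not from the paper): strength decreases linearly with
    entropy, so uncertain images get milder spatial perturbations.
    """
    entropy = normalized_entropy(pixel_probs)
    scale = min(max(1.0 - entropy, 0.0), 1.0)  # clamp against rounding
    return {
        "entropy": entropy,
        "angle_deg": max_angle_deg * scale,
        "shift_frac": max_shift_frac * scale,
    }


if __name__ == "__main__":
    confident = [[1.0, 0.0, 0.0, 0.0]] * 16    # one-hot predictions
    uncertain = [[0.25, 0.25, 0.25, 0.25]] * 16  # uniform predictions
    print(adaptive_strength(confident))
    print(adaptive_strength(uncertain))
```

In a real training loop the returned magnitudes would parameterize the rotation and translation applied to the strong view; how the pseudo-label from the weak view is reconciled with the spatially transformed strong view is not specified in the abstract and is left out of this sketch.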