R&D: Balancing Reliability and Diversity in Synthetic Data Augmentation for Semantic Segmentation

📅 2026-03-18
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the limitations of existing data augmentation techniques, which often fail to generate semantically novel yet label-consistent samples, and the tendency of general-purpose generative models to disrupt label alignment in pixel-level tasks such as semantic segmentation. To overcome these challenges, the authors propose a synthetic data augmentation approach based on controllable diffusion models that integrates class-aware prompts with visual priors. This enables the generation of images with controllable semantic structures and precisely aligned segmentation labels. The method preserves diversity in synthetic data while significantly enhancing its reliability, leading to consistent performance gains in semantic segmentation on benchmarks including PASCAL VOC and BDD100K. Notably, it substantially improves model robustness under data-scarce conditions.

📝 Abstract
Collecting and annotating datasets for pixel-level semantic segmentation is highly labor-intensive. Data augmentation offers a viable alternative by improving model generalization without additional real-world data collection. Traditional augmentation techniques, such as translation, scaling, and color transformations, create geometric and photometric variations but cannot generate new structures. Generative models have been used to extend the semantic content of datasets, but they often struggle to keep the generated images consistent with the originals, which is particularly problematic for pixel-level tasks. In this work, we propose a novel synthetic data augmentation pipeline that integrates controllable diffusion models. Our approach balances the diversity and reliability of the synthetic data, effectively bridging the gap between synthetic and real data. We use class-aware prompting and visual prior blending to further improve image quality and ensure precise alignment with segmentation labels. By evaluating on benchmark datasets such as PASCAL VOC and BDD100K, we demonstrate that our method significantly enhances semantic segmentation performance, especially in data-scarce scenarios, while improving model robustness in real-world applications. Our code is available at https://github.com/chequanghuy/Enhanced-Generative-Data-Augmentation-for-Semantic-Segmentation-via-Stronger-Guidance.
Problem

Research questions and friction points this paper is trying to address.

semantic segmentation
synthetic data augmentation
data reliability
data diversity
pixel-level consistency
Innovation

Methods, ideas, or system contributions that make the work stand out.

controllable diffusion models
class-aware prompting
visual prior blending
synthetic data augmentation
semantic segmentation
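
To make two of the listed ideas concrete, here is a minimal sketch of what class-aware prompting and visual prior blending could look like. The page does not describe the paper's actual procedure, so everything here is an assumption made for illustration: the `CLASS_NAMES` table, the "a photo of ..." template, ordering classes by pixel area, and the linear `alpha` blend are all hypothetical, and plain nested lists stand in for image arrays.

```python
from collections import Counter

# Hypothetical label map; a real dataset ships its own id-to-name table.
CLASS_NAMES = {0: "background", 1: "person", 2: "car", 3: "dog"}


def class_aware_prompt(mask, class_names, ignore_ids=(0, 255)):
    """Name every class present in the mask, largest pixel area first."""
    counts = Counter(pixel for row in mask for pixel in row)
    present = [
        (class_id, area)
        for class_id, area in counts.items()
        if class_id not in ignore_ids and class_id in class_names
    ]
    # Sort by descending area; break ties by class id for determinism.
    present.sort(key=lambda item: (-item[1], item[0]))
    names = [class_names[class_id] for class_id, _ in present]
    return "a photo of " + ", ".join(names) if names else "a photo"


def blend_visual_prior(condition, prior, alpha=0.5):
    """Per-pixel linear blend of two equally sized grayscale grids (0-255)."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if len(condition) != len(prior) or any(
        len(a) != len(b) for a, b in zip(condition, prior)
    ):
        raise ValueError("grids must have the same shape")
    return [
        [min(255, max(0, round(alpha * c + (1.0 - alpha) * p)))
         for c, p in zip(row_c, row_p)]
        for row_c, row_p in zip(condition, prior)
    ]


# Toy segmentation mask: 2 "person" pixels, 6 "car" pixels, rest background.
mask = [
    [0, 1, 1, 0],
    [0, 2, 2, 2],
    [0, 2, 2, 2],
]
prompt = class_aware_prompt(mask, CLASS_NAMES)
blended = blend_visual_prior([[200, 200]], [[100, 0]], alpha=0.5)
```

For the toy mask, `prompt` is `"a photo of car, person"` and `blended` is `[[150, 100]]`. In a real pipeline the prompt and the blended conditioning image would be passed to a controllable diffusion model; that step is omitted here because the page gives no details about it.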
Quang-Huy Che
University of Information Technology, Ho Chi Minh City, Vietnam; Vietnam National University, Ho Chi Minh City, Vietnam
Dinh-Duy Phan
University of Information Technology, Vietnam National University Ho Chi Minh City
Computer Science, Computer Engineering, Machine Learning, Deep Learning
Duc-Khai Lam
University of Information Technology, Ho Chi Minh City, Vietnam; Vietnam National University, Ho Chi Minh City, Vietnam