A Data-Centric Approach to Pedestrian Attribute Recognition: Synthetic Augmentation via Prompt-driven Diffusion Models

📅 2025-09-02
🤖 AI Summary
To address weak generalization in pedestrian attribute recognition (PAR) caused by insufficient samples of rare attributes, this paper proposes a data-centric synthetic augmentation method. It introduces text-driven diffusion models to PAR for the first time, generating semantically consistent and attribute-controllable pedestrian images via prompt engineering. A prompt-guided, label-aware loss reweighting strategy dynamically amplifies the supervision signal for rare attributes. A synthetic data fusion mechanism then enables end-to-end training without modifying the underlying model architecture. Experiments on benchmark datasets, including PA-100K and RAPv2, show improved rare-attribute recognition accuracy (+5.2% mAP) and better zero-shot generalization. The gains carry across datasets, which supports the method's practical use in real-world PAR applications.

📝 Abstract
Pedestrian Attribute Recognition (PAR) is challenging because models must generalize across numerous attributes in real-world data. Traditional approaches focus on complex methods, yet recognition performance is often constrained by training dataset limitations, particularly the under-representation of certain attributes. In this paper, we propose a data-centric approach that improves PAR through synthetic data augmentation guided by textual descriptions. First, we define a protocol to identify weakly recognized attributes across multiple datasets. Second, we propose a prompt-driven pipeline that leverages diffusion models to generate synthetic pedestrian images while preserving the consistency of PAR datasets. Finally, we derive a strategy to seamlessly incorporate synthetic samples into the training data, which considers prompt-based annotation rules and modifies the loss function. Results on popular PAR datasets demonstrate that our approach not only boosts recognition of underrepresented attributes but also improves overall model performance beyond the targeted attributes. Notably, this approach strengthens zero-shot generalization without requiring architectural changes to the model, presenting an efficient and scalable solution for improving pedestrian attribute recognition in the real world.
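The abstract does not spell out the protocol for identifying weakly recognized attributes, so the following is only a minimal sketch of one plausible reading: flag an attribute when its per-attribute score falls below a threshold in several datasets. The function name `weak_attributes`, the `threshold` and `min_datasets` parameters, the dataset names, and all scores are illustrative assumptions, not values from the paper.

```python
def weak_attributes(scores, threshold=0.6, min_datasets=2):
    """Return attributes scoring below `threshold` in at least `min_datasets` datasets.

    `scores` maps dataset name -> {attribute name -> score in [0, 1]}.
    The selection rule is a hypothetical stand-in for the paper's protocol.
    """
    weak_in = {}
    for dataset, per_attribute in scores.items():
        for attribute, value in per_attribute.items():
            if value < threshold:
                weak_in.setdefault(attribute, []).append(dataset)
    # Keep only attributes that are weak in enough datasets.
    return sorted(a for a, datasets in weak_in.items() if len(datasets) >= min_datasets)


# Made-up scores for two placeholder datasets (not experimental results).
example_scores = {
    "dataset_a": {"hat": 0.40, "backpack": 0.90},
    "dataset_b": {"hat": 0.50, "backpack": 0.55},
}
print(weak_attributes(example_scores))  # -> ['hat']
```

Here `backpack` is weak in only one placeholder dataset, so it is not flagged; requiring agreement across datasets is one way to separate a genuinely hard attribute from a dataset-specific quirk.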
Problem

Research questions and friction points this paper is trying to address.

Improving recognition of underrepresented pedestrian attributes
Generating synthetic data via prompt-driven diffusion models
Enhancing zero-shot generalization without architectural changes
Innovation

Methods, ideas, or system contributions that make the work stand out.

Synthetic data augmentation via prompt-driven diffusion
Protocol identifies weakly recognized attributes across datasets
Seamless integration strategy with modified loss function
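The integration strategy is described only as combining prompt-based annotation rules with a modified loss. A minimal sketch of one way this could work, assuming that attributes named in the prompt are the only ones supervised for a synthetic image and that rare attributes get a larger loss weight, is shown below. The prompt template, the helper names `build_prompt_and_labels` and `masked_weighted_bce`, and the weights are hypothetical and not taken from the paper.

```python
import math


def build_prompt_and_labels(attributes, requested):
    """Build a text prompt plus labels and a supervision mask for one synthetic image.

    Assumed annotation rule: attributes named in the prompt are labelled positive
    and supervised (mask 1); every other attribute is treated as unknown (mask 0).
    """
    prompt = "a photo of a pedestrian with " + ", ".join(requested)
    labels = [1.0 if a in requested else 0.0 for a in attributes]
    mask = [1.0 if a in requested else 0.0 for a in attributes]
    return prompt, labels, mask


def masked_weighted_bce(probs, labels, mask, weights, eps=1e-7):
    """Binary cross-entropy averaged over supervised attributes only.

    `weights` scales each attribute's term, e.g. larger for rare attributes.
    """
    total, supervised = 0.0, 0.0
    for p, y, m, w in zip(probs, labels, mask, weights):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        bce = -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))
        total += m * w * bce
        supervised += m
    return total / supervised if supervised else 0.0


attributes = ["hat", "backpack", "skirt"]
prompt, labels, mask = build_prompt_and_labels(attributes, ["hat"])
loss = masked_weighted_bce([0.5, 0.9, 0.1], labels, mask, weights=[2.0, 1.0, 1.0])
print(prompt, round(loss, 4))
```

Masking the unspecified attributes keeps the model from being penalized for content the prompt never controlled, and because the change lives entirely in the loss, the network architecture stays untouched.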
Alejandro Alonso
Autonomous University of Madrid, Madrid, Spain.
Sawaiz A. Chaudhry
Autonomous University of Madrid, Madrid, Spain.
Juan C. SanMiguel
Autonomous University of Madrid, Madrid, Spain.
Álvaro García-Martín
Autonomous University of Madrid
Image & video analysis
Pablo Ayuso-Albizu
Autonomous University of Madrid, Madrid, Spain.
Pablo Carballeira
Universidad Autónoma de Madrid