🤖 AI Summary
Face recognition models trained on synthetic data suffer from generator-specific artifacts, overfitting, and insufficient diversity in pose, illumination, and demographic attributes. To address these limitations, this work fuses synthetic data across architectures: two heterogeneous generative models (e.g., a GAN and a diffusion model) are jointly leveraged to construct a high-fidelity, highly diverse synthetic face dataset. Because artifacts from one generator are largely absent in the other, the fusion acts as an implicit regularizer that steers the recognition model toward identity-discriminative features, mitigating model-specific artifacts and improving generalization. Evaluation on standard benchmarks, including LFW, CFP-FP, and AgeDB-30, shows that the proposed method outperforms existing synthetic-data-based approaches, approaching and in some cases matching real-data-trained baselines. The framework offers a cost-effective, privacy-preserving route to face recognition training that is robust across demographic and environmental variations.
📝 Abstract
While the accuracy of face recognition systems has improved significantly in recent years, the datasets used to train these models are often collected through web crawling without users' explicit consent, raising ethical and privacy concerns. To address this, many recent approaches have explored training face recognition models on synthetic data. However, such models typically underperform those trained on real-world data. A common limitation is that a single generator produces the entire synthetic dataset, so the recognition model can overfit to that generator's inherent biases and artifacts. In this work, we address this by combining two state-of-the-art synthetic face datasets generated with architecturally distinct backbones. This fusion reduces model-specific artifacts, enhances diversity in pose, lighting, and demographics, and implicitly regularizes the face recognition model by emphasizing identity-relevant features. We evaluate models trained on the combined dataset on standard face recognition benchmarks and show that our approach achieves superior performance across many of them.
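The dataset-fusion step described above can be sketched as follows. This is a minimal illustration, not the authors' actual pipeline: the sample format (a list of `(image_path, identity_id)` pairs per generator) and the function name `fuse_datasets` are assumptions. The key point it demonstrates is that the two generators' identity label spaces must be kept disjoint, since an identity index from the GAN-based set and the same index from the diffusion-based set refer to different synthetic people.

```python
def fuse_datasets(gan_samples, diffusion_samples):
    """Merge two lists of (image_path, identity_id) pairs, offsetting the
    second dataset's identity ids so the label spaces stay disjoint.

    Illustrative sketch only; the paper's actual data format may differ.
    """
    # Number of distinct identities in the first dataset (ids assumed 0-based).
    num_gan_ids = (1 + max(label for _, label in gan_samples)) if gan_samples else 0
    fused = list(gan_samples)
    for path, label in diffusion_samples:
        # Shift diffusion identities past the GAN identity range.
        fused.append((path, label + num_gan_ids))
    return fused

# Hypothetical file names for illustration.
gan = [("gan/0001.jpg", 0), ("gan/0002.jpg", 0), ("gan/0003.jpg", 1)]
diff = [("diff/0001.jpg", 0), ("diff/0002.jpg", 1)]
combined = fuse_datasets(gan, diff)
# Identities 0-1 now come from the GAN set, identities 2-3 from the diffusion set.
```

The recognition model is then trained with a standard identity-classification loss over the union of labels; since neither generator's artifacts correlate with identity across both sources, the model is implicitly pushed toward identity-relevant features.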