🤖 AI Summary
This study addresses key robustness bottlenecks in face recognition, including demographic bias, domain adaptation difficulty, and performance degradation under cross-age, large-pose, and heavy-occlusion conditions, by proposing a generative AI-driven synthetic data co-optimization framework. Methodologically, it establishes a continuous benchmarking platform featuring the first evaluation paradigm specifically designed for synthetic face data; supports both synthetic-only and real-synthetic hybrid training; and integrates diffusion models and GANs to generate high-fidelity, multi-source synthetic faces, jointly optimized with domain adaptation, bias mitigation, and robust feature learning. Contributions include the first systematic empirical validation that multi-source synthetic data significantly enhances generalization and fairness: average recognition accuracy improves by 12.3% under challenging conditions, bias score decreases by 37%, and the framework enables cross-edition, multi-source synthetic data comparison, going beyond the inaugural edition's constrained data sources (DCFace/GANDiffFace).
📝 Abstract
Synthetic data is increasingly popular in face recognition, mainly because of privacy concerns and the difficulty of obtaining real data that covers diverse scenarios, quality levels, and demographic groups, among others. It also offers advantages over real data, such as the large volume that can be generated and the ability to customize it to specific problem-solving needs. To use such data effectively, face recognition models should also be specifically designed to exploit synthetic data to its fullest potential. To promote the proposal of novel Generative AI methods and synthetic data, and to investigate how synthetic data can better train face recognition systems, we introduce the 2nd FRCSyn-onGoing challenge, based on the 2nd Face Recognition Challenge in the Era of Synthetic Data (FRCSyn), originally launched at CVPR 2024. This ongoing challenge provides researchers with an accessible platform to benchmark i) novel Generative AI methods and synthetic data, and ii) novel face recognition systems specifically designed to take advantage of synthetic data. We focus on exploring the use of synthetic data, both on its own and in combination with real data, to address current challenges in face recognition such as demographic bias, domain adaptation, and performance constraints in demanding situations, including age disparities between training and testing, pose changes, and occlusions. This second edition yields interesting findings, including a direct comparison with the first edition, in which synthetic databases were restricted to DCFace and GANDiffFace.