Beyond black and white: A more nuanced approach to facial recognition with continuous ethnicity labels

📅 2025-06-02

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Facial recognition models suffer from data bias due to the discretization of ethnic labels, which obscures the inherent continuity of ethnic distributions and leads to the fundamental flaw that “equal identity counts ≠ balanced data.” This work pioneers a continuous ethnicity modeling paradigm, replacing discrete labels with continuous ethnic embeddings. We propose a novel training framework integrating spectral-distance-driven dynamic reweighting sampling and multi-scale ethnic distribution alignment. Evaluated across 65+ models and 20+ data subsets, our method achieves an average 12.7% improvement in cross-ethnic recognition accuracy and significantly reduces false positive and false negative rate disparities. It establishes a new paradigm for data balance in continuous ethnic space and provides both theoretically grounded and empirically scalable foundations for fairness-aware modeling in facial recognition.

Technology Category

Application Category

📝 Abstract

Bias has been a constant in face recognition models. Over the years, researchers have looked at it from both the model and the data point of view. However, their approach to mitigation of data bias was limited and lacked insight on the real nature of the problem. Here, in this document, we propose to revise our use of ethnicity labels as a continuous variable instead of a discrete value per identity. We validate our formulation both experimentally and theoretically, showcasing that not all identities from one ethnicity contribute equally to the balance of the dataset; thus, having the same number of identities per ethnicity does not represent a balanced dataset. We further show that models trained on datasets balanced in the continuous space consistently outperform models trained on data balanced in the discrete space. We trained more than 65 different models, and created more than 20 subsets of the original datasets.

Problem

Research questions and friction points this paper is trying to address.

Addressing bias in face recognition models using continuous ethnicity labels

Challenging discrete ethnicity labels for more balanced datasets

Improving model performance with continuous ethnicity-based dataset balancing

Innovation

Methods, ideas, or system contributions that make the work stand out.

Continuous ethnicity labels replace discrete values

Dataset balance assessed in continuous label space

Models trained on continuous labels outperform discrete ones

🔎 Similar Papers

No similar papers found.

Authors to Follow