🤖 AI Summary
This study addresses two gaps in dynamic facial expression recognition research: existing systems neglect cultural differences in facial muscle activation patterns and the effect of those differences on recognition performance, and large-scale multicultural benchmark datasets are lacking. To bridge these gaps, the authors introduce GCC-FER, which they describe as the first large-scale cross-cultural dynamic facial expression dataset, covering African, Caucasian, East Asian, and South Asian demographic groups, along with a culture-aware CA-FER system. The dataset combines psychologically supervised in-house video collection with ethnicity filtering of existing sources, while CA-FER uses behaviorally grounded cultural priors to adaptively recalibrate latent facial representations and mitigate cultural bias. Experiments on the GCC-FER and DFEW benchmarks show that CA-FER consistently improves recognition performance across multicultural settings.
📝 Abstract
Dynamic Facial Expression Recognition (DFER) is a key enabling technology in affective computing, human-computer interaction, and intelligent multimedia systems. Although cultural nuances significantly influence FER performance, most existing FER systems assume that emotional expressions are universally consistent across populations. In practice, expressions vary across cultures, and this variation can be attributed to systematic differences in facial muscle activation patterns. A major challenge in advancing cross-cultural FER lies in the scarcity of culturally diverse benchmark datasets. To address this, a new hybrid multicultural video dataset termed Global Cross-Cultural Facial Expression Recognition (GCC-FER) is introduced. GCC-FER comprises 23,934 video samples spanning four cultural groups (African, Caucasian, East Asian, and South Asian) across seven basic expressions, combining psychologically supervised in-house data collection for underrepresented populations with rigorous ethnicity filtering of existing sources. To the best of our knowledge, GCC-FER is the first large-scale global cross-cultural DFER dataset designed to address these demographic gaps. Leveraging this dataset, behaviorally grounded cultural priors are derived for each cultural group, along with a global prior for practical deployment. A Culture-Aware FER (CA-FER) system is proposed to mitigate cultural bias by adaptively recalibrating latent facial representations. Extensive experiments on GCC-FER and DFEW demonstrate that the proposed system consistently improves FER performance across multicultural settings.