Evaluating Test-Time Adaptation For Facial Expression Recognition Under Natural Cross-Dataset Distribution Shifts

📅 2026-03-20
🤖 AI Summary
This study addresses the performance degradation of facial expression recognition (FER) models in real-world scenarios caused by natural distribution shifts across datasets, which arise from differences in acquisition protocols, annotation criteria, and population demographics. It presents the first systematic evaluation of test-time adaptation (TTA) methods for FER under natural (non-synthetic) distribution shifts, using cross-dataset experiments with TENT, SAR, T3A, and SHOT. TTA improves FER performance by up to 11.34%, and the best-performing method depends on the target domain: entropy minimization (TENT, SAR) when the target distribution is clean, prototype adjustment (T3A) when the distributional distance is large, and feature alignment (SHOT) when the target is noisier than the source. Overall, TTA effectiveness is governed by the distributional distance and the severity of the natural shift between domains.

📝 Abstract
Deep learning models often struggle under natural distribution shifts, a common challenge in real-world deployments. Test-Time Adaptation (TTA) addresses this by adapting models during inference without labeled source data. We present the first evaluation of TTA methods for FER under natural domain shifts, performing cross-dataset experiments with widely used FER datasets. This moves beyond synthetic corruptions to examine real-world shifts caused by differing collection protocols, annotation standards, and demographics. Results show TTA can boost FER performance under natural shifts by up to 11.34%. Entropy minimization methods such as TENT and SAR perform best when the target distribution is clean. In contrast, prototype adjustment methods like T3A excel under larger distributional distance scenarios. Finally, feature alignment methods such as SHOT deliver the largest gains when the target distribution is noisier than our source. Our cross-dataset analysis shows that TTA effectiveness is governed by the distributional distance and the severity of the natural shift across domains.
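To make the entropy minimization idea behind methods such as TENT concrete, here is a minimal toy sketch. It is not the authors' code and not the TENT implementation: TENT as published updates the affine parameters of normalization layers inside a network, whereas this sketch adapts a single per-class bias added to fixed logits, which is an assumption chosen purely for illustration. The function names, learning rate, step count, and logit values are all hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def entropy_grad(logits):
    """Gradient of the softmax entropy with respect to the logits.

    For p = softmax(z) and H = -sum_i p_i log p_i,
    dH/dz_j = -p_j * (log p_j + H).
    """
    probs = softmax(logits)
    h = entropy(probs)
    return [-p * (math.log(p) + h) if p > 0.0 else 0.0 for p in probs]


def mean_entropy(batch, bias):
    """Average prediction entropy of a batch after adding a per-class bias."""
    shifted = [[z + b for z, b in zip(row, bias)] for row in batch]
    return sum(entropy(softmax(row)) for row in shifted) / len(batch)


def adapt_bias(batch, steps=5, lr=0.5):
    """Gradient descent on mean entropy using unlabeled target logits only.

    Toy stand-in for test-time adaptation: the only adapted parameter is
    a shared per-class bias (an illustrative assumption, not TENT itself).
    """
    num_classes = len(batch[0])
    bias = [0.0] * num_classes
    for _ in range(steps):
        grad = [0.0] * num_classes
        for row in batch:
            g = entropy_grad([z + b for z, b in zip(row, bias)])
            grad = [a + gj / len(batch) for a, gj in zip(grad, g)]
        bias = [b - lr * gj for b, gj in zip(bias, grad)]
    return bias


# Unlabeled "target" logits for a 3-class toy problem (made-up values).
target_logits = [[2.0, 1.0, 0.1], [0.5, 0.4, 0.3]]
bias = adapt_bias(target_logits)
print("mean entropy before:", mean_entropy(target_logits, [0.0, 0.0, 0.0]))
print("mean entropy after: ", mean_entropy(target_logits, bias))
```

No labels are used anywhere in the update, which is the defining property of this family of methods. The objective only rewards confident predictions, so it can also sharpen wrong ones, which is one plausible reading of why the abstract reports entropy minimization working best on clean target distributions.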
Problem

Research questions and friction points this paper is trying to address.

Test-Time Adaptation
Facial Expression Recognition
Distribution Shift
Cross-Dataset Evaluation
Domain Generalization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Test-Time Adaptation
Facial Expression Recognition
Natural Distribution Shift
Cross-Dataset Evaluation
Domain Adaptation
John Turnbull
Department of Electrical and Computer Engineering, Queen’s University, Kingston, Canada
Shivam Grover
Department of Electrical and Computer Engineering, Queen’s University, Kingston, Canada
Amin Jalali
Postdoctoral fellow, Queen’s University
Artificial Intelligence, Medical Imaging, Foundation Models, Time Series, Multi-Modal Learning
Ali Etemad
Queen's University
Artificial Intelligence, Deep Learning, Human-Centered AI, Wearable Computing, Affective Computing