FIESTA: Fisher Information-based Efficient Selective Test-time Adaptation

📅 2025-03-29
🤖 AI Summary
To address the poor robustness of facial expression recognition in in-the-wild videos under domain shift, this paper proposes a lightweight test-time adaptation (TTA) method. It introduces a Fisher information-based dynamic parameter selection mechanism that updates only about 22K parameters, over 20× fewer than comparable TTA methods, and adds temporal consistency regularization to model inter-frame dependencies. Ablation studies show that parameter importance can be estimated reliably from as few as one to three frames. On the AffWild2 benchmark, the approach improves F1 score by 7.7% over the base model and outperforms existing TTA methods, while substantially reducing computational overhead and making TTA more practical for real-world deployment.

📝 Abstract
Robust facial expression recognition in unconstrained, "in-the-wild" environments remains challenging due to significant domain shifts between training and testing distributions. Test-time adaptation (TTA) offers a promising solution by adapting pre-trained models during inference without requiring labeled test data. However, existing TTA approaches typically rely on manually selecting which parameters to update, potentially leading to suboptimal adaptation and high computational costs. This paper introduces a novel Fisher-driven selective adaptation framework that dynamically identifies and updates only the most critical model parameters based on their importance as quantified by Fisher information. By integrating this principled parameter selection approach with temporal consistency constraints, our method enables efficient and effective adaptation specifically tailored for video-based facial expression recognition. Experiments on the challenging AffWild2 benchmark demonstrate that our approach significantly outperforms existing TTA methods, achieving a 7.7% improvement in F1 score over the base model while adapting only 22,000 parameters, more than 20 times fewer than comparable methods. Our ablation studies further reveal that parameter importance can be effectively estimated from minimal data, with sampling just 1-3 frames sufficient for substantial performance gains. The proposed approach not only enhances recognition accuracy but also dramatically reduces computational overhead, making test-time adaptation more practical for real-world affective computing applications.
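The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a toy linear softmax classifier, uses the model's own top prediction as a pseudo-label (the frames are unlabeled), and approximates Fisher information by the mean squared gradient of the log-likelihood per parameter. The names `diagonal_fisher` and `select_top_k` are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def diagonal_fisher(weights, frames):
    """Diagonal (empirical) Fisher estimate for a linear softmax classifier.

    weights[c][j] is the weight for class c and feature j.
    frames is a list of unlabeled feature vectors (e.g. 1-3 sampled frames).
    Assumption: the pseudo-label is the model's own argmax prediction.
    """
    n_classes, n_feats = len(weights), len(weights[0])
    fisher = [[0.0] * n_feats for _ in range(n_classes)]
    for x in frames:
        logits = [sum(w * xj for w, xj in zip(row, x)) for row in weights]
        probs = softmax(logits)
        pseudo = max(range(n_classes), key=probs.__getitem__)
        for c in range(n_classes):
            # d log p(pseudo | x) / d logit_c = 1[c == pseudo] - p_c
            dlogit = (1.0 if c == pseudo else 0.0) - probs[c]
            for j in range(n_feats):
                grad = dlogit * x[j]
                fisher[c][j] += grad * grad / len(frames)
    return fisher


def select_top_k(fisher, k):
    """Return the (class, feature) indices of the k highest Fisher scores."""
    scored = [(s, c, j) for c, row in enumerate(fisher) for j, s in enumerate(row)]
    scored.sort(key=lambda t: t[0], reverse=True)
    return {(c, j) for _, c, j in scored[:k]}


# Toy example: two classes, three features, two sampled frames.
weights = [[0.5, -0.2, 0.0], [0.1, 0.3, 0.0]]
frames = [[1.0, 2.0, 0.0], [0.5, 1.5, 0.0]]
fisher = diagonal_fisher(weights, frames)
selected = select_top_k(fisher, k=2)
print(sorted(selected))
```

In this toy setup the third feature is always zero, so its weights receive zero Fisher score and are never selected. Only the parameters in `selected` would then be updated during adaptation, with the rest kept frozen.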
Problem

Research questions and friction points this paper is trying to address.

Adapts pre-trained models during inference without labeled test data
Dynamically updates critical parameters using Fisher information
Enhances video-based facial expression recognition efficiency and accuracy
Innovation

Methods, ideas, or system contributions that make the work stand out.

Fisher-driven selective parameter adaptation
Temporal consistency constraints integration
Data-efficient importance estimation from as few as 1-3 sampled frames
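The summary does not give the exact form of the temporal consistency constraint. One common way to express such a constraint, shown here only as an assumed illustration, is to penalize changes in the predicted class distribution between adjacent video frames. The function name `temporal_consistency_penalty` is hypothetical.

```python
def temporal_consistency_penalty(frame_probs):
    """Mean squared difference between class probabilities of adjacent frames.

    frame_probs is a list of per-frame probability vectors in temporal order.
    Assumption: a squared-difference penalty; the paper may use another form.
    """
    if len(frame_probs) < 2:
        return 0.0
    total = 0.0
    for prev, curr in zip(frame_probs, frame_probs[1:]):
        total += sum((p - q) ** 2 for p, q in zip(prev, curr))
    return total / (len(frame_probs) - 1)


# A stable prediction sequence is penalized less than a flickering one.
stable = [[0.8, 0.2], [0.8, 0.2], [0.7, 0.3]]
flicker = [[0.9, 0.1], [0.1, 0.9], [0.9, 0.1]]
print(temporal_consistency_penalty(stable) < temporal_consistency_penalty(flicker))
```

In an adaptation loop, a term like this would typically be added to the unsupervised adaptation loss with a weighting coefficient, so that updates to the selected parameters discourage frame-to-frame flicker in the predicted expression.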
Mohammadmahdi Honarmand
Stanford University
Onur Cezmi Mutlu
Stanford University
Parnian Azizian
Stanford University
Saimourya Surabhi
Stanford University
Dennis P. Wall
Professor, Stanford University
AI, Pediatrics, Digital Health, Clinical Informatics, Biomedical Data Science