🤖 AI Summary
Long-term (1–2 minute) multimodal emotion recognition faces challenges in modeling dynamic cross-modal interactions and effectively fusing long video sequences with multi-channel physiological signals (e.g., EDA, ECG/PPG). To address this, we propose MVP, a lightweight attention-driven video-physiology fusion architecture. MVP introduces the first unified deep learning framework integrating a dual-stream CNN-LSTM video encoder, a time-frequency feature extraction network for physiological signals, and a cross-modal alignment module with adaptive weighted fusion. Crucially, MVP enables end-to-end co-optimization of visual and multi-channel physiological representations, substantially enhancing long-sequence modeling capability. Evaluated on standard benchmarks, MVP achieves a 4.2–6.8% absolute accuracy improvement over state-of-the-art methods under the joint video+EDA+ECG/PPG modality. Comprehensive experiments further validate its robustness and generalizability across diverse subjects and recording conditions.
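The cross-modal alignment module with adaptive weighted fusion described above can be pictured with a minimal PyTorch sketch. The module name, layer sizes, and gating scheme here are illustrative assumptions, not the paper's exact design: video tokens attend to physiological tokens, and a learned scalar gate adaptively weights the two modalities.

```python
import torch
import torch.nn as nn

class CrossModalFusion(nn.Module):
    """Hypothetical sketch of MVP-style cross-modal alignment with
    adaptive weighted fusion; dimensions and gating are assumptions."""

    def __init__(self, d_model=256, n_heads=4):
        super().__init__()
        # Cross-attention: video tokens query the physiological tokens.
        self.align = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # Scalar gate in [0, 1] that adaptively weights the two modalities.
        self.gate = nn.Sequential(nn.Linear(2 * d_model, 1), nn.Sigmoid())

    def forward(self, video_feats, physio_feats):
        # video_feats:  (B, Tv, d_model) from the video encoder
        # physio_feats: (B, Tp, d_model) from the time-frequency network
        aligned, _ = self.align(video_feats, physio_feats, physio_feats)
        # Pool over time before gating (mean pooling as a simple choice).
        v = video_feats.mean(dim=1)
        p = aligned.mean(dim=1)
        w = self.gate(torch.cat([v, p], dim=-1))  # (B, 1)
        return w * v + (1 - w) * p                # fused embedding (B, d_model)
```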
📝 Abstract
Human emotions entail a complex set of behavioral, physiological, and cognitive changes. Current state-of-the-art models fuse the behavioral and physiological components using classic machine learning rather than recent deep learning techniques. We propose to fill this gap by designing the Multimodal for Video and Physio (MVP) architecture, streamlined to fuse video and physiological signals. Unlike other approaches, MVP exploits the benefits of attention to enable the use of long input sequences (1–2 minutes). We study video and physiological backbones for processing long sequences and evaluate our method against the state of the art. Our results show that MVP outperforms prior methods for emotion recognition based on facial videos, EDA, and ECG/PPG.
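As a rough illustration of how attention makes 1–2 minute inputs tractable, here is a minimal attention-pooling sketch in PyTorch. The `AttentionPool` module, frame rate, and dimensions are hypothetical, standing in for whatever long-sequence mechanism MVP actually uses: each time step gets a learned relevance score, and the sequence is summarized as a score-weighted sum rather than processed step by step.

```python
import torch
import torch.nn as nn

class AttentionPool(nn.Module):
    """Minimal sketch of attention pooling over a long feature sequence;
    an assumed reading of the approach, not the paper's exact mechanism."""

    def __init__(self, d_model=256):
        super().__init__()
        self.score = nn.Linear(d_model, 1)  # one relevance score per step

    def forward(self, x):
        # x: (B, T, d_model), e.g. T ~ 1800 steps for 2 minutes at 15 fps
        attn = torch.softmax(self.score(x), dim=1)  # (B, T, 1)
        return (attn * x).sum(dim=1)                # (B, d_model) summary

# Usage: pool per-frame features into one clip-level embedding.
pool = AttentionPool(d_model=256)
frames = torch.randn(8, 1800, 256)   # batch of 8 two-minute clips
clip_embedding = pool(frames)        # shape: (8, 256)
```

Because the scores are computed in parallel over all time steps, this kind of pooling sidesteps the vanishing-memory problem that makes purely recurrent models struggle on minute-long sequences.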