🤖 AI Summary
This work addresses the challenges of speech emotion recognition (SER) in real-world, spontaneous, and low-resource settings, where the complexity of emotional expression and the limitations of current speech technology hinder performance. It presents the first systematic integration of automatic speech recognition (ASR) into the SER pipeline, enabling deep fusion of acoustic and textual modalities through joint modeling of speech signals and ASR-generated transcripts. By leveraging the complementary information carried by the two modalities, the proposed approach improves recognition accuracy and robustness under low-resource and spontaneous-speech conditions, making SER systems more scalable and adaptable for real-life deployment, where labeled data is scarce and speech is naturally variable.
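As a rough illustration of this kind of acoustic-text fusion, the sketch below combines a pooled acoustic embedding with a pooled embedding of the ASR transcript in a small PyTorch classifier. Everything here is assumed for illustration only: the embedding sources (e.g. wav2vec 2.0 for the audio, BERT for the transcript), the dimensions, and the four-class emotion set are placeholders, not the architecture actually proposed in the thesis.

```python
import torch
import torch.nn as nn


class FusionSER(nn.Module):
    """Illustrative acoustic + text fusion classifier for SER.

    Assumes pre-extracted, pooled embeddings per utterance: an acoustic
    vector (e.g. mean-pooled wav2vec 2.0 features) and a text vector
    (e.g. pooled BERT features of the ASR transcript). All sizes are
    placeholders, not values from the thesis.
    """

    def __init__(self, acoustic_dim=768, text_dim=768,
                 hidden_dim=256, num_emotions=4):
        super().__init__()
        # Project each modality into a shared hidden space.
        self.acoustic_proj = nn.Linear(acoustic_dim, hidden_dim)
        self.text_proj = nn.Linear(text_dim, hidden_dim)
        # Classify the concatenated (fused) representation.
        self.classifier = nn.Sequential(
            nn.ReLU(),
            nn.Dropout(0.1),
            nn.Linear(2 * hidden_dim, num_emotions),
        )

    def forward(self, acoustic_emb, text_emb):
        a = self.acoustic_proj(acoustic_emb)
        t = self.text_proj(text_emb)
        # Concatenation is the simplest fusion of speech and
        # ASR-transcript cues; the thesis may use a richer scheme.
        return self.classifier(torch.cat([a, t], dim=-1))


if __name__ == "__main__":
    model = FusionSER()
    acoustic = torch.randn(8, 768)  # batch of pooled acoustic embeddings
    text = torch.randn(8, 768)      # batch of pooled transcript embeddings
    logits = model(acoustic, text)  # -> (8, 4) emotion logits
    print(logits.shape)
```

The point of the sketch is the data flow, not the specifics: the ASR stage turns the waveform into text, each modality is embedded separately, and a joint classifier sees both views of the same utterance.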
📝 Abstract
Speech Emotion Recognition (SER) plays a pivotal role in understanding human communication, enabling emotionally intelligent systems, and serving as a fundamental component in the development of Artificial General Intelligence (AGI). However, deploying SER in real-world, spontaneous, and low-resource scenarios remains a significant challenge due to the complexity of emotional expression and the limitations of current speech and language technologies. This thesis investigates the integration of Automatic Speech Recognition (ASR) into SER, with the goal of enhancing the robustness, scalability, and practical applicability of emotion recognition from spoken language.