iMiGUE-Speech: A Spontaneous Speech Dataset for Affective Analysis

πŸ“… 2026-02-24
πŸ€– AI Summary
This work addresses the scarcity of spontaneous emotional speech data in naturalistic settings, as most existing datasets rely on acted or laboratory-induced emotions. To bridge this gap, the authors construct a novel multimodal dataset comprising spontaneous emotional speech elicited by real-time competitive outcomes, uniquely synchronized with fine-grained micro-gesture annotations captured in authentic contexts. The dataset includes transcribed utterances, speaker-role disentanglement, and word-level alignment, enabling comprehensive multimodal analysis. Leveraging pretrained acoustic and language models, the study establishes a dual-modality (speech and text) benchmark for emotion recognition. Experimental results demonstrate the dataset’s effectiveness and distinctive value in capturing genuine, spontaneous affective states, offering a valuable resource for advancing research in naturalistic emotion modeling.

πŸ“ Abstract
This work presents iMiGUE-Speech, an extension of the iMiGUE dataset that provides a spontaneous speech corpus for studying emotional and affective states. The new release enriches the original dataset with additional metadata, including speech transcripts, speaker-role separation between interviewer and interviewee, and word-level forced alignments. Unlike existing emotional speech datasets that rely on acted or laboratory-elicited emotions, iMiGUE-Speech captures spontaneous affect arising naturally from real match outcomes. To demonstrate the dataset's utility and establish initial benchmarks, we introduce two evaluation tasks: speech emotion recognition and transcript-based sentiment analysis. Both tasks leverage state-of-the-art pre-trained representations to assess how well the dataset captures spontaneous affective states from the acoustic and linguistic modalities. iMiGUE-Speech can also be synchronously paired with micro-gesture annotations from the original iMiGUE dataset, forming a uniquely multimodal resource for studying speech-gesture affective dynamics. The extended dataset is available at https://github.com/CV-AC/imigue-speech.
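The word-level forced alignments are what make the pairing with micro-gesture annotations possible: given a gesture's time interval, one can select the words spoken during it. The sketch below illustrates that lookup; the CSV column layout (`word`, `start`, `end`) is an illustrative assumption, not the documented release format, so consult the repository for the actual schema.

```python
import csv
import io

def load_alignments(fileobj):
    """Parse word-level forced alignments from CSV rows of (word, start, end).

    NOTE: this column layout is an illustrative assumption, not the documented
    iMiGUE-Speech release format; check the repository for the real schema.
    """
    return [
        {"word": row["word"], "start": float(row["start"]), "end": float(row["end"])}
        for row in csv.DictReader(fileobj)
    ]

def words_in_window(words, t0, t1):
    """Return words whose span overlaps [t0, t1], e.g. to pair an utterance
    fragment with a synchronized micro-gesture annotation interval."""
    return [w for w in words if w["end"] > t0 and w["start"] < t1]

# Tiny in-memory example standing in for a real alignment file.
sample = "word,start,end\ngood,0.00,0.35\ngame,0.35,0.80\ntoday,0.80,1.40\n"
words = load_alignments(io.StringIO(sample))
print([w["word"] for w in words_in_window(words, 0.5, 1.0)])  # → ['game', 'today']
```

The overlap test (`end > t0 and start < t1`) keeps any word partially inside the gesture interval, which is usually what is wanted when gesture and word boundaries do not line up exactly.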
Problem

Research questions and friction points this paper is trying to address.

spontaneous speech
affective analysis
emotional speech dataset
natural affect

Innovation

Methods, ideas, or system contributions that make the work stand out.

spontaneous speech
affective analysis
multimodal dataset
forced alignment
emotion recognition