Understanding Emotion in Discourse: Recognition Insights and Linguistic Patterns for Generation

📅 2026-01-01
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses two gaps in conversational emotion recognition: the lack of a systematic understanding of key architectural choices, and limited attention to the pragmatic-linguistic mechanisms underlying emotional expression. Through ablation studies and discourse-marker analysis on the IEMOCAP dataset, we investigate the practical contributions of contextual information, intra-utterance structure, and sentiment lexicons. Our findings reveal that performance saturates within 10–30 dialogue turns, that intra-utterance hierarchical representations become ineffective once context is incorporated, and that sentiment lexicons yield no significant gains. Notably, we find a significant reduction in left-peripheral discourse markers in utterances expressing sadness (21.9% vs. 28–32%, p<0.0001), suggesting that sadness depends strongly on conversational context rather than overt pragmatic signaling. Using only a causal contextual architecture, our model achieves weighted F1 scores of 82.69% (4-class) and 67.07% (6-class), surpassing prior text-only approaches.
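The reported positional asymmetry (Sad at 21.9% left-periphery marker usage vs. 28–32% for other emotions) can be sanity-checked with a two-proportion z-test. The sketch below is illustrative only: the counts are hypothetical values chosen to match the reported percentages (the paper analyzes 5,286 discourse-marker occurrences in total), and this is not the authors' statistical procedure.

```python
# Illustrative two-proportion z-test: left-periphery marker rate for Sad
# utterances vs. the pooled rate for the other emotions.
# All counts below are HYPOTHETICAL, chosen to match reported percentages.
import math

def two_proportion_ztest(k1, n1, k2, n2):
    """Two-sided z-test for equality of two binomial proportions."""
    p1, p2 = k1 / n1, k2 / n2
    p_pool = (k1 + k2) / (n1 + n2)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal CDF via math.erf.
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

# Hypothetical counts consistent with the reported rates.
sad_left, sad_total = 219, 1000        # 21.9% left-periphery for Sad
other_left, other_total = 1200, 4000   # 30.0% for the remaining emotions

z, p = two_proportion_ztest(sad_left, sad_total, other_left, other_total)
print(f"z = {z:.2f}, p = {p:.2e}")
```

At these sample sizes the gap of roughly 8 percentage points is far outside sampling noise, consistent with the paper's p<0.0001 association between emotion and marker position.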

📝 Abstract
Despite strong recent progress in Emotion Recognition in Conversation (ERC), two gaps remain: we lack a clear understanding of which modeling choices materially affect performance, and we have limited linguistic analysis linking recognition findings to actionable generation cues. We address both via a systematic study on IEMOCAP. For recognition, we conduct controlled ablations with 10 random seeds and paired tests (with correction for multiple comparisons), yielding three findings. First, conversational context is dominant: performance saturates quickly, with roughly 90% of the gain achieved using only the most recent 10–30 preceding turns. Second, hierarchical sentence representations improve utterance-only recognition (K=0), but the benefit vanishes once turn-level context is available, suggesting conversational history subsumes intra-utterance structure. Third, integrating an external affective lexicon (SenticNet) does not improve results, consistent with pretrained encoders already capturing affective signal. In a strictly causal (past-only) setting, our simple models attain strong performance (82.69% 4-way; 67.07% 6-way weighted F1). For linguistic analysis, we examine 5,286 discourse-marker occurrences and find a reliable association between emotion and marker position (p<0.0001). Sad utterances show reduced left-periphery marker usage (21.9%) relative to other emotions (28–32%), aligning with accounts linking left-periphery markers to active discourse management. This pattern is consistent with Sad benefiting most from conversational context (+22 percentage points), suggesting sadness relies more on discourse history than overt pragmatic signaling.
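The ablation protocol described above (paired comparisons across 10 random seeds, corrected for multiple comparisons) can be sketched with a seed-paired sign-flip permutation test plus a Bonferroni adjustment. This is a minimal stand-in for the kind of analysis reported, not the authors' code; the weighted-F1 scores and the number of contrasts are hypothetical.

```python
# Minimal sketch: paired comparison of two model variants across random seeds,
# using a sign-flip permutation test with Bonferroni correction.
# Scores below are HYPOTHETICAL; the paper uses 10 seeds per configuration.
import random
from statistics import mean

def paired_permutation_pvalue(a, b, n_perm=100_000, seed=0):
    """Two-sided p-value for the mean paired difference via random sign flips."""
    rng = random.Random(seed)
    diffs = [x - y for x, y in zip(a, b)]
    observed = abs(mean(diffs))
    hits = 0
    for _ in range(n_perm):
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        if abs(mean(flipped)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Hypothetical weighted-F1 scores (10 seeds) for two ablation settings.
full_context = [82.1, 82.7, 82.4, 83.0, 82.5, 82.9, 82.3, 82.8, 82.6, 82.2]
no_context   = [60.2, 60.9, 60.5, 61.1, 60.7, 60.4, 61.0, 60.6, 60.3, 60.8]

p = paired_permutation_pvalue(full_context, no_context)
n_comparisons = 3  # e.g., three ablation contrasts tested on the same runs
p_bonferroni = min(1.0, p * n_comparisons)
print(f"raw p = {p:.5f}, Bonferroni-corrected p = {p_bonferroni:.5f}")
```

The permutation test makes no normality assumption, which matters at n=10 seeds; the Bonferroni factor is one simple choice of multiple-comparison correction among several (the abstract does not specify which the authors used).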
Problem

Research questions and friction points this paper is trying to address.

Emotion Recognition in Conversation
architectural choices
linguistic analysis
discourse markers
conversational context
Innovation

Methods, ideas, or system contributions that make the work stand out.

Emotion Recognition in Conversation
Discourse Markers
Contextual Ablation Study
Linguistic Patterns
Causal Context Modeling
Cheonkam Jeong
University of California, Irvine
Computational Linguistics · Natural Language Processing · Computational Social Science
Adeline M. Nyamathi
Sue & Bill Gross School of Nursing, University of California, Irvine