Structured Prompting and LLM Ensembling for Multimodal Conversational Aspect-based Sentiment Analysis

📅 2025-12-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses fine-grained sentiment understanding in multimodal dialogues, tackling two core challenges: (1) joint extraction of the cross-speaker sentiment sextuple—holder, target, aspect, opinion, sentiment, and rationale—and (2) precise detection of sentiment flipping, i.e., dynamic sentiment shifts and their triggering causes. The authors propose a structured step-wise prompting mechanism that guides large language models (LLMs) through hierarchical sentiment element parsing. Additionally, they design a multi-LLM complementary ensemble framework that integrates multimodal contextual modeling with sequential component analysis to capture sentiment dynamics. Experiments demonstrate state-of-the-art performance: 47.38% average F1 for sextuple extraction and 74.12% exact-match F1 for sentiment flipping detection, substantially outperforming existing baselines. To the authors' knowledge, this is the first work to systematically address structured sentiment evolution modeling in multimodal dialogues.

📝 Abstract
Understanding sentiment in multimodal conversations is a complex yet crucial challenge toward building emotionally intelligent AI systems. The Multimodal Conversational Aspect-based Sentiment Analysis (MCABSA) Challenge invited participants to tackle two demanding subtasks: (1) extracting a comprehensive sentiment sextuple—holder, target, aspect, opinion, sentiment, and rationale—from multi-speaker dialogues, and (2) detecting sentiment flipping, i.e., identifying dynamic sentiment shifts and their underlying triggers. For Subtask-I, we designed a structured prompting pipeline that guided large language models (LLMs) to sequentially extract sentiment components with refined contextual understanding. For Subtask-II, we further leveraged the complementary strengths of three LLMs through ensembling to robustly identify sentiment transitions and their triggers. Our system achieved a 47.38% average score on Subtask-I and a 74.12% exact-match F1 on Subtask-II, showing the effectiveness of step-wise refinement and ensemble strategies in rich, multimodal sentiment analysis tasks.
Problem

Research questions and friction points this paper is trying to address.

Extract sentiment sextuple from multimodal dialogues using structured prompting
Detect sentiment flipping and triggers via LLM ensembling
Enhance multimodal conversational sentiment analysis with step-wise refinement
Innovation

Methods, ideas, or system contributions that make the work stand out.

Structured prompting pipeline guides LLMs for sequential extraction
Ensembling three LLMs leverages complementary strengths for transitions
Step-wise refinement and ensemble strategies enhance multimodal analysis
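The two contributions above can be sketched in code. The paper's actual prompts, models, and decoding settings are not given here, so the `llm` callables, prompt wording, and `STEPS` ordering below are hypothetical illustrations of the general idea: Subtask-I chains one prompt per sextuple element, feeding earlier answers back as context, while Subtask-II takes a majority vote across several LLMs.

```python
from collections import Counter

# Hypothetical element order; the paper extracts these six components sequentially.
STEPS = ["holder", "target", "aspect", "opinion", "sentiment", "rationale"]


def extract_sextuple(dialogue, llm):
    """Subtask-I sketch: prompt one LLM step-wise, one element at a time,
    including previously extracted elements as refined context."""
    result = {}
    for element in STEPS:
        # Illustrative prompt wording, not the paper's actual template.
        prompt = (
            f"Dialogue:\n{dialogue}\n"
            f"Elements found so far: {result}\n"
            f"Identify the {element} of the expressed sentiment."
        )
        result[element] = llm(prompt)
    return result


def ensemble_flip_detection(dialogue, llms):
    """Subtask-II sketch: ask each LLM whether the sentiment flips,
    then take a simple majority vote over their answers."""
    question = f"Does the sentiment flip in this dialogue?\n{dialogue}"
    votes = [llm(question) for llm in llms]
    label, _count = Counter(votes).most_common(1)[0]
    return label
```

A usage example with stand-in models: `ensemble_flip_detection("...", [m1, m2, m3])` returns whichever label at least two of the three stand-ins agree on, mirroring how complementary LLMs can outvote a single model's error.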
Zhiqiang Gao
Hunan University, Changsha, Hunan, China
Shihao Gao
Hunan University, Changsha, Hunan, China
Zixing Zhang
Professor, Hunan University
Artificial Intelligence · Speech Processing · Affective Computing · Digital Health · Automatic Speech Recognition
Yihao Guo
Hunan University, Changsha, Hunan, China
Hongyu Chen
Hunan University, Changsha, Hunan, China
Jing Han
University of Cambridge
deep learning · audio signal processing · machine learning · mHealth · affective computing