DELTA: Deliberative Multi-Agent Reasoning with Reinforcement Learning for Multimodal Psychological Counseling

📅 2026-02-04
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work proposes DELTA, a deliberative multi-agent framework for empathetic counseling. It targets a limitation of existing language-model-based systems, which typically operate on text alone and infer mental states only implicitly. The framework models counseling as a structured reasoning pipeline over multimodal signals with three separate stages: evidence grounding, mental state abstraction, and empathetic response generation. It further applies reinforcement learning with a distribution-level Emotion Attunement Score as the reward signal, which encourages emotionally attuned responses. Experiments on a multimodal counseling benchmark show improvements in both counseling quality and emotion attunement across models, and ablation and qualitative analyses suggest that explicit multimodal reasoning and structured mental state representations play complementary roles.

📝 Abstract
Psychological counseling is a fundamentally multimodal cognitive process in which clinicians integrate verbal content with visual and vocal cues to infer clients' mental states and respond empathically. However, most existing language-model-based counseling systems operate on text alone and rely on implicit mental state inference. We introduce DELTA, a deliberative multi-agent framework that models counseling as a structured reasoning process over multimodal signals, separating evidence grounding, mental state abstraction, and response generation. DELTA further incorporates reinforcement learning guided by a distribution-level Emotion Attunement Score to encourage emotionally attuned responses. Experiments on a multimodal counseling benchmark show that DELTA improves both counseling quality and emotion attunement across models. Ablation and qualitative analyses suggest that explicit multimodal reasoning and structured mental state representations play complementary roles in supporting empathic human-AI interaction.
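The abstract does not define how the Emotion Attunement Score is computed, so the following is only an illustrative sketch of what a distribution-level emotion reward could look like, not the authors' formulation. It assumes the response and the target each come with a weight per emotion label, and it scores their agreement as one minus the Jensen-Shannon divergence (base 2). The function names, the emotion labels, and the choice of Jensen-Shannon divergence are all hypothetical.

```python
import math


def normalize(weights):
    """Turn non-negative emotion weights into a probability distribution."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("emotion weights must sum to a positive value")
    return {label: value / total for label, value in weights.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits; ranges from 0 (identical) to 1 (disjoint)."""
    labels = set(p) | set(q)
    # Mixture distribution used as the reference for both KL terms.
    m = {label: 0.5 * (p.get(label, 0.0) + q.get(label, 0.0)) for label in labels}

    def kl(a):
        return sum(
            a[label] * math.log2(a[label] / m[label])
            for label in a
            if a[label] > 0
        )

    return 0.5 * kl(p) + 0.5 * kl(q)


def attunement_reward(response_emotions, target_emotions):
    """Hypothetical reward: 1 minus the JS divergence of the two emotion distributions.

    This is an assumed stand-in for a distribution-level score, not the
    Emotion Attunement Score defined in the paper.
    """
    p = normalize(response_emotions)
    q = normalize(target_emotions)
    return 1.0 - js_divergence(p, q)


if __name__ == "__main__":
    target = {"sadness": 0.6, "anxiety": 0.3, "neutral": 0.1}
    response = {"sadness": 0.5, "anxiety": 0.2, "neutral": 0.3}
    print(round(attunement_reward(response, target), 3))
```

Comparing whole distributions rather than a single top label lets a response that captures a mix of emotions earn partial credit, which is one plausible motivation for a distribution-level reward in a reinforcement learning setup.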
Problem

Research questions and friction points this paper is trying to address.

multimodal counseling
mental state inference
empathic response
language model limitations
psychological counseling
Innovation

Methods, ideas, or system contributions that make the work stand out.

Deliberative Multi-Agent Reasoning
Multimodal Psychological Counseling
Reinforcement Learning
Emotion Attunement
Structured Mental State Representation
Jiangnan Yang
Anhui University; Institute of Artificial Intelligence, Hefei Comprehensive National Science Center
Junjie Chen
Hefei University of Technology; Institute of Artificial Intelligence, Hefei Comprehensive National Science Center
Fei Wang
Hefei University of Technology
Motion Magnification, MLLM, Affective Computing
Yiqi Nie
Anhui University; Institute of Artificial Intelligence, Hefei Comprehensive National Science Center
Yuxin Liu
Anhui University; Institute of Artificial Intelligence, Hefei Comprehensive National Science Center
Zhangling Duan
Institute of Artificial Intelligence, Hefei Comprehensive National Science Center
Jie Chen
Anhui Normal University
Complex Network, Traffic Flow