Psy-Insight: Explainable Multi-turn Bilingual Dataset for Mental Health Counseling

📅 2025-03-05
🤖 AI Summary
A key bottleneck for applying large language models (LLMs) to mental health support is the lack of high-quality, multi-task annotated counseling corpora, particularly in Chinese. To address this, we introduce the first explainable, bilingual (Chinese–English), multi-task counseling dialogue dataset, constructed from transcribed face-to-face multi-turn psychotherapy sessions. It features fine-grained, ontology-guided annotations covering therapeutic orientation, emotional state, intervention strategy, and topic, complemented by turn-level reasoning chains and session-level guidance. The methodology combines process-aware explanation with a multi-task annotation framework, underpinned by a counseling ontology, bilingual alignment, and multi-tier human annotation. Experiments show that training on the dataset improves LLMs' counseling style adaptation, strategy identification, and explainable reasoning generation, so that models capture the underlying therapeutic logic rather than only imitating conversational style.

📝 Abstract
The in-context learning capabilities of large language models (LLMs) show great potential in mental health support. However, the scarcity of counseling datasets, particularly in Chinese, restricts their application in this field. To address this, we constructed Psy-Insight, the first mental health-oriented explainable multi-task bilingual dataset. We collected face-to-face multi-turn counseling dialogues, which are annotated with multi-task labels and conversation process explanations. Our annotations include psychotherapy, emotion, strategy, and topic labels, as well as turn-level reasoning and session-level guidance. Psy-Insight is suitable not only for tasks such as label recognition but also for training LLMs to act as empathetic counselors through logical reasoning. Experiments show that training LLMs on Psy-Insight enables the models not only to mimic the conversation style but also to understand the underlying strategies and reasoning of counseling.
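To make the annotation layers named in the abstract concrete, here is a minimal Python sketch of how one annotated session might be represented and turned into a training pair. It is an illustration only: the class names, field names, prompt layout, and the `to_training_example` helper are assumptions made for this sketch, not the released Psy-Insight schema, and the sample dialogue and label values are invented placeholders rather than dataset content.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Turn:
    """One utterance with hypothetical turn-level annotation fields."""
    speaker: str                      # "client" or "counselor"
    text: str
    emotion: Optional[str] = None     # emotion label
    strategy: Optional[str] = None    # counseling strategy label
    reasoning: Optional[str] = None   # turn-level reasoning


@dataclass
class Session:
    """One dialogue with hypothetical session-level annotation fields."""
    language: str                     # e.g. "zh" or "en"
    psychotherapy: str                # psychotherapy label
    topic: str                        # topic label
    guidance: str                     # session-level guidance
    turns: List[Turn] = field(default_factory=list)


def to_training_example(session: Session, idx: int) -> dict:
    """Build a prompt/target pair whose target is the counselor turn at idx."""
    turn = session.turns[idx]
    if turn.speaker != "counselor":
        raise ValueError("target turn must be a counselor turn")
    # Dialogue history is every turn before the target turn.
    history = "\n".join(f"{t.speaker}: {t.text}" for t in session.turns[:idx])
    prompt = (
        f"Therapy: {session.psychotherapy}\n"
        f"Topic: {session.topic}\n"
        f"Guidance: {session.guidance}\n"
        f"Dialogue:\n{history}\ncounselor:"
    )
    # The target exposes strategy and reasoning before the reply text.
    target = f"[strategy: {turn.strategy}] [reasoning: {turn.reasoning}] {turn.text}"
    return {"prompt": prompt, "target": target}


# Invented placeholder session, not taken from the dataset.
demo = Session(
    language="en",
    psychotherapy="cognitive behavioral therapy",
    topic="work stress",
    guidance="Explore the client's automatic thoughts before suggesting changes.",
    turns=[
        Turn("client", "I keep thinking I will fail.", emotion="anxiety"),
        Turn(
            "counselor",
            "What goes through your mind when that happens?",
            strategy="open question",
            reasoning="Elicit the thought behind the anxiety.",
        ),
    ],
)
example = to_training_example(demo, 1)
```

Placing the strategy and reasoning ahead of the reply is one possible design: it asks a model to state why it responds before producing the response. Whether the paper formats its training data this way is not stated in the abstract.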
Problem

Research questions and friction points this paper is trying to address.

Lack of Chinese mental health counseling datasets
Need for explainable multi-task bilingual datasets
Training LLMs for empathetic counseling and reasoning
Innovation

Methods, ideas, or system contributions that make the work stand out.

First bilingual dataset for mental health counseling
Multi-task labels with detailed conversation explanations
Enhances LLMs' empathetic counseling through logical reasoning
Authors

Keqi Chen
Beijing University of Posts and Telecommunications, Beijing
Zekai Sun
Beijing University of Posts and Telecommunications, Beijing
Yuhua Wen
Beijing University of Posts and Telecommunications, Beijing
Huijun Lian
Beijing University of Posts and Telecommunications, Beijing
Yingming Gao
Beijing University of Posts and Telecommunications
Computer Assisted Language Learning; Acoustic Phonetics and Speech Synthesis
Ya Li
Beijing University of Posts and Telecommunications, Beijing