Hierarchical Dual-Head Model for Suicide Risk Assessment via MentalRoBERTa

📅 2025-10-22
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Automated suicide risk detection in social media faces three key challenges: severe class imbalance, complex temporal dynamics in posting sequences, and the dual ordinal–categorical nature of risk levels. To address these, we propose a hierarchical dual-head neural network built upon MentalRoBERTa. Our approach innovatively integrates CORAL ordinal regression with standard classification in a dual-head architecture to explicitly model both the ordered structure and categorical distinctions among risk levels. We incorporate time-interval embeddings and employ a 3-layer Transformer encoder (8 attention heads) to capture temporal dependencies. Additionally, we enhance robustness and efficiency via mixed-precision training, freezing lower-layer parameters, and a joint loss function combining CORAL loss, cross-entropy loss, and Focal Loss. Evaluated on a four-level risk classification task using stratified 5-fold cross-validation, our method achieves a statistically significant improvement in macro-F1 score, effectively mitigating overfitting and overconfidence.

Technology Category

Application Category

📝 Abstract
Social media platforms have become important sources for identifying suicide risk, but automated detection systems face multiple challenges including severe class imbalance, temporal complexity in posting patterns, and the dual nature of risk levels as both ordinal and categorical. This paper proposes a hierarchical dual-head neural network based on MentalRoBERTa for suicide risk classification into four levels: indicator, ideation, behavior, and attempt. The model employs two complementary prediction heads operating on a shared sequence representation: a CORAL (Consistent Rank Logits) head that preserves ordinal relationships between risk levels, and a standard classification head that enables flexible categorical distinctions. A 3-layer Transformer encoder with 8-head multi-head attention models temporal dependencies across post sequences, while explicit time interval embeddings capture posting behavior dynamics. The model is trained with a combined loss function (0.5 CORAL + 0.3 Cross-Entropy + 0.2 Focal Loss) that simultaneously addresses ordinal structure preservation, overconfidence reduction, and class imbalance. To improve computational efficiency, we freeze the first 6 layers (50%) of MentalRoBERTa and employ mixed-precision training. The model is evaluated using 5-fold stratified cross-validation with macro F1 score as the primary metric.
Problem

Research questions and friction points this paper is trying to address.

Classifying suicide risk levels from social media posts
Addressing class imbalance and temporal posting patterns
Preserving ordinal relationships while enabling categorical distinctions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hierarchical dual-head neural network for suicide risk classification
CORAL head preserves ordinal relationships between risk levels
Transformer encoder models temporal dependencies in posts
🔎 Similar Papers
No similar papers found.
C
Chang Yang
Department of Informatics, King’s College London, London, United Kingdom
Z
Ziyi Wang
School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing, China
W
Wangfeng Tan
School of Computer Science and Technology, Anhui University, An Hui, China
Z
Zhiting Tan
School of Software, Nankai University, Tianjin, China
C
Changrui Ji
Department of Digital Humanities, King’s College London, London, United Kingdom
Zhiming Zhou
Zhiming Zhou
Shanghai University of Finance and Economics
GeneralizationOptimizationGANsMachine LearningComputer Graphics