🤖 AI Summary
This study addresses the absence of models capturing co-occurring multiple emotions and their intensity variations in Ethiopian-language social media. We propose the first joint framework for multi-label emotion recognition and intensity quantification. To enable this, we introduce fine-grained intensity annotations to the EthioEmo dataset and establish the first multi-label + intensity evaluation paradigm—filling a critical gap in Ethiopian-language affective computing. Leveraging encoder models (mBERT, XLM-R) and decoder models (Gemma, Phi-3), we conduct systematic zero-shot and fine-tuning experiments. Our approach achieves a multi-label F1 score of 0.72 and a Pearson correlation coefficient of 0.68 for intensity prediction on EthioEmo—substantially outperforming existing baselines. This work delivers the first benchmark for emotion intensity analysis in Ethiopian languages, advancing both methodological rigor and resource availability for low-resource multilingual sentiment analysis.
📝 Abstract
In today's digital world, people freely express their emotions on a variety of social media platforms. As a result, building and integrating emotion-understanding models is vital for many human-computer interaction tasks, such as decision-making, product and customer feedback analysis, political campaigning, marketing research, and social media monitoring. Because users often express several emotions simultaneously in a single instance, multi-label annotation, as in the EthioEmo (Belay et al., 2025) dataset, captures this dynamic effectively. Additionally, incorporating intensity, the degree of an emotion, is crucial, since emotions can differ significantly in their expressive strength and impact. Intensity matters when assessing whether further action is necessary in decision-making processes, especially for negative emotions in applications such as healthcare and mental health studies. To enhance the EthioEmo dataset, we add annotations for the intensity of each labeled emotion. Furthermore, we evaluate various state-of-the-art encoder-only Pretrained Language Models (PLMs) and decoder-only Large Language Models (LLMs) to provide comprehensive benchmarking.
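To make the evaluation paradigm concrete, here is a minimal, self-contained sketch of the two metric families the benchmark reports: macro-averaged multi-label F1 over emotion labels, and Pearson correlation between gold and predicted intensity scores. The label set and all label/intensity values below are hypothetical toy data for illustration, not drawn from EthioEmo, and this is not the paper's actual evaluation code.

```python
from statistics import mean

# Hypothetical emotion inventory (illustrative only, not EthioEmo's label set).
EMOTIONS = ["joy", "sadness", "anger", "fear"]

# Toy gold and predicted label sets for three instances.
gold = [{"joy"}, {"sadness", "anger"}, {"fear"}]
pred = [{"joy"}, {"sadness"}, {"fear", "anger"}]

def macro_f1(gold, pred, labels):
    """Per-label F1, averaged over the label set (macro F1)."""
    f1s = []
    for lab in labels:
        tp = sum(1 for g, p in zip(gold, pred) if lab in g and lab in p)
        fp = sum(1 for g, p in zip(gold, pred) if lab not in g and lab in p)
        fn = sum(1 for g, p in zip(gold, pred) if lab in g and lab not in p)
        denom = 2 * tp + fp + fn
        f1s.append(2 * tp / denom if denom else 0.0)  # F1 = 2TP / (2TP+FP+FN)
    return mean(f1s)

def pearson(x, y):
    """Pearson correlation between gold and predicted intensity scores."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x)
           * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

# Toy gold vs. predicted intensity scores for matched (instance, label) pairs.
gold_int = [0.9, 0.4, 0.7, 0.2]
pred_int = [0.8, 0.5, 0.6, 0.3]

print(f"macro F1:  {macro_f1(gold, pred, EMOTIONS):.3f}")
print(f"Pearson r: {pearson(gold_int, pred_int):.3f}")
```

In practice the same quantities would come from library implementations (e.g. scikit-learn's `f1_score` with `average="macro"` and `scipy.stats.pearsonr`); the hand-rolled versions are shown only to keep the sketch dependency-free.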